SYNTAX AND SEMANTICS VOLUME 32
EDITORIAL BOARD

Series Editors
BRIAN D. JOSEPH and CARL POLLARD, Department of Linguistics, The Ohio State University, Columbus, Ohio

Editorial Advisory Board
JUDITH AISSEN, University of California, Santa Cruz
PETER CULICOVER, The Ohio State University
ELISABET ENGDAHL, University of Gothenburg
JANET FODOR, City University of New York
ERHARD HINRICHS, University of Tübingen
PAULINE JACOBSON, Brown University
MANFRED KRIFKA, University of Texas
WILLIAM A. LADUSAW, University of California, Santa Cruz
BARBARA H. PARTEE, University of Massachusetts
PAUL M. POSTAL, Scarsdale, New York
A list of titles in this series appears at the end of this book.
SYNTAX and SEMANTICS
VOLUME 32
The Nature and Function of Syntactic Categories

Edited by
Robert D. Borsley
Department of Linguistics, University of Wales, Bangor, Wales
ACADEMIC PRESS San Diego London Boston New York Sydney Tokyo Toronto
This book is printed on acid-free paper. Copyright © 2000 by ACADEMIC PRESS All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the Publisher. The appearance of the code at the bottom of the first page of a chapter in this book indicates the Publisher's consent that copies of the chapter may be made for personal or internal use of specific clients. This consent is given on the condition, however, that the copier pay the stated per-copy fee through the Copyright Clearance Center, Inc. (222 Rosewood Drive, Danvers, Massachusetts 01923), for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale. Copy fees for pre-1998 chapters are as shown on the title pages; if no fee code appears on the title page, the copy fee is the same as for current chapters. 0092-4563/99 $30.00
Academic Press A Division of Harcourt, Inc. 525 B Street, Suite 1900, San Diego, CA 92101-4495 http://www.apnet.com Academic Press 24-28 Oval Road, London NW1 7DX http://www.hbuk.co.uk/ap/ International Standard Book Number: 0-12-613532-0 PRINTED IN THE UNITED STATES OF AMERICA 99 00 01 02 03 04 BB 9 8 7 6 5 4 3 2 1
CONTENTS

Contributors  ix

Introduction
ROBERT D. BORSLEY  1
1. Some Background  1
2. The Chapters  3
References  6

Grammar without Functional Categories
RICHARD HUDSON  7
1. Introduction  7
2. Functional Categories  7
3. Complementizer  10
4. Pronoun  15
5. Valency and Its Irrelevance to Classification  19
6. Determiner  22
7. FWCs as Classes of Function Words  25
8. FWCs as Closed Classes  28
9. Grammar without FWCs  30
References  34

Functional versus Lexical: A Cognitive Dichotomy
RONNIE CANN  37
1. Introduction  37
2. Characterizing Functional Expressions  39
3. The Psycholinguistic Evidence  51
4. Categorizing Functional Expressions  59
5. Conclusion  70
References  74

Feature Checking under Adjacency and VSO Clause Structure
DAVID ADGER  79
1. Introduction  79
2. Feature Checking  80
3. Subject Positions in Irish and Scottish Gaelic  87
4. Conclusions  97
References  99

Mixed Extended Projections
ROBERT D. BORSLEY AND JAKLIN KORNFILT  101
1. Introduction  101
2. A Proposal  102
3. Some Constructions  104
4. Some Alternative Approaches  117
5. Some Impossible Structures  120
6. Some Other Analyses  123
7. A Further Issue  125
8. Conclusions  126
References  129

Verbal Gerunds as Mixed Categories in Head-Driven Phrase Structure Grammar
ROBERT MALOUF  133
1. Introduction  133
2. Properties of Verbal Gerunds  135
3. Previous Analyses  140
4. Theoretical Preliminaries  148
5. A Mixed Lexical Category Analysis of Verbal Gerunds  152
6. Conclusion  163
References  164

English Auxiliaries without Lexical Rules
ANTHONY WARNER  167
1. Introduction  167
2. Auxiliary Constructions in Head-Driven Phrase Structure Grammar  168
3. Negation  175
4. Subject-Auxiliary Inversion  194
5. Linear Precedence  201
6. Postauxiliary Ellipsis  202
7. Conclusion  211
References  218

The Discrete Nature of Syntactic Categories: Against a Prototype-Based Account
FREDERICK J. NEWMEYER  221
1. Prototypes, Fuzzy Categories, and Grammatical Theory  221
2. Prototype Theory and Syntactic Categories  226
3. Prototypicality and Paradigmatic Complexity  228
4. The Nonexistence of Fuzzy Categories  242
5. Conclusion  245
References  247

Syntactic Computation as Labeled Deduction: WH a Case Study
RUTH KEMPSON, WILFRIED MEYER VIOL, AND DOV GABBAY  251
1. The Question  251
2. The Proposed Answer  256
3. The Dynamics  264
4. Crossover: The Basic Restriction  272
5. Towards a Typology for Wh-Construal  278
6. Conclusion  287
References  291

Finiteness and Second Position in Long Verb Movement Languages: Breton and Slavic
MARÍA-LUISA RIVERO  295
1. PF Conditions on Tense  296
2. Long Verb Movement versus Verb Second  297
3. Breton  304
4. South and West Slavic  312
5. Summary and Conclusions  318
References  321

French Word Order and Lexical Weight
ANNE ABEILLÉ AND DANIÈLE GODARD  325
1. Introduction  325
2. The Order of Complements in the VP  326
3. A Feature-Based Treatment  329
4. The Position of Adjectives in the NP  338
5. Ordering Adverbs in the VP  346
6. Conclusion  354
References  358

Index  361
CONTRIBUTORS

Numbers in parentheses indicate the pages on which authors' contributions begin.

Anne Abeillé (325), IUF, Université Paris, UFRL, Paris, France
David Adger (79), Department of Language and Linguistic Science, University of York, Heslington, York, United Kingdom YO1 5DD
Robert D. Borsley (1, 101), Linguistics Department, University of Wales, Bangor, Wales, United Kingdom LL57 2DG
Ronnie Cann (37), Department of Linguistics, University of Edinburgh, Edinburgh, United Kingdom EH8 9LL
Dov Gabbay (251), Department of Computing, King's College London, London, United Kingdom WC2R 2LS
Danièle Godard (325), CNRS, Université Lille 3, Villeneuve d'Ascq, France
Richard Hudson (7), Linguistics Department, University College London, London, United Kingdom WC1E 6BT
Ruth Kempson (251), Department of Philosophy, King's College London, University of London, London, United Kingdom WC1H 0XG
Jaklin Kornfilt (101), Department of FLL, Syracuse University, Syracuse, New York 13244
Robert Malouf (133), Stanford University, Stanford, California, and University of California, Berkeley
Frederick J. Newmeyer (221), Department of Linguistics, University of Washington, Seattle, Washington 98195
María-Luisa Rivero (295), Department of Linguistics, University of Ottawa, Ottawa, Ontario, Canada K1N 6N5
Wilfried Meyer Viol (251), Department of Computing, King's College London, London, United Kingdom WC2R 2LS
Anthony Warner (167), Department of Language and Linguistic Science, University of York, Heslington, York, United Kingdom YO1 5DD
INTRODUCTION ROBERT D. BORSLEY Linguistics Department University of Wales Bangor, Wales
For any theory of syntax, major questions arise about its classificatory scheme. What sort of syntactic categories does it assume? What properties do they have? How do they relate to each other?1 The questions are prominent in different ways in two of the main contemporary theories of syntax, Principles and Parameters theory (P&P) and Head-driven Phrase Structure Grammar (HPSG), but they are also important in other theoretical frameworks. This book brings together ten chapters that discuss questions that arise in connection with the nature and function of syntactic categories. The book has its origins in a conference held at the University of Wales, Bangor, in June 1996, where earlier versions of all but three of the papers included here, those of Malouf, Rivero, and Warner, were presented.2 In this introduction I will sketch some background and then introduce each chapter.
1. SOME BACKGROUND

The idea that syntactic categories are complex entities related in various ways is implicit in traditional discussion of grammar, where labels like "masculine singular noun" and "feminine plural noun" are employed. In spite of this, the earliest work in generative grammar assumed simple atomic categories and had no theory of syntactic categories.3 The work of the 1960s began to lay the basis
for a theory of syntactic categories. The idea that syntactic categories are complex entities was a central feature of Harman (1963), and it gained general acceptance after it was adopted by Chomsky (1965). However, Chomsky proposed not that all syntactic categories are complex but only that lexical categories like noun and verb are. He used features to provide a more refined classification of lexical items than is possible with labels like "N" and "V." In particular, he employed features to subclassify verbs, to distinguish, for example, between verbs like die, which take no complements, and verbs like kill, which take a noun phrase (NP) complement, marking the former as [+ __ #] and the latter as [+ __ NP]. He argued that "There is apparently no motivation for allowing complex symbols to appear above the level of lexical categories" (1965:188).

Chomsky later abandoned this position (Chomsky, 1970) and proposed that all categories are "sets of features" (1970:49). In this work, he laid the foundations for X-bar theory, which was in part a theory of syntactic categories. Whereas Chomsky (1965) provided a way of recognizing subclasses of certain lexical classes, X-bar theory, by breaking up categories into a basic categorial component and a bar level, provided a way of recognizing certain superclasses of expressions. Thus, an NP like the picture of Mary, an N' like picture of Mary, and an N like picture are all identified as nominal expressions. Later work in X-bar theory analyzed nominal, verbal, adjectival, and prepositional expressions in terms of the features ±N and ±V and recognized further intersecting superclasses. Thus, all nominal and adjectival expressions are +N, and all nominal and prepositional expressions are -V. A rather different analysis of nominal, verbal, and so on was advanced in Jackendoff (1977). Thus, much was unclear. It was generally accepted, however, that syntactic categories are complex entities "going together" in various ways.
A number of important ideas about categories have developed since 1980. Within P&P, a distinction has been drawn between lexical categories like noun and verb and functional categories like complementizer and determiner. Chomsky (1986) proposed that functional categories head phrases in just the same way as lexical categories. Subsequent work proposed a large number of abstract phonologically empty functional categories. (See Webelhuth, 1995, for a useful list.) Such categories generally play a role in licensing certain kinds of morphology or act as landing sites for movement processes and thus account for certain word-order facts. For example, it is widely assumed that English aspectual auxiliaries precede the negative particle not because they are moved to a T(ense) functional category, whereas lexical verbs follow because they remain in situ. Similarly, it is assumed that the different position of finite verbs in main and subordinate clauses in German is a result of their movement to C in main clauses. In much the same way, Cinque (1994) proposes that Italian NPs have the order noun + adjective because nouns move to a functional head, whereas English NPs have the order adjective + noun because nouns remain in situ. Thus, functional categories account for differences in the distribution of members of the same broad category
either within a single language or across languages. Given the proliferation of functional categories in P&P work, it is natural to ask whether they can be classified in some way. An important idea here is Grimshaw's (1991) proposal that functional categories are associated with specific lexical categories. Thus, C(omplementizer) and T(ense) are verbal categories, whereas D(eterminer) and Num(ber) are nominal categories.4 One point that we should stress here is that P&P ideas in this area go well beyond the claim that there is a significant distinction between lexical and functional expressions. Hence one might accept this claim without accepting many of the other P&P ideas.

Also since 1980, the descriptive and explanatory potential of complex categories has been explored first within Generalized Phrase Structure Grammar (GPSG) and then within HPSG. Thus, early work in GPSG showed how a category-valued feature SLASH permitted an interesting account of long-distance dependencies. Subsequent work in HPSG showed the value of features with feature groups, lists, and sets of various kinds as their values. Hence, whereas P&P has assumed a large number of relatively simple categories, GPSG and HPSG have assumed a smaller number of more complex categories. The GPSG/HPSG conception of syntactic categories naturally leads to different analyses of many syntactic phenomena. In particular, it leads to rather different approaches to morphology and linear order. Clearly, there are major questions about the relation between these two different conceptions of syntactic categories.

Other work since 1980 has focused on the relation between syntactic and semantic information. This has been a central concern for work in various versions of categorial grammar, which assumes a very close relation between syntactic and semantic categories.
The syntax-semantics relation has also been central for cognitive linguistics, which also, although in a very different way, assumes a close connection between syntactic and semantic categories. The relation between syntactic and semantic information has also been an important concern for GPSG and HPSG. It is hoped that this brief sketch makes it clear that syntacticians have developed a rich body of ideas about syntactic categories and raised a variety of important questions. The chapters in this volume explore some of these ideas and address some of the questions.
2. THE CHAPTERS

A number of the chapters in the volume are concerned with functional categories. Chapters by Hudson and Cann consider whether there is a clear distinction between lexical and functional categories. Hudson argues against this idea, focusing in particular on determiners and complementizers. He argues that determiners
are pronouns and hence a subclass of nouns and not a functional category, and that the main putative examples of complementizers, that, for, and if, do not form a natural class, but are syncategorematic words, each the sole member of a unique category. If his arguments are sound, they cast some doubt on an important element of P&P theorizing.

Cann considers the distinction between lexical and functional categories from both a linguistic and a psycholinguistic point of view. He argues that there are no necessary or sufficient linguistic conditions that identify an expression as being of one type or the other. This suggests that the difference between them is not categorial. He argues, however, that evidence from processing, acquisition, and breakdown suggests that the distinction is categorial. He goes on to argue that the contrast between the linguistic and the psycholinguistic evidence reflects a difference between properties of E-language and properties of I-language and that the functional/lexical distinction holds of the former but not necessarily of the latter. He then develops a theory of categorization that incorporates this idea.

For those who assume a distinction between lexical and functional categories, a variety of questions arise. For example, there are questions about what sort of functional categories should be assumed. In particular, a question arises about whether it is necessary to assume functional categories with no semantic import. Adger's chapter addresses this issue, and he argues that a proper consideration of the way that syntax interfaces with other components of grammar obviates this need. He illustrates this with a case study of subject licensing in Scottish Gaelic and Modern Irish. The consensus in the literature is that this requires the postulation of functional categories with no semantic import.
Adger shows that an alternative approach, which licenses the subject via a morphological process, allows the elimination of such abstract functional categories.

Questions also arise about the relation between lexical and functional categories. As we noted earlier, Grimshaw proposes that some functional categories are inherently nominal and others inherently verbal. She also proposes that there are no "mixed extended projections," in which a functional category occurs not with the associated lexical category but with some other lexical category. Borsley and Kornfilt argue against this claim. They consider a variety of constructions in a variety of languages that display a mix of nominal and verbal properties, and argue that these constructions should be analyzed as structures in which a verb is associated with one or more nominal functional categories instead of or in addition to the normal verbal functional categories. If this is right, then Grimshaw's claim is too strong.

A very different approach to one of the constructions discussed by Borsley and Kornfilt, the English poss-ing construction, is developed by Malouf. This utilizes the hierarchical lexicon of HPSG to analyze verbal gerunds as both nouns, a category that also includes common nouns, and verbals, a category that also includes verbs and adjectives. He also shows how this approach can be extended to the English acc-ing construction.
The hierarchical lexicon of HPSG is also exploited in Warner's chapter. Warner develops a detailed HPSG analysis of English auxiliaries, dealing in particular with negation, inversion, and ellipsis. He shows that the complex array of data in this domain can be accommodated through inheritance and that there is no need for lexical rules.

As noted above, an important question about syntactic categories is how they relate to semantic categories. This is the main concern of Newmeyer's chapter. He focuses on the idea central to cognitive linguistics that categories have "best case" members and members that systematically depart from the "best case" and that the optimal grammatical description of morphosyntactic processes makes reference to the degree of categorial deviation from the "best case." He argues that the phenomena that have been seen to support this view are better explained in terms of the interaction of independently needed principles from syntax, semantics, and pragmatics.

The relation between syntax and semantics is a major concern in the area of wh-questions, which are the focus of Kempson, Meyer-Viol, and Gabbay's chapter. They are concerned with why wh-questions have the properties that they do: long-distance dependencies, wh-in situ, partial movement constructions, reconstruction, crossover, and so on. They argue that this array of properties can be explained within a model of natural language understanding in context, where the task of understanding is taken to be the incremental building of a structure over which the semantic content is defined. The model involves a dynamic concept of syntax rather different from that assumed in the other chapters.

Questions also arise about the relation between syntactic and morphological information. The role of certain morphological features in syntax is a central concern for P&P work. Rivero's chapter focuses on one aspect of this.
She is concerned in particular with the licensing of Tense in Breton and South and West Slavic languages, which have main clauses in which an untensed verb precedes a tensed auxiliary. She argues that this is the result of Long Verb Movement, which is triggered by a PF interface condition.

A very different approach to word order phenomena is developed in Abeillé and Godard's chapter on French. They argue that a variety of French word-order facts cannot be captured using only functional or categorial distinctions but also require a distinction in terms of weight. In addition to the traditional heavy constituents (which have to come last in their syntactic domain), they propose to distinguish between light constituents (consisting of certain words used bare or in minor phrases) that tend to cluster with the head, and middle-weight constituents (including typical phrases) that allow for more permutations. They capture these distinctions within HPSG with a ternary-valued WEIGHT feature.

The chapters collected here obviously do not discuss all the questions that arise about syntactic categories, but they do discuss many of the most important issues.
Above all, they highlight the centrality of questions about syntactic categories for a number of different frameworks.
NOTES

1. I am grateful to Anne Abeillé, David Adger, Ruth Kempson, Fritz Newmeyer, Marisa Rivero, and Anthony Warner for helpful comments on this introduction.
2. I am grateful to the British Academy for help with the funding of the conference and to Ian Roberts for help with the organization.
3. Gazdar and Mellish (1989:141) trace the idea that syntactic categories are complex back to Yngve (1958).
4. A rather similar conception of functional categories is developed in Netter (1994).
REFERENCES

Chomsky, N. A. (1965). Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, N. A. (1970). Remarks on nominalization. In R. Jacobs and P. S. Rosenbaum (Eds.), Readings in English transformational grammar. Waltham, MA: Ginn and Co.
Chomsky, N. A. (1986). Barriers. Cambridge, MA: MIT Press.
Cinque, G. (1994). On the evidence for partial N-movement in the Romance DP. In G. Cinque, J. Koster, J.-Y. Pollock, L. Rizzi, and R. Zanuttini (Eds.), Paths towards universal grammar: Studies in honor of Richard S. Kayne. Washington, DC: Georgetown University Press.
Gazdar, G., and Mellish, C. (1989). Natural language processing in PROLOG: An introduction to computational linguistics. New York: Addison-Wesley.
Grimshaw, J. (1991). Extended projection. Unpublished manuscript, Brandeis University, Waltham, MA.
Harman, G. (1963). Generative grammar without transformational rules: A defense of phrase structure. Language, 39, 597-616.
Jackendoff, R. S. (1977). X'-syntax: A study of phrase structure. Cambridge, MA: MIT Press.
Netter, K. (1994). Towards a theory of functional heads: German nominal phrases. In J. Nerbonne, K. Netter, and C. Pollard (Eds.), German grammar in HPSG (pp. 297-340). Stanford, CA: CSLI Publications.
Webelhuth, G. (1995). X-bar theory and case theory. In G. Webelhuth (Ed.), Government and binding theory and the minimalist program (pp. 15-95). Oxford: Blackwell.
Yngve, V. (1958). A programming language for mechanical translation. Mechanical Translation, 5, 25-41.
GRAMMAR WITHOUT FUNCTIONAL CATEGORIES RICHARD HUDSON Linguistics Department University College, London London, United Kingdom
1. INTRODUCTION

The chapter considers the notion functional category and concludes that, at least as far as overt words are concerned, the notion is ill founded. First, none of the definitions that have been offered (in terms of function words, closed classes, or nonthematicity) are satisfactory, because they either define a continuum when we need a sharp binary distinction, or they conflict with the standard examples. Second, the two most commonly quoted examples of word classes that are functional categories cannot even be justified as word classes. Complementizers (Comp) have no distinctive and shared characteristic, and Determiners are all pronouns that are distinguished only by taking a common noun as complement—a distinction that is better handled in terms of lexical valency than in terms of a word class.
2. FUNCTIONAL CATEGORIES

The notion functional category1 has played a major part in discussions of syntactic theory. For example, Chomsky introduces it as follows:
Virtually all items of the lexicon belong to the substantive categories, which we will take to be noun, verb, adjective and particle, . . . The other categories we will call functional (tense, complementizer, etc.). (Chomsky, 1995:6)
He later suggests that only functional categories carry strong features (Chomsky, 1995:232), and that they "have a central place in the conception of language . . . primarily because of their presumed role in feature checking, which is what drives Attract/Move" (Chomsky, 1995:349). Similarly, it has been suggested that functional categories cannot assign theta-roles (Abney, 1987; Radford, 1997:328), and that they can constitute the "extended projection" of their complement's lexical category (Grimshaw, 1991; Borsley and Kornfilt, this volume). According to the Functional Parameterization Hypothesis, functional categories are the special locus of the parameters that distinguish the grammars of different languages (Atkinson, 1994:2942; Ouhalla, 1991; Pollock, 1989; Smith and Tsimpli, 1995:24), and Radford (1990) has suggested that they are missing from child language. Such claims have not been restricted to the Chomskyan school: In Head-driven Phrase Structure Grammar (HPSG) we find the suggestion that only functional categories may act as "markers" (Pollard and Sag, 1994:45), and in Lexical Functional Grammar (LFG) that functional categories always correspond to the same part of f-structure as their complements (Bresnan, this volume).

Any notion as important as Functional Category2 should be subjected to the most rigorous scrutiny, but this seems not to have happened to this particular construct. Instead it has been accepted more or less without question, and has become part of mainstream theorizing simply through frequent mention by leading figures. I suggest in this chapter that the notion is in fact deeply problematic. The attempts that have been made to define it are flawed, and all the individual categories that have been given as examples present serious problems. The issues raised here should at least be considered by proponents of the notion.
If the criticisms are well founded, the consequences for syntactic theory are serious; but even if these worries turn out to be groundless, the debate will have made this key notion that much clearer and stronger.

To avoid confusion it is important to distinguish three kinds of category, which we can call Word Category, Subword Category, and Position Category. Word categories are simply word classes—Noun, Determiner, and so on. Every theory accepts that there are words and that these fall into various classes, so Word Category is uncontroversial even if the validity of particular word categories is debatable. Subword categories are elements of syntactic structure that (in surface structure) are smaller than words—morphemes or zero. (Clitics are on the border between these types, but it makes no difference here whether we classify them as belonging to word or subword categories.) The obvious example of a subword category is
inflection (INFL), to the extent that this corresponds merely to the verb's inflection or to zero. It is a matter of debate whether subword categories have any place at all in syntactic theory, and most theories at least restrict their use (e.g., by Pullum and Zwicky's principle of Morphology-free Syntax—Zwicky, 1994:4477). This issue is orthogonal to the questions about functional categories that I wish to raise here, so I shall avoid it by focusing on word categories. Position categories are a further extension of word and subword categories, where the name of the category is used to label a structural position. For example, the standard Barriers analysis of clause structure recognizes C and I as positions in an abstract tree structure. The labels C and I are abbreviations of Comp (for Complementizer) and INFL (for Inflection), but the link to the original word and subword categories is broken because these positions may be either empty or filled by a verb—which is not, of course, classified inherently as a complementizer or inflection, even if the relevant feature structures overlap. Such position categories are also controversial and raise problems of both fact (Hudson, 1995) and theory that go beyond the scope of this chapter. The central question to be addressed, therefore, is the status of the construct Functional Word Category (FWC), rather than the more general question of functional categories. Given this focus it is important to acknowledge that subword and position categories are also central to the discussion of functional categories. The conclusion of this chapter is that FWC is not justified, but even if this conclusion is correct, it will still remain possible that some subword and position categories are functional. I shall argue, for example, that Complementizer is not a valid word category, but it could still be true that the position category C is valid. 
On the other hand, FWCs are part of the evidence that is normally adduced in support of the more abstract categories, so anything that casts doubt on the former must affect the credibility of the latter.

This chapter moves from the particular to the general. Sections 3 and 6 will discuss the categories Complementizer and Determiner, which are among the most frequently quoted examples of FWC. The discussion of Pronoun in section 4 is needed as a preparation for the proposed analysis of determiners, as is the general theorizing about valency and classification in section 5. The conclusion of these sections will be that neither Complementizer nor Determiner is a valid word class, so (a fortiori) neither can be a FWC. Sections 7 and 8 will consider two standard definitions of FWC: as a class of function words and as a closed class. It will be argued that Function Word is indeed an important and valid construct with a good deal of empirical support, and similarly (but to a lesser extent) for Closed Class. However, I shall also show that neither of these two concepts is suitable as a basis for FWC. The conclusion, in section 9, will be that FWC plays no part in grammar, though there may be a small role for Function Word. Encouragingly, Cann (this volume) reaches a similar conclusion by a different route.
3. COMPLEMENTIZER

The following argument rests in part on a general principle of categorization that should be laid out before I proceed. The principle amounts to no more than Occam's razor, so it should be sufficiently bland to be acceptable regardless of theoretical inclinations.

(1) Principle 1
A word class should be recognized only if it allows generalizations that would not otherwise be possible.
The classic word classes satisfy this principle well. Take Noun, for example. Without it, one could say that some words can head a verb's subject, and that some words can head its object, but in each case one would have to simply list all the words concerned. Given the category Noun, however, we can express the generalization that the lists are the same—not to mention the lists needed for many other facts about distribution, morphology, and semantics. Similarly for Auxiliary Verb, a word class defined by a collection of characteristics that include negation, inversion, contraction, and ellipsis. Without this word class it would not be possible to show that these characteristics all applied to the same list of words. In contrast with these very well-established classes, some traditional word classes have a rather uncertain status, with Adverb as the classic case of a "dustbin" that has very few characteristics of its own, though probably enough to justify it among the major word classes. In short, every word class must earn its place by doing some work in the grammar.

How does the word class Complementizer fare when tested against this principle? The history of this class is not encouraging, as its very existence escaped the notice of traditional grammarians; if it really does allow generalizations that would not otherwise be possible, how did traditional grammar manage without it? Even the name Complementizer suggests some uncertainty about the distinctive characteristics of its members: Do they introduce complement clauses or subordinate clauses in general? In English, the words concerned are (according to Radford, 1997:54) that, if, and for. Every introductory book tells us that these form a class, with the possible addition of whether, but what precisely are the generalizations that this class allows? The answer seems to be that there are no such generalizations.
This claim is controversial and requires justification, but before we consider the evidence I should reiterate that we are discussing the "word category," whose members are overt words, and not the "position category," which includes the structural position "C." I have argued elsewhere (Hudson, 1995) that this category is invalid as well, but that is a separate debate.3 What, then, do all three core complementizers have in common? As Radford
Grammar without Functional Categories
11
points out (1997:54), they can all introduce a subordinate clause that is the complement of a higher verb or adjective. His examples are the following:

(2) a. I think [that you may be right].
    b. I doubt [if you can help].
    c. I'm anxious [for you to receive the best treatment possible].

Radford's generalization is that complementizers:

1. indicate that the following clause is a complement of some other word,
2. show whether this clause is finite, and
3. mark its semantic role in the higher clause (which Radford calls its illocutionary force).

Unfortunately these characteristics do not justify Complementizer, as we shall now see.

• Claim A (indicating complement-hood) is false because the clause introduced by a complementizer need not be the complement of another word. That and for allow a subject link:

(3) a. [That you may be right] is beyond doubt.
    b. [For you to receive the best treatment possible] is important.

Moreover, for also allows an adjunct link:

(4) a. I bought it [for you to wear].
    b. A good book [for you to read] is this one.

According to standard analyses of relative clauses, the same is even true of that, which is assumed to occur in combination with a zero relative pronoun (Radford, 1997:305):

(5) He is someone [that we can identify with].

Furthermore, although it is true that all the complementizers may be used to introduce a complement clause, the same is also true of words that are not complementizers, most obviously the interrogative words.

(6) a. I wonder [who came].
    b. I know [what happened].

It is true that standard analyses assume a zero complementizer in addition to the interrogative word, but the claim is that complementizers indicate the clause's function, which must be a claim about overt words.

• Claim B (indicating finiteness) is true, but again not unique to complementizers. The same is in fact true of every word that can introduce a clause: there
Richard Hudson
is no word that allows a clause as its complement without placing some kind of restriction on its finiteness. For example, why requires a tensed clause, whereas how allows either a tensed clause or an infinitival:

(7) a. I wonder [how/why he did it].
    b. I wonder [how/*why to do it].

Similar remarks apply to all the traditional subordinating conjunctions, such as while, because, and unless, none of which are generally considered to be complementizers.

• Claim C (indicating semantic role) is only partially true, as Radford's own second example illustrates: after doubt either that or if is possible without change of meaning.

(8) I doubt [if/that you can help].

Moreover, to the extent that it is true, this characteristic is again not peculiar to complementizers. Most obviously, the same is (again) true of interrogative pronouns.

Having considered and rejected Radford's generalizations, we should consider whether there are any other generalizations that might justify Complementizer. A plausible candidate concerns extraposition: all the complementizers allow extraposition.

(9) a. It surprises me [that John is late].
    b. It is unclear [if it rained].
    c. It surprises me [for John to be late].

However, if Complementizer were valid this should be the end of the possibilities, but it is not. The same is also true for TO (which is not a complementizer) and for all the interrogative words, including whether:

(10) a. It surprises me to see John here.
     b. It is unclear whether/when it rained.

Indeed, extraposition is even possible for some noun-headed phrases, such as those containing nouns like WAY (but not MANNER) and NUMBER:

(11) a. It is astonishing the way/*manner she drinks.
     b. It is astonishing the number of beers she can drink.

These nouns can only be extraposed if they are modified by what is at least syntactically a relative clause:

(12) a. *It is astonishing the clear way.
     b. *It is astonishing the incredibly large number.
Once again Complementizer does not prove particularly helpful. If there is a single thread running through all the phrases that can be extraposed, it may be semantic rather than syntactic.

In short, whatever all three core complementizers have in common does not distinguish them from interrogative words. This suggests that Radford's three claims can and should be handled without mentioning Complementizer. Let us consider how this can be done.

• Claim A. To the extent that complementizers do indicate a complement link between the following clause and some preceding word, this is because the latter selects it as the head of its complement. However, words that select complementizers always select specifically. This is illustrated in Table 1, which shows that think allows that or zero4 but not if or for, and so on. Furthermore, almost every verb that allows if also allows whether and the full range of interrogative pronouns. (The only exception is doubt.) In short, no valency statement will ever say, "such-and-such word takes as its complement a clause introduced by a complementizer."5

• Claim B. Precisely because different complementizers select different tenses, Complementizer as such will not help in constraining the choice of tense inflection. This selection must be handled separately for different complementizers: tensed or subjunctive6 after that, tensed after if, to after for.

(13) a. I know that Pat is/*be/*to be leader.
     b. I recommend that Pat is/be/*to be leader.
     c. I wonder if Pat is/*be/*to be leader.
     d. I long for Pat to be/*is/*be leader.
• Claim C. The same logic applies here, too. Different complementizers indicate different semantic roles, so verbs will select specific complementizers rather than the general category Complementizer. As mentioned above, almost every verb that selects if also allows any interrogative word, which makes Complementizer even less relevant to semantic selection.

TABLE 1
SELECTIONAL DIFFERENCES AMONG COMPLEMENTIZERS

                       Complement clause
Verb       that/zero   if (whether, who ...)   for ... to
Think          +                0                  0
Wonder         0                +                  0
Long           0                0                  +
Know           +                +                  0
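The point of Table 1—that selection is stated word by word and never via a class Complementizer—can be sketched as a small lexicon in which each verb lists its own admissible clause-introducers. This is only an illustrative encoding of the argument; the feature names and the `zero` label are my own assumptions, not Hudson's notation.

```python
# Toy lexicon: each verb lists the clause-introducers it selects.
# Note that no statement anywhere mentions a class "Complementizer";
# the selection facts are purely lexical, as the text argues.
SELECTS = {
    "think":  {"that", "zero"},
    "wonder": {"if", "whether", "who"},   # if-verbs also take wh-words
    "long":   {"for"},
    "know":   {"that", "zero", "if", "whether", "who"},
}

def licenses(verb, introducer):
    """True if the verb selects a complement clause with this introducer."""
    return introducer in SELECTS.get(verb, set())

print(licenses("think", "that"))  # True:  "I think that you may be right"
print(licenses("think", "if"))    # False: *"I think if you can help"
print(licenses("long", "for"))    # True:  "I long for Pat to be leader"
```

Grouping these verbs under a shared category would add nothing, since each entry must state its own set in any case.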
In short, Complementizer has no role to play in defining the use of the words that, if, and for. It should be noted that we arrived at this conclusion while considering only the "core" examples, so the status of Complementizer is not likely to be improved by including more peripheral examples like whether. On the contrary, in fact, since whether is even more like the interrogative words than if is. Unlike if, but like interrogative words, it allows a following infinitive and a subject link:

(14) a. I wonder [whether/when/*if to go].
     b. [Whether/when/*if to go] is the big question.

However we analyze whether, it is unlikely that we shall gain by invoking Complementizer. The conclusion, therefore, must be that Principle 1 rules out Complementizer.

If these words are not complementizers, what are they? We might consider assigning them individually to existing word classes; for example, Haegeman classifies for as a prepositional complementizer, or more simply as a preposition (1994:167), in recognition of the fact that it licenses an accusative subject NP. But even if this is the right analysis for for, it is certainly not right for that (nor for if, though this is less obvious), and in any case it raises other problems. If for is a preposition, its projection should presumably be a PP, and yet it is said to head a CP. Its classification should explain why a for-clause can be used equally easily as complement, as subject, or as adjunct, but no single established category has this distribution. The problems of classifying that and if are similar, but if anything even more acute.

The alternative to problematic classification is no classification at all—an analysis in which these words are each treated as unique ("syncategorematic"). This is my preferred analysis, as it reflects exactly the outcome of the earlier discussion, in which we found that each word is, in fact, unique.
Thus that is simply a word, and so are if and for; they are recognized as lexical items, but have no grammatical features and belong to no categories. When the grammar mentions them, it defines them simply as lexical items whose word class is irrelevant.

The only complementizer whose classification is at all straightforward is whether, whose similarities to interrogative pronouns have already been pointed out. At least some linguists (e.g., Larson, 1985) argue that it is in fact a wh-pronoun, and I myself agree with this conclusion (Hudson, 1990:374). Even this analysis is problematic, however, because whether, not being a true pronoun, has no grammatical role within its complement clause. In this respect it is just like if and that, as well as all the subordinating conjunctions, so it is at best a highly unusual wh-pronoun.

In conclusion, we have found no justification for Complementizer because there seem to be no generalizations that apply to all the core members. This means that it is not enabling the grammar to express any generalizations, so according to
Principle 1, Complementizer does not exist as a category, so (a fortiori) it is not an FWC.
4. PRONOUN

We now make a slight detour from the main discussion in order to establish the controversial claim that pronouns are nouns, which will play an important part in the next section's discussion of determiners. As it happens, Pronoun is itself claimed to be an FWC (Radford, 1997:48-49) on the grounds that pronouns are determiners and that Determiner is an FWC. The status of Pronoun as an FWC is thus tied up with that of Determiner, which is the topic for the next section. If, as I shall argue, Determiner is not an FWC, Pronoun cannot be either. However, the argument there presupposes a specific set of non-standard analytical assumptions about the classification of pronouns: that Pronoun is a subclass of Noun, and that determiners are pronouns. Before we can consider the status of Pronoun as an FWC, therefore, we must attend to these analytical questions.

Why should we take Pronoun as a subclass of Noun? Radford's discussion considers only one kind of pronoun, personal pronouns, but it is uncontroversial that there are other subclasses, including reflexive, reciprocal, interrogative, relative, demonstrative, negative, distributive, and compound. These subclasses are presented in Table 2, a reminder that our Pronoun is the traditional word class, not the much smaller category that Chomsky (1995:41) calls Pronoun. His category excludes reflexive and reciprocal pronouns and is claimed always to refer, so it presumably excludes the indefinites as well.

TABLE 2
SUBCLASSES OF PRONOUN

Class           Definiteness   Examples
Personal        Definite       I/me, you, he/him, one(?)
Reflexive       Definite       myself, yourself, himself
Reciprocal      Definite       each other, one another
Relative        Definite       who, which, whose, where, when
Demonstrative   Definite       this/these, that/those
Possessive      Definite       mine, yours, his; -'s(a)
Distributive    Definite       each
Universal       Indefinite     all, both
Existential     Indefinite     some, any, either
Negative        Indefinite     none, neither
Interrogative   Indefinite     who, what, which, how, why
Compound        Indefinite     someone, anybody, nothing, everywhere

(a) The analysis of possessive 's is controversial. I shall simply assume that it is a possessive pronoun; for evidence see Hudson (1990:277).

What all these words share is the ability to be used in the range of phrasal environments available for a full NP; for example, they/them has almost exactly the same overall distribution as the students:

(15) a. They/the students have finished.
     b. Have they/the students finished?
     c. I like them/the students.
     d. We talked about them/the students.
     e. I saw them and the students.
There are obvious differences of detail that apply to specific subclasses (e.g., personal pronouns cannot be used before possessive -'s, reflexives cannot be used as subject), but the overall similarity between pronouns and NPs is striking. On the other hand, pronouns are different from common nouns in several ways, in particular in their inability to combine with a preceding determiner (*the I, *a who, etc.). The range of possible modifiers is also strictly limited compared with common nouns, though some allow adjuncts (someone nice, who that you know) and, depending on one's analysis, some allow complements (who came, this book).7

Traditional grammar recognizes Pronoun as a supercategory, one of the basic parts of speech alongside Noun, and links the two classes by the enigmatic statement that pronouns "stand for" nouns. In modern phrase-structure analyses the similarity is shown at the phrase level by giving the same label to the phrases that they head. This label is either DP or NP, depending on the analysis, but this choice is crucial to the following argument, so we shall keep it open by temporarily adopting the neutral term "Nominal Phrase." Thus they is not only a pronoun but also a nominal phrase, and the students is a nominal phrase.

Suppose we accept this analysis, and also the general X-bar principle that a phrase's classification must be that of its head word. Given these two assumptions, what follows for the classification of they and students? (The next section will discuss the classification of the.) We have to choose between two answers:

A1 (the standard analysis): They belongs to the same class as the, which (for the time being) we can call "determiner"; we shall revise the name in the next section.

A2 (my preferred analysis): They belongs to the same class as students: Noun.

The choice between the two analyses revolves around the analysis of the one-word phrase students, where there is no determiner:

(16) a. I like students.
     b. I found students waiting for me.
If students really is the only word in this phrase (as I shall argue), its classification must project to the phrase, which must therefore be an NP; so they must also head an NP and must itself be a noun. If, on the other hand, the phrase students is headed by a covert determiner, it must be a determiner phrase and they must be a determiner. The standard analysis stands or falls with the covert determiner. We shall now consider some arguments for it, and an argument against it.

One argument for the covert determiner is that it is required by the DP analysis (Abney, 1987), which is widely accepted. If the is the head of the students, the phrase the students must be a projection of the: a DP. Therefore the one-word phrase students must be a DP, with a covert determiner. However, this argument rests on the assumption that the is not a noun. If it were, both phrases would be NPs and there would be no need for a covert determiner. The next section will be devoted to this claim, so I shall simply note at this point that (as far as I know) the category Determiner has always been taken for granted in discussions of the DP hypothesis, so there is no "standard" evidence for it. I cannot prove that such evidence does not exist, but I shall prove that a coherent analysis can be built without this assumption.

Radford gives some more direct evidence in support of the covert determiner (1997:152). He points out that (if it exists) it selects the same kind of complement as enough,8 namely a plural or mass common noun:

(17) a. I found things/stuff/*thing.
     b. I found enough things/stuff/*thing.

It also has an identifiable determiner-like meaning, which is either "existential," as in the above examples, or "generic," as in the following:

(18) I collect things/stuff/*thing.
In short, the covert determiner is a normal determiner in its semantic and syntactic characteristics, so its only peculiarity is its lack of phonology. However, this argument is open to various empirical objections:

• It is because of the semantics of enough that its complement must be either plural or mass; so we might predict that the word meaning enough in any other language will have the same restriction. In contrast, the restrictions on the hypothetical covert determiner vary between languages—in many languages it would not be as restricted as in English. So even if the covert determiner selects the same kind of complement as some determiners, it does not select in the same way.

• The word sufficient imposes the same restriction on the semantics of the noun as does enough, and also, like enough, it excludes (other) determiners.

(19) a. enough/sufficient things/stuff/*thing
     b. some/the *enough/*sufficient stuff
But to judge by the adverb sufficiently, sufficient is an adjective, which weakens the argument for a covert determiner wherever the noun must be plural or mass. It could equally be argued either that enough is an adjective, or that there is a covert adjective whose meaning and distribution are like those of sufficient.

• The fact that the covert determiner allows generic and existential reference only shows that it places no restrictions on that aspect of reference. In contrast, overt determiners typically do restrict it; for example, some excludes generic reference.

• What the covert determiner does exclude is "definite" reference—reference to an object that is already known to the addressee; but definiteness is one of the main differences between common nouns and proper nouns, which are inherently definite. This suggests that the indefiniteness of the one-word phrase students is inherent to the common noun, rather than due to a covert determiner.

We now consider the alternative analysis of the one-word phrase students in which there is no covert determiner, my analysis A2. The syntactic restrictions can be reversed: instead of saying that the covert determiner selects plural or mass common nouns, we have Rule 1:9

(20) Rule 1: Singular, countable common nouns must be the complement of a determiner.10
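Rule 1 can be pictured as a simple well-formedness check on nominal phrases: only one combination of features forces a determiner. This is a toy sketch of the logic of the rule, not Hudson's formalism; the feature names are my own assumptions.

```python
# Toy check for Rule 1: a singular, countable common noun may not stand
# alone -- it must be the complement of a determiner. All other nouns
# (plural, mass, proper) may head a one-word phrase.
# (Feature names are illustrative assumptions, not Hudson's notation.)

def needs_determiner(features):
    return (features["common"]
            and features["number"] == "singular"
            and features["countable"])

def phrase_ok(features, has_determiner):
    return has_determiner or not needs_determiner(features)

students = {"common": True, "number": "plural",   "countable": True}
student  = {"common": True, "number": "singular", "countable": True}

print(phrase_ok(students, has_determiner=False))  # True:  "I like students"
print(phrase_ok(student, has_determiner=False))   # False: *"I like student"
print(phrase_ok(student, has_determiner=True))    # True:  "I like the student"
```

Stating the constraint this way, on the noun itself, is the mirror image of stating it as the selectional property of a covert determiner, which is exactly the reversal the text describes.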
As for the indefinite meaning of students, we can follow the suggestion made above: just as proper nouns are inherently definite, so common nouns are inherently indefinite. In both cases the default meaning may be overridden—in the terms of Pustejovsky (1991), "coerced"—by the meaning imposed by a determiner; so a common noun may be coerced into definiteness (the students), and a proper noun into indefiniteness (a certain John Smith).

In this analysis, a has a special status similar to that of the dummy auxiliary do. There are purely syntactic patterns, such as subject inversion, that are only available for auxiliary verbs; so if the meaning to be expressed does not require any other auxiliary verb, do can be used because it has no meaning of its own, and therefore does not affect the meaning. Similarly for a: Rule 1 sometimes requires a determiner, so if no other determiner is required by the meaning to be expressed, a "dummy" determiner is needed which will not affect the meaning. This is a, whose only contribution to meaning is to restrict the sense to a single individual (thereby excluding plural and mass interpretations).

This analysis of a leads to a positive argument against the covert-determiner analysis. The argument involves predicative uses such as the following:

(21) a. They seem (*some/*no/*the) good linguists.
     b. He is a/*any/*no/*the good linguist.

One of the restrictions imposed by verbs like seem is that a complement nominal must not contain a determiner other than a.11 Ignoring the exception of a, this
restriction is quite natural if we assume, first, that determiners relate semantically to a nominal's referent rather than to its sense, and, second, that a predicative nominal has no referent (as witnessed by the oddity of the question Which good linguists do they seem?). On these assumptions, it is natural to assume that good linguists has no determiner at all, rather than that it has a covert one: so in Radford's terms, it is an NP, not a DP. But in that case it is hard to explain the determiner a in the second example—why is a not only possible, but obligatory? The DP analysis forces a disjunction: the predicative complement of verbs like seem is either an NP or a DP headed by a. (Alternatively, the disjunction may be between a and the covert determiner as the head of the DP.)

Now consider the no-determiner analysis. Suppose we assume, first, that seem requires its complement to have no referent, and second, that most determiners have a referent. These two assumptions immediately explain the basic fact, which is the impossibility of determiners. The other fact is the appearance of a, which is also explained by two assumptions that we have already made: that Rule 1 requires a determiner before singular countable common nouns, and that a does not have to have a referent—it is a semantically empty, dummy word like the auxiliary do. The result is that a is both obligatory and possible with linguist, but neither needed nor possible with linguists.

Rather tentatively, therefore, we may be able to conclude that nominal phrases need not contain a determiner, so they must be projections of Noun. Therefore pronouns too must be nouns. We can still distinguish Pronoun from Common Noun and Proper Noun as subclasses of Noun. As shown in Table 2, Pronoun has its own subclasses, and no doubt the same is true for Common Noun.
The hierarchical structure is shown (using Word Grammar notation) in Figure 1, and more details of the assumed analysis can be found in Hudson (1990:268-335). The conclusion of this section is that Pronoun is a subclass of Noun. This view is quite traditional and fairly widely held (Huddleston, 1984:96), but it is controversial. On the one hand the traditional part-of-speech system treats Pronoun as a separate superclass, and this tradition persists in most modern descriptive grammars (e.g., Quirk et al., 1985:67). On the other hand, the modern DP analysis treats it as a subclass of Determiner, which itself is a distinct supercategory (Radford, 1997:154). If the present section is right, at least this part of the DP analysis must be wrong.
5. VALENCY AND ITS IRRELEVANCE TO CLASSIFICATION

In preparation for the discussion of determiners we must establish another general principle, which is merely a particular application of Principle 1 (Occam's razor). It concerns the treatment of valency (alias subcategorization), the restrictions that words place on their complements.12 Various devices are available for
[Figure 1: the classification hierarchy of Noun and its subclasses, shown in Word Grammar notation; figure not reproduced.]
stating these restrictions: subcategorization frames, Case-marking, SUBCAT lists, linking rules, and so on. However formulated, these restrictions can be stated on an item-by-item basis, as facts about individual lexical items. There is no need to recognize a word class for every valency pattern unless the pattern correlates with some other shared characteristic. In fact, more strongly, it would be wrong to recognize such a word class, because the class would do no work. As we can see in the following abstract example, it would actually make the grammar more complex without permitting any additional generalization. Given some valency pattern V, which is shared by words A, B, and C, the simplest grammar relates V directly to A, B, and C, giving at most13 three rules:
(22) a. A has V.
     b. B has V.
     c. C has V.

Now consider the alternative grammar in which A, B, and C are grouped into a word class W, whose sole defining characteristic is that its members have valency pattern V. In this grammar there must be four rules, because the membership of A, B, and C in W must be stipulated:14

(23) a. A is a W.
     b. B is a W.
     c. C is a W.
     d. W has V.
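The rule-counting arithmetic behind (22) and (23) can be made concrete. The sketch below is my own encoding, not any formalism Hudson uses: each statement of the grammar becomes one dictionary entry, so the two grammars can simply be compared by size.

```python
# Grammar (22): valency V stated directly on each word -- three rules.
direct = {"A": "V", "B": "V", "C": "V"}

# Grammar (23): the same facts routed through a word class W -- four rules:
# three membership stipulations plus one valency statement on W.
membership = {"A": "W", "B": "W", "C": "W"}
class_valency = {"W": "V"}

rules_direct = len(direct)                             # 3 rules
rules_classed = len(membership) + len(class_valency)   # 4 rules

print(rules_direct, rules_classed)  # 3 4
```

As long as V is the only property routed through W, the class-based grammar is strictly larger, which is the sense in which W "does no work."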
So long as V is the sole characteristic shared by these three words, the grammar with W is clearly inferior to the one without it. In short, valency facts have the same status as any other facts. The above conclusion follows directly from Principle 1, but I shall state it as a separate principle:

(24) Principle 2 A word class should not be recognized if its sole basis is in valency/subcategorization.

For example, if a verb's lexical entry shows that it needs a particle (e.g., give up), there is no point in duplicating this information by also classifying it as a "particle-taking verb" unless such verbs also share some other characteristic. So long as the complementation pattern is their only shared feature, the class Particle-taking Verb is redundant.

In most cases this principle is quite innocuous. It does, however, conflict with the traditional Bloomfieldian idea that differences of "distribution" justify differences of word class. This idea is still widely accepted (or at least taught):

The syntactic evidence for assigning words to categories essentially relates to the fact that different categories of words have different distributions (i.e., occupy a different range of positions within phrases or sentences). (Radford, 1997:40; italics in original)
Principle 2 means that some distributional differences are not relevant to categorization, because they are best handled by means of lexical valency. Consider, for instance, the traditional classification of verbs as transitive or intransitive. These terms are simply informal descriptions of valency patterns that can be stated better in valency terms so far as they correlate with nothing else. Indeed, valency descriptions may be preferable to word classes even when there are other correlating characteristics. For example, as the classic Relational Grammar analyses showed (Blake, 1990), it is better to describe the facts of the French faire-faire construction in terms of valency than in terms of transitive and intransitive verbs.

(25) a. Paul fait rire Marie.
        Paul makes laugh Mary
        'Paul makes Mary laugh.'
     b. Paul fait lire la lettre à Marie.
        Paul makes read the letter to Mary
        'Paul makes Mary read the letter.'

Described in terms of verb classes, as in (26a), the facts appear arbitrary; but explanation (26b) allows them to follow from the assumption that a verb cannot have two direct objects.
(26) a. The direct object of faire is demoted to an indirect object if its infinitive complement is transitive.
     b. The direct object of faire is demoted to an indirect object if it also has a direct object raised from its infinitive complement.

It should be noted that Principle 2 is not a wholesale "rejection of distributionalism" (as one reader complained), but simply a recognition that the syntactic distribution of a word has two separate components. One component involves its relations to "higher" structures, via the phrase that it heads. Thus, when seen as head, a preposition is used in "prepositional" environments, a noun in nominal environments, and so on. These are the distributional facts for which word classes are essential. However, the other component involves its valency, its relations to "lower" structures. Here word classes are less helpful because the facts concerned vary lexically and/or semantically: different members of the same word class allow complementation patterns that vary in complex ways that have little to do with word classes, but have a great deal to do with semantics.
6. DETERMINER

Turning then to Determiner, this is another class that Radford presents as a functional category (1997:45-48). In this case I shall use Principle 2 to argue that there is in fact no word class Determiner, because Determiner would be a subclass defined solely by valency. As in other analyses, I shall classify determiners with pronouns, but I shall also argue that the superclass that contains both determiners and pronouns is actually Pronoun, not Determiner. The analysis will build on the idea of section 4 that pronouns are nouns.

The first step in the argument is to establish that many determiners can also be used as pronouns. This overlap of membership has often been noticed, and has led to various analyses in which pronouns are treated as determiners (Radford, 1997:154). The earliest of these analyses was that of Postal (1966), who argued that pronouns are really "articles" generated in what would nowadays be called the Determiner position, and which may be followed by a noun, as in we linguists. The DP analysis of nominals continues the same tradition, in which the classes of determiners and pronouns overlap, so the general idea is now very familiar and widely accepted. As Radford points out (1997:49), some of the words vary morphologically according to whether they are used as pronouns or as determiners; for example, none is the pronoun form that corresponds to the determiner form no, and similarly for mine/my, yours/your, hers/her, ours/our, and theirs/their. However, the recent tradition assumes that these variations can be left to the morphology and ignored in the syntax. This seems correct (Hudson, 1990:269).
The second step involves the syntactic relationship between the determiner and the common noun. The DP tradition that Radford reviews takes the determiner as the head of the phrase. This is also the analysis that I have advocated for some time (Hudson, 1984:90), so I shall simply accept it as given. One of the benefits of the analysis is that it explains why the common noun is optional after some determiners but obligatory after the others (namely, the, a, and every): each determiner has a valency that not only selects a common noun but decides whether it is obligatory or optional, in just the same way that a verb's valency determines the optionality of its object. In other words, the lexical variation between determiners that I shall review is just what one would expect if the common noun is the determiner's complement.

The final step is to show that this analysis is only partially right. Determiners are in fact pronouns, rather than the other way round. This may sound like a splitting of hairs, but it will make a great deal of difference to our conclusion. This alternative is not considered in the DP tradition, so there is no case to argue against. In favor of it, we can cite the following facts:

• When a determiner/pronoun occurs without a common noun, it is traditionally called a pronoun, not a determiner. It seems perverse to call she and me determiners in She loves me, as required by the DP analysis. In contrast, the traditional analysis would (incorrectly) treat this in this book as an adjective, so it is no more perverse to call it a pronoun than to call it a determiner.

• Almost every determiner can also be used as a traditional pronoun, but most traditional pronouns cannot also be used as determiners. As mentioned earlier, the only determiners that cannot be used without a common noun are the articles the and a, and every. If we ignore the morphological variation discussed above, all the others can be used either with or without a common noun:

(27) a. I like this (book).
     b. Each (book) weighs a pound.
     c. I found his (book).
     d. I found the *(book).
     e. Every *(book) weighs a pound.
Given that Determiner is almost entirely contained in Pronoun, it seems perverse to call the superset Determiner. It seems, therefore, that "determiner" may be just an informal name for a particular kind of pronoun, namely a pronoun whose complement is a common noun (or, in more orthodox terms, a phrase headed by a common noun). I believe I may have been the first to suggest this (Hudson, 1984:90), but others have arrived independently at the same conclusion (Grimshaw, 1991; Netter, 1994). As promised, this change of terminology has far-reaching consequences. The pronouns that allow nominal complements are scattered unsystematically across
Richard Hudson
the subclasses distinguished in Table 2, with representation in nine of the subclasses: personal (we, you), relative (whose), demonstrative (this/these, that/those), possessive (my, your, etc.), distributive (each, every), universal (all, both), existential (some, any, either), negative (no, neither), and interrogative (which, what). (The classification also needs to accommodate the articles the and a, but this is irrelevant here.) So far as I know there is no other characteristic that picks out this particular subset of pronouns. It is true, for example, that determiners are also distinguished by the fact of being limited to one per nominal phrase (e.g., unlike Italian, we cannot combine both the and my to give *the my house). But this simply reflects a more general fact about pronouns: they are not allowed as the complement of a pronoun. So long as the complement of a pronoun is limited to a phrase headed by a common noun, the one-per-NP restriction will follow automatically.15

What this observation suggests is that valency is the sole distinguishing characteristic of determiners; in short, that determiners are just the subset of pronouns that happen to be "transitive." If this is so, Determiner is ruled out by Principle 2 as redundant. Instead of classifying this as a determiner, therefore, we just say that it allows a complement, and similarly for all the other determiners. Admittedly, this misses an important shared characteristic of determiners, which is that their complements are common nouns; but this generalization can be captured in terms of Pronoun. Indeed, we can even generalize across subclasses of pronoun about their valency, but without invoking Determiner. The following mini-grammar suggests how determiners should be treated:16

(28) a. which is an interrogative pronoun.
     b. which allows a complement.
     c. this is a demonstrative pronoun.
     d. A demonstrative pronoun allows a complement.
     e. The complement of a pronoun is a common noun.
Rule (b) treats the valency of which as an arbitrary fact about this pronoun, whereas (d) generalizes across both the demonstrative pronouns; and (e) defines the possible complements for all pronouns (without, however, implying that all pronouns allow a complement). If determiners really are transitive pronouns, two things follow. First, so-called DPs must in fact be NPs because their head, the determiner, is a pronoun and pronouns are nouns. Therefore the head projects to NP, not DP. Second, and most important for present purposes, at least in English there is no category Determiner, so Determiner cannot be any kind of category, functional or otherwise. If correct, this conclusion should be worrying for those who advocate FWCs because I have now eliminated the two most quoted examples of FWC, Complementizer and Determiner.
Grammar without Functional Categories
7. FWCs AS CLASSES OF FUNCTION WORDS

I turn now to a more general consideration of FWC in terms of its defining characteristics: what is it about a category that makes it qualify as an FWC? A satisfactory answer will draw a boundary around FWCs which satisfies three criteria.

First, the boundary should be "in the right place," corresponding to the general intuitions that Auxiliary Verb and Pronoun are candidates for FWC, but Full Verb and Common Noun are not. How can we decide where the right place is, other than by invoking these intuitions? Fortunately, we already have a criterion: Principle 1. The category FWC will be justified to the extent that it allows generalizations that are not otherwise possible. This means that we must look for distinct characteristics that correlate with each other, comparable to those that distinguish (say) Verb from Noun. Without such correlations, any choice of criteria is arbitrary; but with them it becomes a matter of fact. If it turns out that criteria A and B correlate, and both characterize categories X, Y, and Z, then it is just a matter of fact, not opinion, that X, Y, and Z belong to a supercategory. If A and B are among the standard characteristics of FWCs, then we can call this supercategory "FWC"—provided it satisfies the remaining criteria.

Second, the boundary should have "the right degree of clarity." By this I mean that it should probably be an all-or-nothing contrast, and not a matter of degree. This criterion is important because of the kinds of generalizations that are expressed in terms of FWC—categorical generalizations such as the one quoted earlier about the links between FWC and feature checking. It would be impossible to evaluate such generalizations if categories had different degrees of "functionality."

Third, FWC should have the "right kind of membership," because as a supercategory its members must be categories, not words. If it includes some members of a category, it must include them all.
We start then with a widely accepted definition of FWC, which links it to the traditional notion Function Word (FW): The lexical/functional dichotomy is rooted in the distinction drawn by descriptive grammarians . . . between two different types of words—namely (i) contentives or content words (which have idiosyncratic descriptive content or sense properties), and (ii) function words (or functors), i.e. words which serve primarily to carry information about the grammatical properties of expressions within the sentence, for instance information about number, gender, person, case, etc. (Radford, 1997:45)
As Radford says, this distinction is often drawn by descriptive grammarians, and is beyond dispute in the sense that there is an obvious difference between the meaning of a word like book or run and that of a function word such as the or will.
Cann (this volume) surveys a wide variety of criteria by which Function Word (his "FE") has been defined, and which all tend to coincide. The criteria are semantic, syntactic, and formal, and in each case FWs are in some way "reduced" in comparison with typical words. Semantically they lack what is variously called the "denotational sense" or "descriptive content" of words like tomato, syntactically their distribution is more rigidly restricted, and formally they tend to be short. We could even add to Cann's list of formal distinctions that are relevant to English:

• In terms of phonology, only FWs may have /ə/ as their only vowel.
• In terms of spelling, only FWs may have fewer than three letters.17
• In terms of orthography, only FWs are consistently left without capital letters in the titles of books and articles.18

In short, there can be no doubt about the reality and importance of FW as the carrier of a large number of correlating characteristics. Nevertheless, FW does not justify FWC because it fails the second and third criteria. As far as clarity of boundaries is concerned, there are too many unclear borderline cases that either have one of the characteristics only to some degree, or that have some characteristics of FWC but not all of them. These uncertainties and conflicts are well documented by Cann. For example, there is no clear cutoff between having descriptive content and not having it, so there are borderline cases such as personal pronouns. Unlike clear FWs, these each have a distinct referent, and they also involve "descriptive" categories such as sex, number, and person. Perhaps because of this they tend to be capitalized in book titles (contrary to the pattern for FWs mentioned above). Similarly, it is hard to see where to draw the line, on purely semantic grounds, between the FW "and" and what is presumably not an FW, "therefore": where, for example, would "so" belong?
If even one of the characteristics of FW is indeed a matter of degree, determined by the "amount of meaning" carried, it cannot be mapped onto the binary contrast between functional and substantive categories. This is not an isolated case: formal brevity is even more obviously a matter of degree, and it is hard to imagine that there is a clear threshold for syntactic limitation. This applies similarly to the third criterion, concerning the membership of FWC. If it is indeed a set of word classes, then for any given word class either all of its members should belong to FWC, or none of them should. There should be no split classes (what Cann calls "crossover" expressions). And yet, as Cann points out, split classes do seem to exist. The clearest example of a split word class is Preposition. Some prepositions are very clear FWs—for example, of, at, in, to, and by all qualify in terms of both meaning and length. Indeed, all these prepositions have regular uses in which they could be said to have no independent meaning at all:
(29) a. the city of Glasgow
     b. at home
     c. believe in fairies
     d. take to someone
     e. kidnapped by gangsters
In each example the preposition is selected by the construction, and does not contrast semantically with any other. On the other hand, there are also prepositions like during and after that have just as much meaning as some adverbs—indeed, there are adverbs that are synonymous except for being anaphoric (meanwhile, afterwards). If the adverbs are content words, the same should presumably be true of their prepositional synonyms. But if this is right, some prepositions are content words and some are FWs. This should not be possible if it is whole word classes that are classified as FWCs.

A similar split can be found even among nouns and verbs, though in these classes there are very few members that have the "wrong" classification. The anaphoric one is an ordinary common noun (with the regular plural ones):

(30) a. He lost the first game and won the second one.
     b. He lost the first game and won the other ones.

One (in this sense) has no inherent meaning except "countable," since it borrows its sense from its antecedent by identity-of-sense anaphora. But it behaves in almost every other respect just like an ordinary common noun such as book—it accepts attributive adjectives, it inflects for number, and so on. Similarly for the British English anaphoric do, which is an ordinary nonauxiliary verb:

(31) a. He didn't call today, but he may do tomorrow.
     b. A: Does he like her? B: Yes, he must do—just look how he talks to her.

This too is completely empty of meaning—it can borrow any kind of sense from its antecedent, stative or active—and yet we use it syntactically exactly as we use an ordinary verb like run. Their lack of meaning suggests that both these words are function words—an analysis that is further supported by their shortness (do has only two letters, and one has a variant with /ə/, which is often shown orthographically as 'un: a big 'un). And yet they are clear members of classes whose other members are content words.
A similar problem arises with a widely accepted definition of FW in terms of "thematicity" (Radford, 1990:53, quoting in part Abney, 1987:64-65). According to Radford, FWs are "nonthematic," by which he means that even if they assign a theta role to their complement, they do not assign one to their specifier: for example, consider the auxiliary may in the following:

(32) It may rain.
This has a thematic relationship to its complement "it . . . rain," but not to its subject. This may well be a general characteristic of FWs, but it does not apply in a uniform way to all members of the two main verb classes, Full Verb and Auxiliary Verb. On the one hand, as Radford recognizes, there are full verbs that are nonthematic (e.g., seem) and on the other, there are auxiliary verbs that are thematic. Perhaps the clearest example of a thematic auxiliary is the possessive auxiliary have, which for most British speakers can be used as in the following examples:

(33) a. Have you a moment?
     b. I haven't time to do that.

The inversion and negation show that have is an auxiliary, but it means "possess" and assigns a thematic role to both its subject and its complement. Some modal verbs also qualify as thematic auxiliaries. The clearest example is dare, but we might even make the same claim regarding can if we can trust the evidence of examples like the following:

(34) a. Pat can swim a mile.
     b. Pat's ability to swim a mile is impressive.

The words Pat's ability to swim a mile are a close paraphrase of example (a), so they should receive a similar semantic (or thematic) structure; but if the ability is attributed directly to Pat, as in (b), there must be a thematic link not only in (b) but also in (a). If this is true, can in (a) must be thematic, because it assigns a role to its subject as well as to its complement. In short, although most auxiliary verbs qualify as FWs, there are some that do not. This is not what we expect if Auxiliary Verb, as a whole, is an FWC.

In conclusion, we cannot define FWC in terms of FW because the latter does not have suitable properties. As Cann says, FW is a "cluster concept," which brings together a range of characteristics that correlate more or less strongly with one another, but which does not map cleanly onto word classes.
The boundary of FW runs through the middle of some word classes, and the criteria that define FW are themselves split when applied to word classes. There is no doubt that a grammar should accommodate FW in some way, but not by postulating FWC.
8. FWCs AS CLOSED CLASSES

Another definition that has been offered for Functional Category refers to the distinction between open and closed classes, which again is part of a fairly long tradition in descriptive linguistics (Quirk et al., 1985:71; Huddleston, 1984:120). For example, Haegeman (1994:115-116) invokes the contrast between closed
and open classes when she first introduces the notion Functional Projection (her equivalent of Functional Category). It is also one of the criteria in Abney (1987:64). This distinction is different from the function-content distinction because it applies to classes rather than to their members. A class is open if it can accept new members, and closed if it cannot, regardless of what those members are like. This looks promising as a basis for the definition of FWC—at least it should satisfy our third criterion of having whole classes rather than individual words as members. However, this definition fares badly in relation to the other two criteria. Once again one problem is that the closed-open distinction is a matter of degree, whereas categories must be either functional or substantive; in short, this criterion fails on the second test, clarity of the boundary. The distinction really belongs to historical linguistics because the addition of new vocabulary changes the language and involves two diachronic processes: creative word formation and borrowing. Borrowing is the most relevant of these because it is the one most usually mentioned in discussions of the closed-open distinction. The question, then, is whether there is, in fact, a clear distinction between word classes that do accept loans (or calques) from other languages, and those that do not. Among historical linguists the answer is uncontroversial (e.g., Bynon, 1977:255; Hudson, 1996:58-59). There is no such distinction, only a gradient from the most "open" class, Noun, to the most closed ones (such as Coordinating Conjunction). Even the most closed classes do accept some new members.
For example, in English, the list of personal pronouns has seen some changes through time, with the recent addition of one, "people," and the much older addition of they, them, and their; and even Coordinating Conjunction has a penumbra of semimembers (yet, so, nor—see Quirk et al., 1985:920), which may presage a future change of membership.

Another way to approach the closed-open distinction would be to consider the actual size of the membership, giving a distinction between "large" classes and "small" ones. However, this is obviously likely to be a matter of degree as well, and it is precisely in the classes of intermediate size that uncertainty arises. Preposition is a clear example, with about seventy clear single-word members (Quirk et al., 1985:665), several of which are loans (via, per, qua, circa, versus, vis-à-vis, save). Chomsky (1995:6) appears to classify Preposition as a functional category,19 but Radford does not (1997:45). Quirk et al. (1985:67) classify Preposition as a closed class, in spite of the evidence in their own list, and Haegeman (1994:115) recognizes that it is a "relatively closed class," but nevertheless classifies it as a substantive category. As we saw in the last section, Preposition is also a troublesome borderline case for the definition of FWC in terms of FW.

The closed-class definition of FWC also fails on the first test by not putting the boundary in the right place. The problem is that it is easy to find examples of closed classes that are not FWC by any other criteria—and in particular, not FWs.
Cann lists a number of examples such as points of the compass and days of the week. These have some idiosyncratic syntactic characteristics, but in most respects they are ordinary common or proper nouns (to the north, on Wednesday). The membership of these classes is rigidly closed, so should we conclude that they are FWCs in spite of being nouns? This discussion has suggested that FWC is not the natural extension of Closed Class that it may seem to be. Classes are more or less closed, but categories are not more or less functional, and a closed class may be a subset of an open one.
9. GRAMMAR WITHOUT FWCs

If the previous arguments are right, the notion FWC has never been defined coherently, so we cannot be sure which categories are functional and which are not. Moreover, we have found that two of the clearest examples, Complementizer and Determiner, are not even word classes, let alone functional word classes. It therefore seems fair to conclude (with appropriate reservations about subword and position categories) that there may not in fact be any functional categories at all.

However, it would be wrong to end on such a negative note, because the discussion also has a positive outcome: the validity of the notion Function Word as a cluster concept defined by a combination of characteristics. Even if FW does not justify FWC, it deserves some place in a grammar, but what place should it have? The basis for FW is that words that have very little meaning tend also to have very little form and very little syntactic freedom. One possibility is that this is a fact that is available only to academic observers of language, comparable with the facts of history and worldwide variation; but this seems unlikely, as the raw data are freely available to every speaker, and the correlations are both obvious and natural—indeed, iconic. It seems much more likely that FW is part of every speaker's competence, but cluster concepts are a challenge for currently available theories of grammar,20 especially when some of the concepts are quantitative (amount of meaning, amount of form, amount of freedom).
NOTES

1. This chapter has changed considerably since my presentation at the Bangor conference on syntactic categories in 1996. It has benefited greatly from discussion at that conference and at a seminar at University College London, as well as from the individual comments of And Rosta, Annabel Cormack, and, in particular, Bob Borsley. It also takes account of what two anonymous referees said about an earlier version. I am grateful to all these colleagues who have helped me along the way, and especially to Ronnie Cann for generously showing me several versions of his paper.
2. Grammatical terms may be used either as common nouns (e.g., a noun; two nouns) or as proper nouns (e.g., (The class) Noun is larger than (the class) Adjective). I shall capitalize them when used as proper nouns.

3. To give a flavor of the debate, consider the argument that the position C allows a simple explanation for word-order facts in Germanic languages. In V2 order, the verb is said to be in C, so if C is filled by an overt complementizer, the verb cannot move to C—hence clause-final verbs in subordinate clauses. Unfortunately for this explanation, it is not only complementizers that trigger clause-final verb position—the same is true of all traditional "subordinating conjunctions," relative pronouns, interrogative pronouns, and so on, none of which are assumed to be in C position. If they are (say) in Spec of C, why can't the verb move to C, as in a main clause?

4. The treatment of "zero" is irrelevant to our present concerns because any solution will pair zero with just one complementizer, that. My own preferred solution was suggested by Rosta (1997), and involves a special relationship "proxy." Verbs that allow either that or zero select a proxy of a tensed verb, which is either the tensed verb itself, or the instance of that on which it depends. In this way I avoid positing a "zero complementizer," while also avoiding the repeated disjunction "a tensed clause introduced by that or by nothing." As Rosta points out, another advantage of this analysis is that it allows us to refer directly to the finiteness of the lower verb. One part of finiteness is the contrast between indicative and "subjunctive," as in (1).

(1) I recommend that Pat be the referee.

A verb such as recommend may select the proxy of a subjunctive verb as its complement, which means that followed by a subjunctive verb.
If it turns out that subjunctive verbs almost always occur with that, this fact can be built into the definition of "proxy of subjunctive verb," just as the optionality of that is built into that of "proxy of indicative verb."

5. A referee comments that the same is true of Preposition: no verb selects generally for PP, but many verbs select either individual prepositions (e.g., depend selects on) or some meaning which may, inter alia, be expressed by a PP (e.g., put selects a place expression such as on the table or here). This is true, but all it shows is that Preposition is not relevant to subcategorization. It is not a reductio ad absurdum of Principle 1, because Preposition can be justified in other ways (e.g., in terms of preposition stranding and pied-piping).

6. See footnote 4 for my preferred treatment of subjunctive selection.

7. According to the analysis for which I shall argue in section 5, determiners are pronouns that take common nouns as their complements. Wh-pronouns also take complements, though their complements are the finite verb in the clause that they introduce (Hudson, 1990:362). Given these two analyses, it follows that one pronoun may even have two complements, a common noun and a finite verb, as in which students came?

8. The word enough is a poor example of a determiner, as it is more like a quantity expression such as much and many—indeed, Rosta has suggested (1997) that the surface word enough corresponds to a pair of syntactic words, much/many enough.

9. Rule 1 ignores examples like the one Radford cites, our (8a):

(1) Pat is head of the department.

This shows that some singular countable common nouns can sometimes be used without a determiner, but this possibility depends both on the noun itself and on the containing
phrase. It is possible for names of professions (as in French), but not generally; and it is possible after the verb be or become, but not generally:

(2) a. Pat is head/*bore/*admirer of the department.
    b. Pat is/became/*introduced/*looked for head of the department.

If anything, the pattern confirms Rule 1, because the exceptions also involve a complement noun selecting the word on which it depends:

Rule 1 (exception): A profession noun may be the complement of the verb be or become.

Similar remarks apply to other well-known examples like the following:

(3) a. We were at school/college/*cinema.
    b. He was respected both as scholar and as administrator.

10. This analysis reverses the usual relationship between complements and heads. In general, heads select complements, but in this analysis it is the complement (the common noun) that selects the head (the determiner). This relationship is not without precedent, however. In Romance languages, the perfect auxiliary is selected by its complement (e.g., unaccusative verbs such as "go" select "be," while other verbs select "have"). In English, the adjective same selects the determiner the (the/*a/*my same person), and own selects a possessive determiner (my/*the/*an own house).

11. The possibilities are different if the complement contains a superlative adjective:

(1) a. That seems the best solution.
    b. That seems my best option.

Thanks to Annabel Cormack for this detail. And Rosta also points out the possibility of no before certain adjectives:

(2) a. She seems no mean linguist.
    b. That seems no small achievement.

12. It could be argued that valency should also cover subjects/specifiers, but this is a separate issue.

13. It makes no difference to this argument whether valency patterns are stipulated or derived by general linking rules from argument structure. In either case, the links between the words and the word class have to be stipulated.

14. It makes no difference how the class-membership is expressed. I use Word Grammar terminology here, but the same objection would apply to a feature notation because the relevant feature [+W] would be stipulated lexically.

15. If the equivalent of *the my house is permitted in some other language, this must be because the valency of the article allows a disjunction: either a common noun or a possessive pronoun. Similar minor variations are found in English—for example, universal pronouns (all, both) allow a pronoun or a common noun (all (the) men, both (my) friends). This is to be expected if the complement is selected lexically by the determiner, as claimed here.

16. This little grammar speaks for itself, but illustrates some important principles of Word
Grammar, such as the independence of rules about "possibility" and "identity," and generalization by default inheritance. (For more details see Hudson, 1990, 1998.)

17. The only nonfunction words whose only vowel is /ə/ are the interesting pair Ms. and Saint (pointed out by Piers Messum); and those that have fewer than three letters are go, do, and ox. The observation about spelling is in Albrow (1972) and Carney (1997).

18. For example, we all write Aspects of the Theory of Syntax, with the FWs of and the treated differently from the others. Even quite long FWs such as since and with may be treated in this way, but some short FWs tend to be capitalized, as in the following examples:

(1) a. But Some of Us Are Brave.
    b. What Do We Mean by Relationships?
    c. Up to You, Porky.
    d. Cosmetics: What the Ads Don't Tell You.
Usage is remarkably consistent across authors, but not completely consistent; for example I found its treated in both ways:

(2) a. Huddersfield and its Manufacturers: Official Handbook.
    b. The English Noun Phrase in Its Sentential Aspect.

19. More accurately, Chomsky lists just four substantive categories, including "particle," but gives an incomplete list of functional categories that does not include Preposition. His view of Preposition therefore depends on whether or not he intends it to be subsumed under Particle. To add to the uncertainty, in another place (1995:34) he includes Pre- or Postposition among the categories that are defined by the features [N, V], which presumably means that it is a substantive category, but Particle is not. He does not mention Adverb in either place.

20. I believe that Word Grammar offers as good a basis as any current theory for the treatment of cluster concepts. The "isa" relationship allows the members of FW to be either whole word classes (e.g., Auxiliary Verb) or individual words (e.g., that). Default inheritance allows individual FWs to lack any of the default characteristics, as required by any cluster concept. Functional definitions allow FW to have the same range of attributes as any specific word—a default meaning schema, a default valency pattern, even a default phonological and spelling schema. Thus the definition of FW might look something like this (where "whole" is the full inflected form—Creider and Hudson, 1999):
(1) a. FW has a complement.
    b. FW's sense and referent are the same as those of its complement.
    c. FW's whole is one syllable.
    d. FW's vowel may be /ə/.
    e. FW's written whole may be one letter.
This set of characteristics defines the typical FW, such as a, do, or of. Less typical FWs override one or more characteristics, so the more characteristics are overridden, the less typical they are. This captures one aspect of the quantitative variability of FWs. The other aspect involves the amount of variation on each of the individual characteristics; for example, a word that contains two syllables is clearly less exceptional than one with three, and a word whose sense supplies only one feature (e.g., "countable") is less exceptional than one that supplies two (e.g., "male," "singular").
REFERENCES

Abney, S. (1987). The English noun phrase in its sentential aspect. MIT dissertation.
Albrow, K. (1972). The English writing system: Notes towards a description. London: Longman.
Atkinson, M. (1994). Parameters. In R. Asher (ed.), Encyclopedia of language and linguistics (pp. 2941-2942). Oxford: Pergamon Press.
Blake, B. (1990). Relational grammar. London: Routledge.
Bynon, T. (1977). Historical linguistics. Cambridge: Cambridge University Press.
Carney, E. (1997). English spelling. London: Routledge.
Chomsky, N. (1995). The minimalist program. Cambridge, MA: MIT Press.
Creider, C., and Hudson, R. (1999). Inflectional morphology in Word Grammar. Lingua 107:163-187.
Crystal, D. (1995). The Cambridge encyclopedia of the English language. Cambridge: Cambridge University Press.
Grimshaw, J. (1991). Extended projection. Unpublished manuscript.
Haegeman, L. (1994). Introduction to Government and Binding Theory (2nd ed.). Oxford: Blackwell.
Huddleston, R. (1984). An introduction to the grammar of English. Cambridge: Cambridge University Press.
Hudson, R. (1984). Word grammar. Oxford: Blackwell.
Hudson, R. (1990). English word grammar. Oxford: Blackwell.
Hudson, R. (1995). Competence without Comp? In B. Aarts and C. Meyer (eds.), The verb in contemporary English (pp. 40-53). Cambridge: Cambridge University Press.
Hudson, R. (1996). Sociolinguistics. Cambridge: Cambridge University Press.
Hudson, R. (1997). The rise of auxiliary DO: Verb-non-raising or category-strengthening? Transactions of the Philological Society 95:1, 41-72.
Hudson, R. (1998). An encyclopedia of Word Grammar. Accessible via http://www.phon.ucl.ac.uk/home/dick/wg.htm
Larson, R. (1985). On the syntax of disjunction scope. Natural Language and Linguistic Theory 3:217-264.
Netter, K. (1994). Towards a theory of functional heads: German nominal phrases. In J. Nerbonne, K. Netter, and C. Pollard (eds.), German grammar in HPSG (CSLI Lecture Notes 46). Stanford: CSLI.
Ouhalla, J. (1991). Functional categories and parametric variation. London: Routledge.
Pollard, C., and Sag, I. (1994). Head-driven Phrase Structure Grammar. Chicago: University of Chicago Press.
Pollock, J.-Y. (1989). Verb-movement, universal grammar and the structure of IP. Linguistic Inquiry 20:365-424.
Postal, P. (1966). On so-called "pronouns" in English. Georgetown Monographs on Languages and Linguistics, 177-206.
Pustejovsky, J. (1991). The generative lexicon. Computational Linguistics 17:409-441.
Quirk, R., Greenbaum, S., Leech, G., and Svartvik, J. (1985). A comprehensive grammar of the English language. London: Longman.
Radford, A. (1990). Syntactic theory and the acquisition of English syntax. Oxford: Blackwell.
Radford, A. (1997). Syntactic theory and the structure of English: A minimalist approach. Cambridge: Cambridge University Press.
Rosta, A. (1997). English syntax and Word Grammar theory. PhD dissertation, London.
Smith, N., and Tsimpli, I. (1995). The mind of a savant: Language learning and modularity. Oxford: Blackwell.
Trudgill, P. (1990). The dialects of England. Oxford: Blackwell.
Zwicky, A. (1994). Syntax and phonology. In R. Asher (ed.), Encyclopedia of language and linguistics (pp. 4476-4481). Oxford: Pergamon Press.
FUNCTIONAL VERSUS LEXICAL: A COGNITIVE DICHOTOMY

RONNIE CANN
Department of Linguistics
University of Edinburgh
Edinburgh, United Kingdom
1. INTRODUCTION1

A persistent tendency within the grammatical tradition has been to divide grammatical categories and parts of speech into two superclasses. The distinction appears, for example, in the differentiation made between "grammatical" (or functor) expressions and "contentive" ones (Bolinger, 1975). The former consist of those expressions, words and bound morphemes, that serve a purely grammatical function, whereas the latter provide the principal semantic information of the sentence. In recent years within transformational syntax, the distinction has (re-)surfaced as a contrast between "functional" and "lexical" categories (Chomsky, 1995; Kayne, 1994; Ouhalla, 1991; Stowell, 1981; etc.). This distinction shares properties with that made between grammatical and contentive expressions in that it applies to bound morphs as well as to independent words and reflects a primary semantic distinction between theta-assigning (contentive) categories and non-theta-assigning (functional) ones (Grimshaw, 1990). It also reflects the distinction made in the classical grammatical tradition between "accidence" and "substance." The former refers primarily to the grammatical (morphological) categories exhibited by a language (such as case, tense, etc.) that are the parochial characteristics of word formation of a particular language, whereas the substantives are the linguistically universal classes and properties. Hence, functional elements may be associated with the accidental morphological properties of a
language and so implicated in parametric variation. Lexical expressions, on the other hand, provide the universal substance of the sentence through their semantic content. The significance of this distinction has apparently received strong psycholinguistic support over recent years, with extensive evidence that the processing of functional expressions differs from that of contentive ones (see below for references). Evidence from aphasic breakdown, language acquisition, priming experiments, and so on all indicate that a small subset of words are processed differently from the majority of the basic expressions of a language. This difference in processing may be argued to reflect the different syntactic properties exhibited by the two macroclasses of elements and hence provide a sound psychological underpinning to recent developments in linguistic theory. However, despite the centrality of functional categories within current linguistic theory and the robustness of the psycholinguistic evidence for their significance in processing, there remains considerable vagueness about what exactly the term functional picks out from the expressions of a language, what constitutes a functional category, and what is the relationship between functional expressions, broadly construed, and the functional categories identified for a language, either specifically or universally. Within transformational grammar, the functional categories typically include complementizer, tense, and agreement, and are distinguished from the major categories of noun, verb, adjective, adverb, and (to a certain degree) preposition. 
In the psycholinguistic literature, however, expressions within the major classes, such as there, here, and so on, and discourse markers, such as therefore, are often included in the set of functional elements, whereas certain expressions often considered to be members of functional classes (like certain quantifiers, e.g., many, several, and the numerals) are treated as nonfunctional. The relation between the experimental evidence and the theoretical distinction is thus more problematic than it at first appears. In particular, the question arises as to whether the functional distinction is categorial, as has been suggested in certain studies of first language acquisition (see Morgan, Shi, and Alopenna, 1996). If it is, then the nature of this categorial split and the way that it interacts with further categorization becomes an important question. If it is not, then one must ask what is the relation between the set of functional expressions and the functional categories recognized within syntactic theory. In this chapter, I explore these questions, beginning with a review of the general linguistic properties considered illustrative of the distinction and the psycholinguistic evidence for the nature of the functional-lexical divide. The main problem centers on whether the distinction should be made at the level of the expression or at some more abstract level of categorization. Noting that the evidence for a categorial distinction between functional and lexical expressions comes principally from psycholinguistic studies, I argue that the distinction is best viewed in terms of Chomsky's (1986) differentiation between I-language and E-language. The discussion then moves to the nature of E-linguistic categorization and its relation to I-linguistic (universal) categories. The chapter ends by questioning the need to set up the specific functional categories independently of functional lexemes themselves and suggests a model of the grammar that attempts to reconcile the psycholinguistic properties of functional expressions with their position within theoretical syntax.
2. CHARACTERIZING FUNCTIONAL EXPRESSIONS

Within general linguistic theory, the identification of functional expressions and, especially, functional classes is controversial and problematic. Within transformational grammar, the syntactically significant functional classes include complementizer, determiner, and inflection (INFL), the latter of which is now often decomposed into Tense and Agr(eement), following Pollock (1989).2 Other functional categories are regularly added to the list, most frequently verbal categories such as Neg(ation) (Pollock, 1989), Asp(ect) (Hendrick, 1991), and Focus (Tsimpli, 1995, inter al.), but also nominal categories such as Det(erminer) (Abney, 1987), Num(ber) (Ritter, 1991), and K (case) (Bittner and Hale, 1996). In frameworks such as Kayne (1994) and Cinque (1998), functional categories are set up independently of any morphophonological considerations, leading to a proliferation of such categories that are empty of all content, syntactic, semantic, and phonological (their content coming from contentive specifiers). In this section, I am concerned with the general linguistic properties that have been proposed to characterize functional categories (see also Abney, 1987, for some discussion). The ones that interest me are defined over the functional expressions that instantiate the categories, rather than over more abstract properties (such as the ability to assign theta roles). In the discussion that follows, I shall be concerned only with the behavior of morphs or the observable characteristics of the classes they comprise. The syntactic properties discussed below are thus intended to be predicated of free and bound morphs such as articles, demonstratives, quantifiers, pronouns, complementizers, agreement affixes, tense, and reflexes of other inflectional elements, and not of the more abstract concepts with which they may be associated.
The abstract functional categories of Kayne (1994) are hence omitted from consideration, as they cannot directly provide evidence for a macrofunctional category.

2.1. Closed versus Open

The distinction between functional and lexical parallels (and is often conflated with) that drawn between "closed" and "open" classes of expressions (Quirk et al., 1972:2.12-2.15). Functional classes such as pronoun, article, conjunction,
and so on, form classes whose membership is fixed, whereas noun, verb, and adjective are open classes whose membership can be extended through borrowing or by transparent derivational means. Typically, functional classes are small and listable for any language, and the total number of all such elements within a language is considerably smaller than the numbers of open class expressions. Thus, the number of independent (wordlike) functional expressions within English has been said to be around 150 (Shillcock and Bard, 1993), which only increases slightly if bound morphemes are included in the total. This criterion is not entirely straightforward, however. In the first place, a number of subgroups of the traditional open classes form closed subclasses. For example, auxiliary verbs form a closed subclass of verbs, and days of the week, months of the year, points of the compass, and so on, form closed subclasses of nouns that show idiosyncratic syntactic behavior [compare (1a) with (1b-1c) and (1d) with (1e)].3

(1) a. I'll see you Tuesday/on Tuesday/the first Tuesday after Easter.
    b. I'll see you tomorrow/*on tomorrow/*on the first tomorrow after Easter.
    c. I'll see you *breakfast/at breakfast/at the first breakfast after Lent.
    d. The exiles went North/to the North/to North London/North of Watford.
    e. The exiles went *London/to London/to London town/*London of Ontario.
Although the class of auxiliary verbs is usually taken to comprise a class of functional expressions, it is not normal to so classify the nominal subclasses indicated above, despite the fact that they clearly define a closed class of expression. Hence, the membership of some expression in a closed class is not by itself sufficient to make that expression (or the class that contains it) a functional one. Conversely, being identified as a functional expression may not always imply that the class it belongs to is closed. For example, the class of adverbs, generally construed as an open class, contains the expressions here and there, which are often classified with functional expressions, being essentially "pro-adverbs." Furthermore, there are closed classes, such as the pronouns, whose functional status is unclear and which are variously classified as reflexes of either major or functional categories (Noun, Agr, or Det). Thus, although there is a strong correlation between functional status and closed class, the property is neither necessary nor sufficient to distinguish functional classes from lexical ones.

2.2. Phonology and Morphology

A number of phonological differences between functional and lexical expressions have been noted. For example, evidence from English indicates that nonaffixal functional expressions typically lack metrical stress (see Cutler and Norris, 1988) and their vowels tend to be reduced and centralized (although this is unlikely to be true for all affixes in highly inflecting languages). For English,
this phonological difference can also be seen in the general lack of initial strong syllables for functional expressions (9.5% of the 188,000 words in the London-Lund corpus), although it is common for lexical expressions (90%) (see Cutler and Carter, 1987). This reduced phonological status of functional expressions is reflected in their morphological structure. Functional expressions tend to be less independent than lexical expressions and are often encoded as bound morphs or clitics, as illustrated in (2).

(2) a. I'll work it out. (< will)
    b. *I'll the kettle. (< fill)
    c. We've arrived. (< have)
    d. *We've on Sunday. (< leave)
However, phonological reduction may also occur with lexical expressions in certain contexts. For example, it is likely to occur if a lexical expression is repeated or strongly predictable from the discourse context. In certain cases some expressions may even lose their lexical integrity (e.g., wanna < want to, gonna < going to), the latter contraction occurring in real lexical constructions (in British English, at any rate) as in I'm [gauna] London. On the other hand, it is possible in certain circumstances to accent functional expressions (e.g., in contrastive focus: I saw THE best book on Minimalism today, Cinderella HAS gone to the ball, etc.). Thus, although phonological and morphological reduction is indicative of functional status, it is not criterial. Following the general tendency for functional expressions to form closed classes, we find that they do not generally undergo derivational or other word formation processes like compounding (unfair ~ *unmany, verbify ~ *himify, owner ~ *haver). It is certainly true that books about morphology discuss only such processes as they apply to content words, and there are few uncontroversial examples of derivation as applied to functional expressions. On the other hand, the lack of derivation is not a sufficient criterion for functional status, as many lexical expressions fail to undergo expected or possible derivational processes (e.g., unhappy ~ *unsad ~ *unmany). Hence, again, we see that the lack of derivational morphology associated with functional expressions is not a sufficient condition to distinguish functional expressions from lexical expressions.

2.3. Syntax

A number of syntactic differences have been said to distinguish lexical and functional expressions. In the first place, the latter appear in more restricted syntactic contexts than the former. For example, functional expressions usually appear in just a few syntactic contexts, and these are definitional of the class they belong to.
Thus, modals must appear in construction with4 a bare V (or zero proverb) (Kim may go/Kim may/*Kim may going/*Kim may a dog); articles all appear in construction with a following noun and nowhere else (the goose/*the ran, etc.);
quantifiers appear independently (many/all) or in construction with a following noun (many geese/all sheep) or with a following of phrase (many of the sheep/all of the sheep), and so on. For lexical expressions, on the other hand, syntactic context varies widely and is not definitional of the class as a whole, or even of distinct subclasses. For example, lexical expressions may appear in various syntactic environments: verbs may appear with or without direct objects, or with sentential or nominal complements, or with NPs in various cases (e.g., partitive for accusative in Finnish) (believe ∅/the story/that Kim is mad/Kim to be mad); nouns may appear with or without determiners (water/the water/the water of Leith); adjectives may appear predicatively or attributively, and so on. Thus, the fact that an expression is a verb says nothing about the number and class of its complements. However, identifying an expression as a (proper) quantifier (in English) automatically predicts that it may appear on its own, with a common NP or with a following of phrase containing a definite NP. Furthermore, if a functional expression can appear with an expression with a particular property, then it will appear with all expressions with the same property. Thus, the can appear with any common noun in English, a can appear with any singular count noun, be can appear with any transitive verb that has an en form, and so on. For lexical expressions, however, there is no guarantee that an expression can appear with all relevant complements. Thus, while transitive verbs all take NP direct objects (by definition), it is not the case that a particular transitive verb will appear with every NP, because of selectional restrictions (e.g., kick the football/*kick many ideas).
It is also possible for lexical expressions to be so restricted in their distribution that they will appear with only one or two items in the language (e.g., addled in English, which can collocate only with the words eggs and brains). The possibility of restrictive collocation does not seem to hold of functional expressions and may be attributed to the fact that such expressions typically do not impose idiosyncratic semantic selectional restrictions on their complements. Another aspect of the syntactic restrictedness of functional expressions, unlike lexical expressions, is that there are no processes that alter their selectional properties. Thus, there are no processes that apply to functional expressions5 that alter the status of their semantic arguments (as in passivization, raising, etc.), whereas such processes are common for lexical expressions and ensure that they appear in a wider range of syntactic contexts. Furthermore, it is not normally the case that long-distance dependencies alter the contexts in which functional expressions may be found. Question movement, topicalization, extraposition, and so on, which may radically alter the environments in which lexical expressions are found, do not generally apply to the complements of functional expressions. This is necessarily true of affixes, but also holds of more independent expressions, hence the ungrammaticality of expressions like *Cats, Kim really liked, parallel to The cats, Kim really liked.
This is not always true of all classes of functional expressions, however. For example, both auxiliaries and prepositions in English permit the extraction of their following complements (e.g., Who did Kim give the book to? What town did Kim send the cat to? Lou said he must go to town, and so go to town, he must.) However, such extractions are not common and are often subject to restrictions not apparent with lexical expressions. Thus, in English, the topicalization of a VP after a modal or auxiliary is strongly literary, whereas extraction from prepositional phrases is not completely free. It does not, for example, apply to clausal complements (assuming that complementizers like because are prepositions, see Emonds, 1976) (e.g., *Kim is mad, Jo is not happy because), nor to prepositional ones (*Through what did Kim go out?, parallel to What did Kim go out through?). It is worth noting in this regard that auxiliaries and prepositions both have stronger semantic argument properties than many other functional expressions, and given the association often made between argument structure and extraction,6 it is possible that this property is responsible for such exceptions to the general rule. Conversely, there are processes that apply to functional expressions that do not apply to lexical ones. An obvious example is the auxiliary verbs in English, which may appear before the subject (Will Hermione sing?/*Sings Hermione?); host the negative clitic n't (Hermione won't sing/*Hermione singn't); and permit cliticization to a preceding element (Hermione'll sing soon). Although there are some verbs that occupy an awkward midway position between auxiliary and main verb in allowing some of these processes (such as need, dare, see Pullum and Wilson, 1977, inter al.), the majority of verbs show none of them.
Groups of functional expressions also tend to cluster together around a particular major class (e.g., determiners and quantifiers with nouns; tense, aspect, and agreement with verbs) and these groupings define syntactic domains of a particular type (an extended projection in the terminology of Grimshaw, 1991). Thus, in English any expression appearing after the must be interpreted as nominal, whereas any expression appearing with a modal must be verbal [e.g., (3a, 3b)]. Where functional expressions from different domains are combined, the result is generally gibberish [e.g., (3c)].

(3) a. the kill (N) ~ may kill (V)
    b. the killing of the whale (N) ~ may kill the whale (V)
    c. *the ran ~ *many bes ~ *may them

The same strict interpretation of syntactic domain does not hold of combinations of lexical expressions, and apparently anomalous combinations of expressions (e.g., adjective plus verb) do not necessarily lead to nonsense. Thus, the strings slow ball or cat killer may be used in different environments without being incomprehensible, compare (4a) with (4b) and (4c) with (4d).

(4) a. Kim hit a slow ball (N).
    b. Kim slow balled it into the back of the net (V).
    c. Felix was a cat killer (A/N).
    d. Felix cat killered it round the garden (V).

Another important property of functional expressions is that they can alter the categorial status of lexical expressions, whereas the latter cannot "coerce" functional expressions out of the domain that they define. Thus, tense morphemes are always verbal, and articles are always nominal, whatever lexical expression they appear with.7 Looked at extensionally, once a functional expression has been assigned to a general domain (nominal, verbal, or whatever), it always remains in that domain (although certain ones may be underspecified, like English -ing forms, which can appear in nominal or verbal contexts, see Borsley and Kornfilt, this volume). Lexical expressions, on the other hand, are freer to appear in different syntactic domains.8 Thus we have a situation where functional expressions generally exhibit a more restricted syntax, are more categorially determinate than lexical expressions, and are often also associated with syntactic positions that cannot be occupied by lexical expressions. Furthermore, they cannot be coerced out of their syntactic category in the way that lexical expressions can. These properties are more robustly and generally applicable to functional expressions than those discussed in previous sections. Again, however, they are neither fully necessary nor sufficient to guarantee that some expression is functional, as there are lexical expressions with restricted syntax (e.g., addled, noted above) that resist appearing as members of more than one category, and there are functional ones that appear in a wider range of contexts and as members of different categories (e.g., the participle forms in English) and which appear only in positions that can also be occupied by lexical ones (e.g., pronouns).
2.4. Semantics

The most quoted semantic difference between the two classes of expression is that functional expressions have a "logical" interpretation, whereas lexical expressions have a denotative one. Thus, we find that major word classes have been traditionally defined in terms of their supposed semantic denotations. Nouns are notionally classed as expressions that name persons, places, or things; verbs are classed as expressions that denote actions, processes, states, and so on. Although structuralist linguists have denied the utility of such notional definitions of the parts of speech, the concept was defended in Lyons (1966) and has reentered the literature in terms of semantic sorts. Thus, many theoretical frameworks make use of Davidson's ontological distinction between events and individuals (see Davidson, 1967). Although the correspondence is not strictly parallel to the syntactic classification of verb versus noun (phrase), its recent appearance indicates a persistent tendency for lexical expressions to be defined in terms of their denotation (i.e., through the ontological properties of the sorts of thing they typically identify). (See also Anderson, 1997, for a recent notional theory of the parts of speech.) Functional expressions, on the other hand, are said not to denote in the same way: they do not pick out sets of primitive elements, and ontological considerations do not have an effect on their classification. Instead, functional expressions typically semantically constrain relations between basic denotational classes or provide instructions for where to look for the denotation specified by an associated lexical expression. So, for example, quantifiers relate cardinalities and proportions of elements between nominal and verbal denotations; articles provide information about the discourse status of a referent; tense provides information about the relative time at which an event occurs; modals provide information about the status of an event or proposition (e.g., as possible, necessary, etc.). However, such an approach begs many questions. Precisely what it means to have a logical interpretation is not easy to define, and the attempt at a characterization of the semantics of functional expressions in the previous paragraph is not easy to sustain. For example, although it is often true that functional expressions constrain relations between classes of primitive denotata, this does not hold of anaphoric expressions such as pronouns, pro-adverbs, and so on, which have a referential rather than a relational function. Furthermore, many lexical expressions denote relations that may, as in the case of verbs taking intensional complements like want, be as complex in semantic structure as more obviously functional expressions. Nor is it possible to maintain a view of functional expressions in which they typically convey less information (in some sense) than lexical ones.
The semicopular verbs in English such as seem, appear, and so on, are typically treated as lexical expressions despite the fact that the information they convey bears comparison with that conveyed by the modal auxiliaries like can, may, which are treated as functional. Moreover, certain apparently functional expressions (like quantifiers such as several and numerals) again appear to convey as much information as lexical nouns (such as number, mass and so on). It appears, therefore, that although there does intuitively seem to be some content to the idea that the major lexical classes denote ontologically basic elements, a purely semantic characterization of the difference between functional and lexical expressions is unlikely to be sustainable. A more robust semantic property that differentiates the two classes, however, can be found in the interrelations that are found between members of different subclasses of expressions. Lexical expressions are linked in complex arrays of sense relations and exhibit identifiable semantic relations with each other, in terms of synonymy, hyponymy, selectional restrictions, and so on (see Cruse, 1986, for an extended discussion). These properties constitute the subject matter of most work on lexical semantics and provide interesting insights into the way our experience of the world is structured. No such sense relations obtain between
(subclasses of) functional expressions. Although classes of functional expressions do exhibit similarities in meaning, this always results from the defining characteristics of the class itself. Thus, the and a might be described as "opposites" (or co-hyponyms) of definiteness, but the relation between them is not one that is identifiable in groups of lexical expressions, nor is there ever a corresponding superordinate expression (i.e., no general purpose definite/indefinite marker) that can be transparently related to other subclasses of functional expressions.9 Quantifiers also form a class that exhibits a number of logical relations with each other, but these result from the basic semantics of the class in determining relations between sets, and the common characteristics are constrained by properties like permutation, conservativity, and so on (see van Benthem, 1986), which are hypothesized to be universal, unlike the parochiality exhibited by semantic fields in different languages. In other words, classes of functional expressions are semantically isolated, whereas lexical expressions are linked in complex arrays of meaning relations. Another semantic property displayed by lexical expressions but not by functional ones involves "coercion," or the modification of the denotation of one lexical expression by that of another. A classic example of this involves the influence of a complement NP on the Aktionsart of a sentence (see Verkuyl, 1993, inter al.). Thus, a bounded NP object like three dinners with an essentially unbounded process verb like eat produces an interpretation of the event as bounded (i.e., as an accomplishment), whereas a semantically unbounded NP (mass or bare plural) induces a process interpretation (5a-5c). This does not happen with functional expressions, whose interpretation remains constant whatever semantic characteristics are displayed by the expression with which they combine.
Notice further that combining a distributive quantifier with a mass term (or vice versa) does not affect the basic interpretation of the quantifier, which remains distributive (or mass). So three wines is distributive/count in (5d) and much sheep remains mass in (5e), despite the normal denotation of the complement noun.

(5) a. Kim ate all day.                             Unbounded/Process
    b. Kim ate three ice creams.                    Bounded/Accomplishment
    c. Kim ate ice cream all day.                   Unbounded/Process
    d. Kim drank three wines.                       Count
    e. Much sheep was eaten by the infected cattle. Mass
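The universal quantifier properties invoked above, permutation and conservativity, have standard set-theoretic statements in the generalized quantifier literature (see van Benthem, 1986); the following formulation is supplied here for reference as a sketch of those standard definitions, not as the author's own:

```latex
% A determiner denotation D_E maps a noun denotation A (a subset of the
% domain E) and a verbal denotation B (a subset of E) to a truth value.

% Conservativity: only the part of B that lies inside A is relevant.
\[
\mathrm{CONS}\colon\quad D_E(A)(B) \iff D_E(A)(A \cap B)
\]

% Permutation invariance: quantifiers are blind to the identity of
% individuals; for any permutation \pi of the domain E,
\[
\mathrm{PERM}\colon\quad D_E(A)(B) \iff D_E(\pi[A])(\pi[B])
\]

% Example: every denotes the subset relation, A \subseteq B, which is
% conservative, since A \subseteq B holds just in case A \subseteq A \cap B.
\]
```

On this view, the logical relations among quantifiers follow from the set-theoretic semantics of the class itself rather than from a language-particular semantic field, which is the contrast the text draws.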
The effects of semantic coercion go beyond Aktionsart, however. Because of the existence of selectional restrictions, different combinations of lexical expressions may give rise to metaphorical effects requiring inferencing to resolve apparent contradictions. In other words, attempts will be made to accommodate apparently anomalous combinations of lexical expressions, yielding metaphorical interpretations that may alter the basic type of object described by a phrase. Thus, in (6a) the event described is a physical one, whereas in (6b) an abstract event is
described (see Pittock, 1992, for some discussion of this form of semantic coercion). Contradictions generated by combinations of functional expressions, on the other hand, lead to incomprehension/ungrammaticality (*all some books, cf. all the books). In other words, the meaning of a functional element is not negotiable: there is no "inferential space" between a functional expression and the expressions with which it combines.

(6) a. The river flowed to the sea.   Physical
    b. Kim's thoughts flowed to Skye. Abstract

A further semantic property of functional expressions that has been noted is that they may yield no semantic effect in certain environments. This is typically said of case or agreement, and in Chomsky (1995) a distinction is made between interpretable and noninterpretable features, from which a number of theoretical consequences are derived. It is certainly the case that grammatical properties that are determined by other elements may not be semantically interpreted. Thus, the preposition to following a ditransitive verb like give is said not to have a semantic role but to act like a case-marker, in distinction from its use following an intransitive verb of motion like go. However, it is unlikely that any grammatical distinction that is not purely morphological (e.g., declensional class and the like) is entirely without interpretative capability. For example, agreement is often asserted (usually without discussion) to be an instance of a category without semantics, its sole role being to encode dependency relations. But this is shown to be false when one takes into account examples where agreement relations are broken (instances of grammatical disagreement). Where expected patterns are disrupted, the disagreeing feature signalled by the functional expression (usually an affix) induces an interpretation based on the interpretation of that feature.
We find examples of this in many languages that have a system of agreement, as illustrated in the examples from Classical (Attic) Greek in (7a-7b). In (7a), there is a disagreement in number on the verb that emphasizes the individual nature of the withdrawal, and in (7b) there is a disagreement in gender that signals the effeminacy of the subject (see Cann, 1984, ch. 6 for further discussion of such phenomena).10

(7) a. to stratopedon anekho:ru:n (Thucydides 5.60)
       the army[sg] withdraw[3pl]
       'The army are withdrawing (severally).'
    b. kle:sthene:s esti sopho:tate:
       Kleisthenes[masc] is wise[superlative,fem]
       'Kleisthenes is a most wise woman.'

However, although it does not appear to be true that functional expressions always lack semantic effect, it is true that this effect is often suppressed or eliminated in normal environments. Such is not the case with lexical expressions, however. The meaning of a lexical expression is not fully suppressed, even in strongly idiomatic or
48
Ronnie Cann
metaphorical environments, as can be seen in the ways in which metaphors and idioms can be felicitously extended. For example, (8a) makes a better extension of the figurative sentence in (6b) than (8b), whereas (8c) is a more informative statement than (8d), showing that the literal meaning of expressions is not completely suppressed in coerced (metaphorical or idiomatic) interpretations.

(8) a. and eddied around the poor cottage where her mother lived.
    b. ??and exploded beside the poor cottage where her mother lived.
    c. Tabs were kept on the victim but they kept blowing off.
    d. ??Tabs were kept on the victim, but they were very noisy.

Although an absolute distinction between the semantic properties of functional and lexical expressions probably cannot be made, semantic differences between the two classes thus do exist. Lexical expressions engage in rich semantic relations with others of the same sort, but their interpretation is subject to inferential manipulation in context. The semantics of functional expressions, on the other hand, may be suppressed in normal environments, but their interpretation cannot be coerced by other expressions with which they appear.

2.5. Diachrony, Polysemy, and Homonymy

If there were a strong categorial differentiation between functional and lexical expressions, this would imply that the sets of expressions that make up the two macroclasses are discrete. This requires formally identical morphs that have both functional and lexical manifestations to be treated as homonyms rather than polysemes, which typically do not involve different categories. Hence, the morph to in English in its grammatical usage as a case marker ought to be classed as functional, whereas its manifestation as a preposition of motion should be classed as a lexical homonym. However, it is far from clear that the two uses of the preposition are as distinct as homonymy implies.
For example, as a case-marker, to marks only NPs whose relation to the event described by the main verb is that of a goal. There are no examples of this preposition marking patient, theme, or source NPs, indicating that it is a semantically reduced variant of lexical to.11 This observation has led to the view, advocated in Adger and Rhys (1994), that case-marking prepositions (and other functional expressions in Chinese, see Rhys, 1993) mediate the thematic role assigned by a verb. Thus, although such prepositions, whose appearance is determined by a verb, do not themselves assign a full thematic role to their complement NPs, they provide bridges to help verbs assign the correct thematic roles to their arguments and so must be of the right sort to identify that role. If we accept this view, then we could hypothesize that there is only a single preposition to12 in English that has functional and lexical manifestations. Other evidence against homonymy comes from diachronic processes of grammaticalization.
Functional versus Lexical: A Cognitive Dichotomy
49
According to one theory (Hopper and Traugott, 1993), an expression develops into a grammatical homonym through a period of polysemy involving pragmatic enrichment (9).

(9) single item > polysemy (pragmatic enrichment) > homonymy ("bleaching")

It is the middle phase that poses problems for the idea that there is a discrete categorial difference between functional and lexical. The notion of polysemy requires there to be a single lexeme used in different contexts to give different but related meanings. If the dichotomy between functional and lexical is analyzed in terms of discrete categories, then it should be impossible for any expression to have polysemous uses that straddle the boundary between them. However, it is clear that this is precisely what does happen where a lexical expression is in the process of developing grammatical uses. An example of this sort of polysemy is given by the verb have in certain dialects of English.13 The different constructions involving have do not partition neatly in terms of their syntactic properties according to whether they are contentive (i.e., semantically "full") or functional (semantically bleached). Thus, from a semantic point of view the decrease of semantic effect goes from Process (Jo had a party), through Possessive (Jo has three books), to Causative (Jo had the cat cremated), Modal (Jo had to go home), and Perfect (Jo had gone home) (10a). However, classic tests for auxiliaryhood in English (see Pullum and Wilson, 1977) show a different pattern that cuts across the semantic development, with the possessive showing more auxiliary-like behavior than the causative or modal uses (10b).14

(10) a. Process > Possessive > Causative/Modal > Perfect
     b. Process/Causative > Modal > Possessive > Perfect

The mismatch between the auxiliary status of the verb in each construction and its semantic content seems to deny any clear distinction between the contentive and functional uses of this expression, thus undermining the idea that there is homonymy and leading to the conclusion that have is a single polysemous expression in this English dialect. The fact that one must recognize functional/lexical polysemy, at least for certain stages in the grammaticalization of an expression, makes a strong categorial distinction between functional and lexical expressions problematic.

2.6. Discussion

From the above discussion it appears to be true that certain grammatical tendencies are related to the functional/lexical distinction. Functional expressions tend to form closed classes; to be phonologically and morphologically reduced; to appear in a restricted range of often idiosyncratic syntactic environments; to appear in general categorial domains from which they cannot be shifted; and to have
meanings that may be fully suppressed in certain environments; and to allow the possibility of syntactically and semantically coercing lexical expressions. Lexical expressions, on the other hand, seem not to have these properties, but to form open classes, to be morphologically free, to appear in a wide range of syntactic environments, and to be categorially and semantically coercible. However, none of these linguistic characteristics is individually sufficient or uniquely necessary to determine whether a particular expression in some language is functional or lexical. Furthermore, the discussion in section 2.5 shows that, if the functional/lexical dichotomy is categorial, it cannot be discretely so, since a single expression may show behavior that combines both functional and lexical properties. This type of pattern, where grammatical properties cluster around groups of expressions but do not fully define them, and where there is not a discrete boundary separating one class of expressions from another, is typical of a number of linguistic notions like subject, head, and so on. Such "cluster concepts" characterize gradients from one type of expression to another depending on the number of properties exhibited but seem to reflect linguistically significant distinctions. There are four ways to approach a cluster concept of this sort. In the first place, one may deny the utility of the concept in linguistic description. Second, one may treat the concept as prototypical, allowing more or less determinable deviations from a putative norm. Third, one may restrict the set of properties indicative of the category to a potentially relevant subset in order to make the concept absolute. Finally, one may assume that the concept is essentially primitive and that variability in associated properties is explicable through other means. 
With regard to a categorial distinction between functional and lexical expressions, the first position is the one taken in Hudson (this volume, 1995), which accepts the importance of the notion of functional expression (Hudson's Function Word) but denies that the category Function Word has any linguistic significance. Because categories are only as useful as the generalizations that can be made with them, the lack of any defining (and, therefore, predictable) properties of functional expressions strongly indicates that a category "functional" is not a linguistically useful one. However, the fact that there are strong tendencies for functional expressions to exhibit certain types of property supports the second position, which might be taken by proponents of Cognitive Grammar (Langacker, 1987; Taylor, 1989). In such a view, there would be a prototypical functional category that would be, for example, phonologically reduced, affixal, syntactically restricted to a single domain, and semantically impoverished in some sense. Instantiations of this category would more or less conform to this prototype and shade into the prototypical lexical category. The third approach that could be taken to the apparent cluster concept of functional category appears to be the one often taken in the Principles and Parameters literature. Here an abstract view of categorization is assumed that maps only imperfectly onto particular classes of (distributionally defined) expressions within a
particular language. Functional categories, for example, may be defined as ones that do not assign a theta-role, but that select a particular (often unique) type of syntactic complement (Grimshaw, 1990). These theoretically motivated properties abstract away from the directly observable properties of functional expressions and allow the categorial distinction to be made uniformly at a more remote level of analysis. The final view of the categorial divide is the least well supported by the linguistic data, but is the one that will be pursued in this chapter. In other words, I explore the idea that the functional-lexical distinction is useful at some level of description, is not a prototypical concept, is not abstract but is categorial. To provide evidence that this is the case, however, I do not intend to explore further the linguistic properties of such expressions. Instead I will examine the psycholinguistic evidence in favor of there being a significant difference in the processing of the two classes of expression. Although it is not common to resort to experimental or pathological evidence to support linguistic hypotheses, the growing body of psycholinguistic research into the distinction between lexical and functional expressions is too extensive and important to ignore. Although none of the evidence is uncontroversial, the picture that emerges is one where the psychological treatment of functional expressions differs significantly from that of lexical ones, lending credence to the idea that they instantiate a primary categorial distinction.
3. THE PSYCHOLINGUISTIC EVIDENCE

Evidence for the significance of the functional-lexical distinction from a psychological perspective comes from three principal sources: language processing, patterns of aphasic breakdown, and language acquisition. Exactly what the functional elements are within a language is not, however, clearly defined in the psycholinguistic literature, and the distinction between functional and contentive elements is often rather crudely drawn. Typically, such expressions are referred to as "closed class" items, even though, as pointed out in section 2.1, this is not a particularly good determinant of functional status. Fairly uncontroversially, however, such a view leads to classes of expressions such as determiners (especially articles, demonstratives, and certain quantifiers like every and all), auxiliary and modal verbs, prepositions, (certain) complementizers, and pronouns being treated as functional. More controversially, also included within this grouping are the "pro-adverbs" (here and there), clausal connectives (such as therefore), and intensifiers (such as so, very, etc.). Other possible functional expressions (such as certain quantifiers like several, many) may be excluded from consideration, as are expressions (such as the quasi-modals need, dare, etc.) that behave syntactically partly like functional expressions and partly like contentive ones. In the discussion
that follows, I shall be deliberately loose in my terminology, reflecting the looseness apparent in the psycholinguistic literature.

3.1. Processing

Experiments to test the psychological mechanisms underlying language processing provide strong support for there being a significant difference in the way certain functional elements behave. In the first place, there is evidence that functional expressions are not affected by speech errors. For example, spoonerisms only involve pairs of contentive expressions and never involve functional ones (Garrett, 1976, 1980). Thus, one gets errors like The student cased every pack but not Every student packed the case for The student packed every case or A student likes the lecturer for The student likes a lecturer. Processing models (e.g., Garrett, 1980) have tried to explain this effect by assuming a level at which lexical expressions are represented in the syntactic tree, prior to the insertion of the functional elements. Erroneous replacements and switches are then held to apply at this prior level, giving the observed errors. Second, normal adults show a frequency effect in lexical decisions with contentive expressions. In other words, normal adults respond more quickly in timing experiments to more frequent words. This does not apply to functional expressions, where response times for all expressions are similar, even if on a straight count the items differ in absolute frequency (e.g., between the and those) (Bradley, Garrett, and Zurif, 1980). These results are controversial, and Bradley's dual-access route to the lexicon has been challenged in Besner (1988) and Gordon and Caramazza (1982, 1985), among others, who report work indicating that there is a frequency effect with functional expressions as well as with lexical expressions.
It may therefore be the case that both classes do show frequency effects, but that there is a limit to the effect with the most frequent expressions, a group that is dominated by functional expressions (Richard Shillcock, personal communication). More robust evidence comes from experiments that show that normal subjects take longer to reject nonwords based on lexical expressions than those based on functional expressions (e.g., thinage vs. thanage) (Bradley, 1978, and replicated by others, see Matthei and Kean, 1989). This implies that the linguistic-processing mechanism "knows" that a word is a functional expression and thus "knows" that it will not undergo any derivational processes. For lexical expressions, the processor appears to make a wider search for matching candidates within the lexicon. Thus, it appears that the linguistic processor is able to recognize instantly a derived form as based on a functional expression and reject the form without trying to identify whether the form is well formed and/or attested. Word priming experiments (see especially Shillcock and Bard, 1993) show that there is a difference in priming between certain functional expressions and lexical expressions. Lexical expressions prime lexical homophones (so, for example, the
verb arose primes the noun rose), and they also prime semantically related expressions (for example, wood also primes timber). Functional expressions, however, do not prime homophones (e.g., would does not prime wood), nor do they appear to prime semantically related expressions (e.g., may does not prime must or might). Further evidence for the distinction between functional expressions and lexical expressions is afforded by the informational encapsulation of lexical items during processing. Priming effects are independent of the syntactic structure within which a lexical item is embedded. So, rose primes both the noun (and the semantically associated flower) and the verb (see Tanenhaus et al., 1989). Functional expressions, however, are affected by syntactic context: where the syntax strongly favors a functional expression, only the functional expression will be activated. Hence, after an initial noun phrase, [wud] does not prime wood (or timber), and so on. This connection between syntax and closed class items is further supported by evidence from bilinguals, where in code-switching situations the functional expressions used tend to come from the language that supplies the syntax (Joshi, 1985).

3.2. Acquisition and Breakdown

Evidence from first language acquisition and from different types of language breakdown resulting from brain trauma also shows distinctions in the behavior of functional and lexical expressions. Numerous studies have focused on the acquisition of grammatical elements (see, for example, Bloom, 1970; Bowerman, 1973; Radford, 1990, and the papers in Morgan and Demuth, 1996, among many others). The data from these studies are not uncontroversial, but they indicate that functional expressions typically appear later in child language production than lexical expressions, and that functional categories appear later than lexical ones. Crosslinguistically, however, this is probably not absolutely true.
For example, Demuth (1994) reports that Sesotho children produce a number of functional, or function-like, elements from an early age (she cites passive morphology as an example), and claims of this sort for English tend to ignore the affix -ing, which is acquired and produced relatively early (de Villiers and de Villiers, 1978). Furthermore, studies like Gerken and McIntosh (1993) indicate that children who fail to produce function words are nevertheless sensitive to their appearance in input, suggesting that children may have representations of such expressions before they use them. Morgan et al. (1996) further hypothesize that the functional-lexical split is innate and that children use phonological differences to group expressions into the two classes. This, they suggest, helps the identification of word-meaning mappings by cutting down the amount of utterance material that the child must attend to. Thus, children may indeed have some (possibly underdetermined) concept of the functional expressions in the language they are acquiring. This implies that any relative lateness in the production of functional expressions may be due to the
communicative needs of the learner: lexical expressions carry greater information than functional expressions and are therefore likely to be fully represented, and so produced, earlier. It also implies that functional expressions that carry a lot of semantic information or are otherwise salient in the speech stream (e.g., because of regular morphology or phonological prominence) may be acquired relatively early, while less informative or salient elements will be acquired later.15 Whatever the precise characterization of first language acquisition, however, the importance and robustness of the functional-lexical divide is clear, and it is an accepted fact that the acquisition of syntax proper by first language learners coincides with the production of functional expressions. Because of the difficulties in interpreting what children are producing or comprehending and the problems and controversies that surround the nature of child language, patterns of aphasic breakdown are, in many ways, more interesting for our purposes, because we see in such cases what happens when damage occurs to a full adult grammar. The evidence can thus be taken as strongly indicative of the nature of the mature language faculty. Aphasias can be characterized broadly as fluent and nonfluent. Fluent aphasias (Wernicke's) are characterized by the use of functional expressions, control of syntactic operations (movement), production of speech at a normal rate, and appropriate intonational patterns; but comprehension is disrupted and access to information associated with lexical expressions is deficient, particularly with regard to predicate-argument structure and lexical semantics. Agrammatic aphasia (Broca's), on the other hand, is characterized by slow or very slow speech, no control of sentence intonation, impaired access to functional expressions, and no control of syntactic operations.
Comprehension, provided syntactically complex sentences are avoided, is unimpaired, and lexical expressions are generally used appropriately, indicating full access to semantic information (see Goodglass, 1976). What is interesting here is that in agrammatic aphasia, semantic processing appears to be intact, while syntactic processes are disrupted. Some representative examples of agrammatic speech (taken from Tait and Shillcock, 1992) appear in (11a-11f). (11a) and (11b) illustrate difficulties with participle formation (and one example of an omitted determiner); (11c) from Italian shows difficulty with gender agreement in both articles and verbs; in (11d) from Dutch there is a missing auxiliary; (11e) from German displays wrong case assignment (accusative for dative); and (11f) from French indicates difficulty with prepositions.

(11) a. burglar is open the window
     b. Little Red Hood was visit forest grandmother
     c. il, la bambina sono, e andata
        the.m, the.f girl have has gone
     d. ik nou 21 jaar gewerkt
        I now 21 years worked
     e. die Oma sperrt ihn auf
        the grandmother opens him.acc
     f. j'ai pris chemin de d'orthophoniste en voiture
        I have taken road from/of of speech-therapist in car

Of course, the syntactic impairment shown by such dysphasics is not an absolute, all-or-nothing affair affecting the whole of a subclass of functional expressions or all occasions of utterance [cf. (11a), where there is one omitted determiner and one overt one]. However, it is clear that there is difficulty in production16 and that this principally affects functional expressions, both words and affixes. There is also evidence that agrammatic aphasics have difficulty in interpreting noncanonical structures. For example, many agrammatics have difficulty understanding passive sentences that cannot be disambiguated through semantics alone. In experiments it has been shown that performance in understanding passives where the thematic roles are easily assignable is significantly better than comprehension of passives where no semantic clues are available (Saffran et al., 1980; Schwartz et al., 1980).

(12) a. The hunter shot the duck.
     b. The duck was shot by the hunter.
     c. The square shot the circle.
     d. The square was shot by the circle.
Furthermore, it is reported in Bradley et al. (1980) that agrammatic aphasics appear to show frequency effects for functional expressions. Although, as noted above, Bradley's conclusion that normals do not show such effects with functional expressions is controversial, the effect of frequency on the recognition of words by aphasic speakers is apparently more marked than for normal ones. Again, the test for recognition of nonwords based on closed and open class items is more robust and has been replicated. Broca's aphasics show no difference in reaction times between the two types of nonwords, indicating that their recognition of functional expressions is impaired.

3.3. Discussion

The psycholinguistic evidence points to a strong distinction in the processing of functional expressions and contentive (lexical) ones. In particular, the evidence from word priming indicates that functional expressions are not encapsulated from syntax, since the syntactic context that surrounds functional expressions affects lexical access, whereas syntactic context has no effect on the lexical access of contentives. Furthermore, functional expressions are recognized quickly by the processor and do not appear to interact with the mechanisms that identify contentive expressions. The data from language acquisition and language breakdown also show that functional expressions are closely linked with syntactic operations
like passive, dative shift, and so on, while lexical expressions provide sufficient information for basic semantic processing to occur, even in the absence of coherent syntax. There is thus not only strong support for a significant distinction to be made between functional and lexical elements but also for the hypothesis that functional expressions are more closely associated with (local) syntactic processing than lexical ones, which themselves are more strongly implicated in semantic processing. Evidence from neurobiology further supports the significance of the distinction between functional and lexical expressions and the association of the former with syntactic processing, as it suggests that the two types may be stored in different parts of the brain. For example, the loss of the ability to manipulate syntactic operations in patients who have damage in the anterior portion of the left hemisphere (Broca's area) indicates that the syntactically significant functional expressions may be located in this area. Patterns that emerge from fluent aphasias indicate that lexical expressions are less strongly localized, though a general tendency toward localization within the posterior portion of the left hemisphere is attested. Following left hemispherectomy, the right hemisphere may take over functions involving lexical expressions, with a remapping of activity to that hemisphere, but it cannot take over the functions of functional ones. Speech is possible, with normal comprehension and communication, but syntactic complexity is absent. There is also evidence from neurobiological studies indicating differences in the storage of lexical and functional items.
It appears that neuronal assemblies corresponding to function words are restricted to the perisylvian language cortex, whereas those corresponding to content expressions include neurons of the entire cortex (see Pulvermüller and Preissl, 1991, and the discussion of neurobiological implications for language acquisition in Pulvermüller and Schumann, 1994). Unfortunately, as shown in section 2, the robust psycholinguistic evidence for the distinction is not reflected in the linguistic properties exhibited by the two macroclasses of expression. If the functional-lexical dichotomy is categorial, there should, as Hudson (this volume) notes, be "generalizations which would not otherwise be possible" without the categorization. In other words, the identification of an expression as functional will predict some subset of its grammatical properties. Furthermore, in a strict interpretation of the distinction between the two categories, there should be no expressions that are morphosyntactically attributable to both classes. If a lexeme is identified as a member of a contentive class by certain grammatical properties, then it should not exhibit properties centrally associated with functional ones (and vice versa). An implication of this is that grammaticalization processes, whereby contentive expressions become functional, should exhibit an instantaneous shift from one class to the other at some point in the diachronic development. This in turn implies that lexemes that appear to have both lexical and functional uses should behave as homonyms
and so should exhibit morphosyntactic properties that are entirely independent of each other. The fact that these properties do not appear to hold indicates that the important psycholinguistic notion of the functional-lexical distinction does not constitute a linguistic category. On the other hand, the notion does mirror the conceptual distinction between grammatical and contentive categories within linguistic theory, and there are clear connections between the psycholinguistic conception of functional expressions and their linguistic behavior. Thus, although there are no necessary and sufficient conditions that identify expressions as of one type or another, as noted in section 2.6, functional expressions are generally associated with restricted syntactic contexts and are not amenable to syntactic or semantic coercion, whereas lexical ones appear in a wider range of syntactic contexts and are syntactically and semantically coercible. This is reminiscent of the association of functional expressions with syntactic processing and lexical ones with semantic processing noted previously. We appear to have a situation, therefore, in which an important psycholinguistic distinction is not fully reflected in linguistic properties, but where there is a clear, but imprecise, relation between the processing properties associated with functional and lexical expressions and their general syntactic and semantic behavior.

3.4. E-language and I-language

The apparent contradiction between the categorial nature of the functional-lexical distinction implied by the psycholinguistic evidence and the noncategorial nature of the distinction implied by the lack of definitional linguistic properties can be usefully approached in terms of the distinction between E-language and I-language made in Chomsky (1986). The term E-language in that work is used to refer to the set of expressions that constitute the overt manifestation of a language in terms of actual utterances and inscriptions.
It is something that may be observed directly as the output of linguistic behavior, an extensional or ostensive view of language that may be equated with the structuralist and mathematically formal notion of a language as a set of strings of basic elements. Different from this is I-language, which is characterized as an internal representation of structures that gives rise to the external manifestation of a particular language. I-language may be construed as a metalanguage that generates (or otherwise characterizes) E-language and is equated in Chomsky (1986) with a parameterized state of Universal Grammar. I-language thus consists of grammatical elements that are universally available to humans and that are manipulable by universal linguistic principles. E-language, on the other hand, necessarily consists of language-particular elements (the expressions of the language) whose description at the level of the given phenomena must also be parochial and not necessarily amenable to analysis that is crosslinguistically generalizable.
Considerations of this sort led Chomsky (1986) to argue that it is I-language that is the proper object of inquiry for linguistics, because it is this that results from the operation of universal linguistic principles and is thus directly relevant for the understanding of Universal Grammar. E-language, on the other hand, is, for Chomsky, relegated to the status of an epiphenomenon, a symptom of language rather than its substance. Leaving aside the ideological battle that informs much of the debate around this topic, we may question whether there are in fact no aspects of E-language that are best described on their own terms (i.e., for which an I-language explanation misses the point and fails to adequately characterize all the relevant properties). Indeed, it is precisely with respect to this question about the nature of the functional-lexical dichotomy that the potential drawbacks of having a purely I-linguistic characterization of the language faculty are thrown into focus. Psycholinguistic investigation into language processing is principally concerned with the investigation of human responses to E-language. Descriptions of aphasic behavior or first language acquisition relate to the linguistic expressions that are produced or, less frequently, comprehended by the people being studied. Priming and other sorts of psycholinguistic experimentation record reactions to written or spoken tokens of expressions that are (or are not) part of a particular E-language.
We may, therefore, hypothesize that the functional-lexical dichotomy indicated by psycholinguistic evidence is an ostensibly E-language notion, and we may assume that at the level of E-language (the set of expressions, particularly basic expressions, that extensionally define a language), the distinction is categorial, because it does identify a significant grouping of expressions that show identifiable traits in parsing and production (functional expressions are not encapsulated in processing, are accessed quickly, etc.). This hypothesis is supported by the fact that the set of functional expressions within a particular language is always sui generis in the sense that different languages overtly manifest different types of functional expression. English, for example, has no overt manifestation of gender agreement, nominal case, or switch marking, whereas a language like Diyari (Austin, 1981) has morphemes that express these concepts but no person agreement or aspect marking. I-language relates principally to the need to account for universal properties of language, whereas the "accidence" of grammar has traditionally been viewed as a language-specific phenomenon, but one that determines the properties of a specific language independent of its "substance." Insofar as accidence and functional expressions coincide, we might expect the study and analysis of this aspect of grammar to be language specific. In current transformational grammar, of course, the variability associated with accidence is attributed to universal parameters and, as such, is in the domain of I-language rather than E-language. However, parameters are intended to determine variable properties of language that are linked together in some way. Arbitrary variations in the grammar of a language (e.g., a language has ejective consonants, fusional morphology, no overt WH-expressions, etc.) are relegated to the lexicon. What is not addressed is how significant such language-specific properties are and how much they contribute to the linguistic structures of a language beyond an epiphenomenal haze of arbitrary attributes. There is no a priori reason why external and nonuniversal properties cannot be linguistically significant. Aspects of E-language may determine certain aspects of grammaticality and interact with I-linguistic properties in interesting ways. In fact, the association of functional expressions with local syntactic processing and their independence from semantic processing implies a radical differentiation in the ways that functional and lexical expressions are represented. This hypothesis, that the functional-lexical distinction is an E-language phenomenon, will be pursued in the remainder of this chapter with a view to proposing a model of the grammar whereby extensional and external properties of language interact with intensional and internal ones, marrying aspects of processing and theory in an interesting way.
4. CATEGORIZING FUNCTIONAL EXPRESSIONS

At the end of the previous section, the hypothesis was advanced that the functional-lexical distinction is categorial at the level of E-language. We will refer to this E-language category ("E-category") as "functional" and take it to apply to the set of expressions in any language that show the psychological properties illustrated in the last section. In other words, the hypothesis is that the categorization of functional expressions is determined for an individual language through properties of processing and frequency. It is possible that certain types of phonological cue may also help to define this category.17 Such a categorization is thus determined by properties that clearly belong to E-language and, hence, it must be language specific and not determined by universal factors. This is not to deny that expressions with certain inferential or semantic properties (such as anaphors and tense) will tend to be encoded by functional expressions, but the categorization of the expressions of a language into functional and lexical is one that is determined by the external manifestation of that language, as discussed above. This primary categorization into the macroclasses, functional and lexical, induces a split in the vocabulary that permits further (E-)categorization to take place. In this section, I explore the nature of this further categorization and develop a view of the way a theory of syntax may be developed that utilizes the different types of information associated with the two types of expression.
60
Ronnie Cann
4.1. Defining E-Categories

Although not much in vogue in many current approaches to syntax, the quintessential type of syntactic categorization has generally been determined through properties of distribution. This approach to categorization finds its most elaborated form in the writings of the European and American structuralists (see, for example, Harris, 1951; Hjelmslev, 1953; Hockett, 1954). Morphosyntactic classes are defined by the syntagmatic and paradigmatic properties of expressions, typically through the use of syntactic frames: expressions are grouped into classes according to their ability to appear in a position determined by a particular frame. Clearly again, this type of categorization is induced by properties of E-language, as it depends solely on the appearance of expressions with one another and not on more abstract linguistic properties. Hence, one might look for further subcategorization of the lexical and functional categories to be determined by such a process. There is, however, a well-known methodological problem with this type of classification: how to determine which distributional frames are significant and which are not. In the classical model, categorization is meant to be automatic and determined without reference to semantics so that any linguistic context can in principle be used to define a distributional frame (see particularly works by Harris). In practice, of course, this ideal is not (and cannot be) met for all expressions in a language. The semantics of an expression is often used to determine whether it should be identified as a preposition or an adverb, a pronoun or a proper name, a verb or an adjective, before any distributional analysis is carried out. More problematic is the selection of significant distributional frames.
For reasons to do with selectional restrictions, register, and other factors, if categorization is determined by distributional frames that are allowed to mention specific words, then almost all expressions in a language, including major class ones, will define unique word classes, since they will appear in a unique set of contexts. Clearly, if this applies to lexical as well as functional expressions, then this is problematic from the point of view of the grammar, since in the worst case it requires the same number of distributional (E-)categories as lexical expressions, preventing significant generalizations from being made. To get around this problem, structural linguists have tended to use broad, and sometimes arbitrary, syntagmatic frames to define word classes. This methodological problem is one that led to the move away from distributional theories of categorization to ones that rely on abstract or notional properties. In fact, however, the difficulty disappears if categorization is determined, not with respect to all basic expressions in a language, but only with respect to the functional ones. As noted in section 2.1, the number of functional expressions is itself small, so that even if every functional expression in a language appears in a unique set of contexts, the number of different categories that need to be recognized will still be small (no greater than the number of functional expressions). Furthermore, because there are no operations that alter the syntactic environment
of functional expressions in the same way as for lexical ones, the number of significant contexts for any single functional expression, abstracting away from individual lexical expressions, will be small. Moreover, since functional expressions can appear with all members of an associated lexical class, and they coerce lexical expressions to be of the appropriate class in context, we may further abstract away from individual lexical expressions and refer only to major class labels. Thus, instead of classifying articles in English in terms of an indefinite number of frames [_ dog], [_ student], [_ hamster in a cage], and so on, they are classified in terms of the single frame [_ N]. In order that the distributional definitions of functional categories are not circularly reapplied to the definition of the major parts of speech (e.g., by taking the frame [the _] to identify particular lexical expressions as nouns), labels like N and V must be taken to be a priori categories that the class of functional expressions define extensionally. Thus, in English, whatever expression appears in construction with the, some, and so on, is necessarily (headed by) a noun or with may, will, and so on, is necessarily (headed by) a verb. In other words, E-functional categories are defined over the class of functional expressions and a small set of major class labels like N and V, the latter of which are universally given and hence may be considered to form part of the vocabulary of I-language. A restricted vocabulary, of course, does not guarantee that the set of distributional frames that needs to be considered will also be small or even finite. However, it seems (again because of the restricted syntactic distribution of functional expressions) that significant distributional frames will be in the region of two to four words in length.
In general, increasing the size of the context used to identify classes of functional expressions will have no effect on the membership of those classes.18 For example, with respect to the illustrative set of frames for part of the functional system in English (13a-k),19 frames like [_ V+ed the N] or [_ has been V+ing] will pick out exactly the same class of expressions as (13h); frames like [_ N of the N], [_ A N], [_ A A N], and so on, will pick out the same class as (13a), and so on.

(13) a. [_ N] = {the, a, every, much, no, my, your}N
     b. [_ N+s] = {the, some, many, few, all, no, those, my, your}N
     c. [the _ N+s] = {many, few}N
     d. [_ of the N] = {all, many, few, some, none}N
     e. [_ the N] = {all}N
     f. [_ V+s] = {he, she, it, this}N
     g. [_ V] = {you, they, I, we, those, these, many, several, few}N
     h. [_ V+ed] = {I, you, he, she, it, we, they}N
     i. [N+_] = {-s}N
     j. [V _] = {here, there}ADV
     k. [A+ _] = {-ly}ADV
(14) (Figure 1: subsumption lattice of distributional classes)
One of the interesting things to note about functional E-categories is that they cut very finely. For example, given the representative data about the functional expressions in the nominal field in (13a) to (13e), we find that the different distributional classes are not fully generalizable to all members of this subclass. Thus, although most of the expressions that satisfy the frame [_ of the N] (abstracting here away from number) also satisfy [_ N], at least one does not (i.e., none), and while most expressions that satisfy [_ V] satisfy [_ N] (and vice versa), not every relevant expression satisfies both (the personal pronouns satisfy the first but not the second, whereas the articles a, the, and possessive pronouns satisfy the second and not the first). However, some of the frames considered above do appear to be predictive: [_ of the N] predicts [_ V], and [the _ N] and [_ the N] predict [_ N] and [_ V] (when restricted to functional expressions, as we are doing). The intersection of the classes defined by [_ of the N] and [_ N] yields a further class. We can diagram these relations using the (subsumption) lattice in (14), where the nodes correspond to sets of expressions that can appear in a particular frame, to the intersection of classes defined by different frames, or to the complement of such intersections with respect to the two original sets. In this way, a complex array of distributional categories emerges. As one goes down the lattice, the categories (necessarily) become smaller, with all and none defining categories of their own. Indeed, if one cuts across the lattice with further properties (like syntactic number), then further differentiation occurs, with, for example, a and much being distinguished from the and no, and so on. Ultimately, the process leads to very small classes of expression, often containing only one member.
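The class relations just described can be checked mechanically against the frame data in (13). In this small sketch the variable names are expository conveniences, and "abstracting away from number" is implemented, as in the text, by uniting the singular and plural nominal frames:

```python
# Classes from (13), named here for convenience.
of_the_N = {"all", "many", "few", "some", "none"}                              # (13d) [_ of the N]
N_sg     = {"the", "a", "every", "much", "no", "my", "your"}                   # (13a) [_ N]
N_pl     = {"the", "some", "many", "few", "all", "no", "those", "my", "your"}  # (13b) [_ N+s]
the_N    = {"all"}                                                             # (13e) [_ the N]

# Abstracting away from number: the class of expressions occurring
# prenominally in either the singular or the plural frame.
N_any = N_sg | N_pl

# Most members of [_ of the N] also satisfy [_ N(+s)]; 'none' is the exception.
assert of_the_N - N_any == {"none"}

# [_ the N] sits below both classes in the lattice, leaving 'all'
# in a (singleton) class of its own.
assert the_N <= of_the_N and the_N <= N_any
```

Intersections and complements of such sets correspond to the nonbasic nodes of the lattice in (14); as the assertions show, each step down yields a smaller class.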
This approach to the categorization of functional expressions thus yields a set of relations between individual morphemes that essentially treats each such morpheme as syncategorematic (or equivalently acategorematic), with its syntactic interpretation given by its position within a distributional lattice like that shown in (14). It comes as no surprise in such a
view of functional categories that there will be expressions which are entirely sui generis and appear not to relate directly to other functional expressions (like perhaps the complementizer that; see Hudson, 1995). If it is the case that basic syntactic environments define a (meet semi-)lattice in terms of the elements that appear in them, then one only has to know the point at which a particular element is attached to the lattice to know its distribution. This is, of course, equivalent to defining a set of grammatical rules (of whatever sort) and assigning expressions to particular labels introduced by those rules, in the normal structuralist mode.20 It is not here important how the syntactic relations between the nodes on the lattice are determined and with what generality. What is important is that a structuralist distributional approach directly induces the categorization of functional expressions, both at and below word level, and, because of the syntactically restricted nature of such expressions, such an approach can in principle provide an exhaustive characterization of the restricted environments in which functional expressions can appear.

4.2. I-categories and E-projections

As noted above, distributional classes such as those shown in (13) define E-categories, since they are extensionally defined over the vocabulary of English. Clearly, in such a categorization, the principal categorial distinction must be between functional and lexical, since this provides the restriction on the given data that makes distributional categorization possible. The functional expressions essentially then define the E-categories of the lexical expressions through the use of universal major class labels.21 Although such an approach is in principle capable of yielding an exhaustive characterization of the strictly local dependencies of the vocabulary of a language, as it stands it determines only subclausal constituents.
Functional expressions do not provide sufficient information to enable distributionally defined phrases to be combined. Something more is needed that can induce the set of permitted combinations and presumably account for general, putatively universal, linguistic processes like unbounded dependencies and suchlike. Within transformational grammar, universal syntactic processing is assumed to operate only over I-language entities and so the relation between E-categories and I-categories becomes an important issue. One of the features of classifying functional expressions in terms of their distribution is that, because of their strict association with particular domains (nominal, verbal), basic labeling of phrases that are the output of the distributional grammar discussed in the last section can be done with respect to these domains, as indicated by the subscripts around the classes in (13). Thus, the different classes labeled N and V above are functional classes related to the universal I-categories noun and verb, respectively.22 Note that the I-categorial label is not equivalent to the E-categorial label used in the distribution frames themselves. Thus, we cannot
substitute the (or the N, or any pronoun) for N in the frames (13a) to (13d). In fact, we can usefully here distinguish between the E-category N (or V) and its I-category counterpart N (or V). If we take the position that these latter labels are the ones that are visible to Universal Grammar, then we may understand the combination of functional expressions with a lexical expression as recursively defining the resulting (complex) expression as being of the appropriate I-category. Functional classes may thus be construed as defining E-projections (to slightly modify the concept of extended projection of Grimshaw, 1991) of the major class label they contain. This is illustrated in (15) below, where the complex expressions are defined by the distributional grammar associated with the functional expressions, and the categorial label gives the resultant I-category. Note that it is not important exactly how (or whether) the internal structure of such phrases is represented. What is important is that the phrases are constructed from information provided by the E-categories of the functional expressions within a given language and that they are labeled with the I-category associated with the major class of the lexical expression they contain.

(15) a. [cat+s]N
     b. [the cat+s]N
     c. [all the cat+s]N

Through their associated I-category, E-projections are visible to Universal Grammar (however construed) and so may be combined through the syntactic operations that the grammar permits. One of the universal aspects of syntactic combination assumed in all current theories of syntax is the combination of lexical predicates and their arguments. Information about lexical argument structure necessarily comes from the lexical expression in an E-projection, as in (16), and E-projections may be combined by some tree-forming operation (like "Merge" in Chomsky, 1995), as illustrated in (17).23

(16) a. <[kick+ed]V, <AGENT, PATIENT>>
     b. <[have kick+ed]V, <AGENT, PATIENT>>
     c. <[may have kick+ed]V, <AGENT, PATIENT>>

(17) (Figure 2: combination of E-projections)

This view of the grammar, whereby the combination of a lexical expression with its associated functional structure is defined by distributional grammar, and further combination is done through the manipulation of major I-categories and
argument structure, provides a way to accommodate properties of linguistic expressions that are indicated by the psycholinguistic evidence and provides a solution to a number of the problems of characterizing functional categories discussed in section 2. From the processing point of view, the fact that functional expressions are associated with syntactic frames in a different way from lexical ones, and that they are strictly associated with syntactic frames, can explain why only lexical expressions prime homonyms; why the rejection of nonwords based on functional expressions is faster than that of nonwords based on lexical ones; and why the processing of functional expressions is not encapsulated from syntax, but that of lexical ones is. Furthermore, since the variables in distributional frames are associated with the major lexical classes, only these classes of expressions will be affected by spoonerisms. In terms of language breakdown, the association of functional expressions with local syntax means that the loss of such elements automatically entails the loss of their associated syntactic properties. Hence, in Broca's aphasia, what is left intact is the ability to manipulate argument structure, and so semantically coherent expressions can be constructed using only lexical expressions. In addition, if the representation of E-categories is essentially lexical, then particular functional expressions (and their associated syntax) may be lost, whereas other such expressions may be retained, giving rise to partial fluency. Hence, it is not necessary to assume (as did Ouhalla, 1991) that breakdown necessarily involves a complete functional class. The linguistic consequences of the approach also go some way to explaining the existence of expressions that show nonfunctional properties and why certain of the properties discussed in section 2 are not good indicators of functional status.
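The two-step architecture appealed to here, with E-projections built by language-particular distributional rules and then combined via their I-category label and the argument structure of the lexical head, can be sketched schematically. Everything in the sketch (the class names, the role-saturation convention, the Merge-like function) is a hypothetical rendering for illustration, not the chapter's own formalism:

```python
from dataclasses import dataclass

@dataclass
class EProjection:
    """A phrase built by the distributional grammar, labeled with the
    I-category of the lexical expression it contains (cf. (15))."""
    form: list                 # e.g. ["all", "the", "cat+s"]
    i_category: str            # universal label visible to UG: "N" or "V"
    arg_structure: tuple = ()  # thematic grid of the lexical head, cf. (16)

def merge(pred, arg):
    """A Merge-like tree-forming operation (cf. (17)): a verbal
    E-projection combines with a nominal one, saturating one role.
    The convention that the last-listed role is saturated first is an
    expository assumption."""
    assert pred.i_category == "V" and arg.i_category == "N"
    assert pred.arg_structure, "no unsaturated roles remain"
    return EProjection(
        form=pred.form + arg.form,
        i_category="V",
        arg_structure=pred.arg_structure[:-1],
    )

vp = EProjection(["may", "have", "kick+ed"], "V", ("AGENT", "PATIENT"))
np = EProjection(["the", "cat+s"], "N")
assert merge(vp, np).arg_structure == ("AGENT",)  # PATIENT saturated
```

The sketch makes the division of labor visible: the internal make-up of vp and np is fixed by the (language-specific) distributional grammar, while merge sees only the I-category labels and the thematic grid.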
In the first place, the property of closed versus open classes becomes mostly irrelevant. All functional classes are necessarily closed (and small), given the nature of E-categorization. However, all such classes are associated with some lexical (I-)category, and so the fact that certain functional expressions have the distribution of lexical classes is nonproblematic and expected, since the null environment (within an E-projection) is a possible environment (e.g., [there]ADV, [she]N, etc.). Second, nothing in the model prevents certain expressions that have semantic functions similar to those of functional expressions from being treated as lexical. So perhaps certain quantifiers may appear in the grammar as lexical nouns with argument structure (e.g., perhaps several), whereas others (e.g., every) are only associated with the I-category noun through their position in an E-projection. Provided that the semantic force of the two expressions can be expressed (which it must be able to be), the difference in syntactic status is immaterial. Furthermore, expressions that have both lexical and functional properties are not disallowed. Such an expression can be assigned to a major E-category (through its semantic sort) but also can be associated with functional domains. So, have
may be a verb through its association with the sort event, but may also be associated with distributional frames like [_ V+ed]V and so on. This predicts that polysemous expressions that cross the functional divide will show syntactic behavior that is not determined by whether the expression is being used as a lexical or a functional element; hence have shows the same mixture of auxiliary and main verb uses whether it occurs as a possessive verb or a causative marker. The general syntactic properties of functional expressions noted in section 2.3 also follow from this model. Because distribution is defined with respect to a major class label and not individual lexical items, a functional expression cannot differentiate between members of the class and so cannot select any subset of them to appear with. This property also predicts that coercion will always be to the class required by the functional expression, and not vice versa, and that functional expressions cannot coerce each other. Furthermore, if long-distance dependencies are determined by argument structure (as noted in footnote 6), then the extraction of parts of an E-projection will be impossible, predicting the ungrammaticality of *cats, Kim really thought Lou liked the.24 Finally, the difference in the syntactic operations that govern the construction of E-projections and their combination into clauses allows, but does not require, functional expressions to appear in syntactic contexts in which lexical expressions cannot. The strong differentiation made between functional expressions and lexical ones may also form the basis of an explanation for other properties noted above. For example, phonological and morphological reduction may be expected for functional expressions, given the close association between expressions in an E-projection and their consequent predictability, whereas lexical expressions are not predictable.
The proposal made above, which utilizes aspects of different syntactic theories in having the grammar partly defined by distributional rules and partly by more abstract properties of Universal Grammar, thus provides a potential basis of explanation for a whole range of phenomena that are problematic when approached from the viewpoint of a theory that envisages just one type of syntactic representation for all expressions in a language.

4.3. FUNCTIONAL I-categories

The picture of the grammar presented above, in which functional expressions define (distributionally determined) local domains over which universal grammatical principles operate, leaves out the relation between the E-categorial functional classes and the functional categories familiar from much recent syntactic theory. To relate the two notions, we might hypothesize the existence of an I-language category, FUNCTIONAL, which would consist of the nonmajor categories familiar from current transformational grammar, AGR, COMP, DET, TNS, and so on (i.e., a set of grammatical categories). The FUNCTIONAL categories are, however, independent of the language-particular morphs that somehow encode them,
since they are, by being objects in I-language, necessarily universal, whereas functional classes are language particular and defined solely through their distribution within the language and not according to their relation to some abstract linguistic property. The independence of I-language and E-language categories presents a particular problem for functional elements that is not apparent with lexical ones. The E-categorization of lexical expressions into nouns, verbs, and so on is determined by their co-occurrence with nominal and verbal functional elements (words or affixes). However, functional expressions do not classify lexical expressions into those classes, because the distributional definition of functional classes cannot, by hypothesis, refer to individual lexical items nor, as we have seen, can we classify lexical expressions according to distributional frames defined by the functional ones without circularity. Major class membership must thus be determined in some other way, presumably through basic ontological properties as suggested in notional definitions of the major parts of speech.25 The I-category associated with a lexical expression is thus determined by the I-category associated with a particular functional expression (or directly in the lexicon, if the expression can appear without any accompanying functional expression, such as adjectives and proper names in English). Its association with an E-category is, however, mediated by its semantic properties (such as its sort).26 Because of this, there is no particular problem in understanding the relation between the major E-language and I-language categories or relating lexical expressions with particular I-categories. However, this transparency of relatedness between I- and E-categories and between expressions and I-categories does not hold for the relationship between functional classes and FUNCTIONAL categories. 
Individual functional expressions, for example, typically encode more than one traditional grammatical category. Hence, while the article the in English could be considered to instantiate only the category of definiteness (18a), its indefinite counterpart encodes both (in)definiteness and number (being singular) (18b). The quantifier every encodes number (singular) and the fact that it is a quantifier (18c), whereas my encodes definiteness, agreement (pronominality), and possession (18d). However, distributionally these expressions form a functional class. What then is the relationship between this class and the FUNCTIONAL categories? Most obviously, the hypothesis should be that the functional class relates to the union of the FUNCTIONAL categories encoded by its members (18e) or to their intersection (18f). Unfortunately, neither of these potential solutions tells us anything useful, since not all the members of the class exhibit all the properties indicated, and there is no one property shared by every member of the class.

(18) a. the: [DEF]
     b. a: [DEF, NUM]
     c. every: [NUM, QNT]
     d. my: [DEF, POS, AGR]
     e. {the, a, my, ..., every} = [DEF, POS, AGR, NUM, QNT]
     f. {the, a, my, ..., every} = ∅

A further problem with the mapping between functional expressions and FUNCTIONAL categories has to do with the fact that certain expressions perform different grammatical functions according to their local syntactic context. For example, the morph -ed in English is interpreted either as perfect or passive (or adjectival) depending on whether it appears with the verb have or the copula be (or no verb at all). It is argued in Cann and Tait (1995) (and reiterated in Cann, 1999) that this morph is not homonymous between aspect and voice, but has a single interpretation (as an unaccusative state) whose other properties are determined by the elements with which it combines.27 If this is correct, then the mapping from individual functional expressions to FUNCTIONAL categories is not necessarily one-to-one and is thus nontransparent. It is not only the mapping from functional E-categories to FUNCTIONAL I-categories that is problematic, but so also is the reverse mapping from I-category to E-category. First, following from the observation above concerning the encoding of a number of grammatical categories by a particular functional expression, it is clear that a particular FUNCTIONAL category may be instantiated by a number of E-categories: agreement, for example, is distributed across nominal and verbal functional classes in many languages; definiteness may be distributed across articles, possessive pronouns, and certain quantifiers; and so on. More importantly, FUNCTIONAL categories may be realized not only by functional expressions (affixes or semi-bound forms like the articles in English) but also by lexical ones, which may or may not be in the process of grammaticalization. For example, TENSE in English may be realized by affixes (-ed, -s), auxiliary verbs (will), or fully lexical verbs (go as in be going to).
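The failure of both the union (18e) and the intersection (18f) to characterize the class can be verified directly from the feature assignments in (18a-d). In this sketch the feature labels are read as DEF(initeness), NUM(ber), QNT (quantifier), POS(session), and AGR(eement), and the set representation is an expository device:

```python
# Feature sets assigned to the individual expressions in (18a-d).
features = {
    "the":   {"DEF"},
    "a":     {"DEF", "NUM"},
    "every": {"NUM", "QNT"},
    "my":    {"DEF", "POS", "AGR"},
}

union = set().union(*features.values())               # cf. (18e)
intersection = set.intersection(*features.values())   # cf. (18f)

# The union names every category, but no member of the class bears them all...
assert union == {"DEF", "NUM", "QNT", "POS", "AGR"}
assert not any(feats == union for feats in features.values())

# ...and the intersection is empty: no single feature is shared by all members.
assert intersection == set()
```

The computation merely restates the text's point: neither set-theoretic construction over the members' features picks out a property that defines the distributional class.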
In Diyari, a number of TENSE and ASPECT distinctions are encoded by what appear to be full verbs followed by participles. For example, the habitual or intermediate past is indicated by the use of the verb wapa- meaning 'go', while pada- 'lie' indicates recent past, warn- 'throw' indicates immediate past, and wanti- 'search' indicates distant past (Austin, 1981:89). There is thus no direct correspondence between FUNCTIONAL category and functional expression. The Diyari example above also indicates a problem with FUNCTIONAL categories and their relation to functional classes that is part of a common concern for all universalist theories of linguistic categorization. As is well known, different languages often instantiate different values for a certain category (e.g., different types of past tense in Diyari), and no language morphosyntactically encodes every possible grammatical category. The question that arises is whether all the different values and all the different categories are to be considered universal. If so, then the theory of Universal Grammar requires every possible variation of a grammatical category to be at least immanently present in every human language, leading to further problems with regard to the representation of the nonovert categories within I-language. The position that all values of grammatical categories (or indeed all grammatical categories) are universal is not likely to be tenable, given the thousands of variations in the number and type of distinctions made crosslinguistically in all areas of the grammar. However, if categories like "distant past" are not universal, they must be represented as E-categories defined by the morphosyntax of the language concerned. Since I-categories and E-categories are defined independently of each other, this leads to the uncomfortable situation where some functional expressions within a language encode (universal) FUNCTIONAL categories, but others must contribute semantic information without the mediation of such an I-category. Whether or not it is possible to identify any "significant" universal grammatical categories that must exist independently of any sets of associated functional classes, the fact that at least some functional expressions remain unassociated with any FUNCTIONAL category raises the possibility that the content of such expressions is always input into the grammar without this sort of mediation. Considerations such as the one-to-many mapping between functional expressions and FUNCTIONAL categories, the failure of the latter to consistently map onto functional classes (or even functional expressions), and the problem of apparently language-specific functional categories lead to a view of the grammar where the latter have no independent syntactic status. Indeed, one might hypothesize that if FUNCTIONAL categories are dissociated from distributional criteria (and thus any direct connection with functional classes), then all that is left of their content are the semantic functions they perform.
Since such functions vary across functional expressions in a single language and across different languages, it may be that the categories themselves are not independently significant, and the content of functional expressions is projected directly into the semantic content of the expression without augmenting the syntactic information of the label of the expression (E-projection), as illustrated in (19).28

(19) a. <[cat+s]N, {more-than-1(x), indiv(x), cat(x)}>
     b. <[the cat+s]N, {def(u)[u: more-than-1(u), indiv(u), cat(u)]}>
     ...
     <[may kick]V, {possible(s)[s: process(e), kick(e,x,y)]}, ...>
The difficulties found in mapping between functional E-categories and FUNCTIONAL I-categories thus point to the conclusion that there is no need to posit the existence of a FUNCTIONAL I-category with its associated set of universal grammatical categories, and lend support to Hudson's (this volume) contention that the Function Word Category is not linguistically significant. However, the case against there being such a set cannot really be made on theory-independent
Ronnie Cann
grounds. It may, therefore, be necessary in certain frameworks to posit a set of FUNCTIONAL I-categories in order to account for putative universal relations between them. If such universals are part of the grammar, and are not just emergent properties of grammatical systems in use, as argued in Kirby (1998), then one could take the position that particular functional expressions do contribute information about grammatical categories. However, the significant distinction made here between functional expressions and grammatical categories still requires that such categories should not have independent syntactic status. Instead, one would have to treat functional expressions as augmenting the label of an E-projection with the labels of the FUNCTIONAL categories with which it is associated, as illustrated in (20). Such labels would be manipulable by universal syntactic operations, but it would still be the case that functional expressions would not head FUNCTIONAL projections, and these would therefore not be manipulable independently of the major class label. There can therefore be no head movement (although there could still be movement to a "specifier" position, and features [grammatical categories] could still be checked).29

(20) a. [cat+s]{N,AGR}
     b. [the cat+s]{N,NUM,DET}
     c. [have kick+ed]{V,ASP}
5. CONCLUSION

In this chapter, it is argued that there is a primary categorial division between functional and lexical expressions, but that this is defined at a language-specific, and not a universal, level of linguistic description. A model is proposed in which functional classes (a notion of E-language) are defined distributionally and themselves determine the local syntactic domains in which lexical expressions are inserted. By defining the distribution of major lexical classes and giving rise to extended projections, functional expressions help to define the map between E-language and I-language. These domains (E-projections) are associated with a universal syntactic label (an I-category) such as N, V, and so on, which is manipulable by principles of Universal Grammar. The content of functional expressions is mapped directly onto the semantic content of the projection (or onto a restricted set of universal FUNCTIONAL categories that augment major class labels). Although the theory of grammatical structure presented here is not directly translatable into any of the current theories of syntax (except perhaps Word Grammar, Hudson, 1990, and LDSNL, Kempson, Meyer-Viol, and Gabbay, this volume), it is not fundamentally inconsistent with any of them and could be adapted
Functional versus Lexical: A Cognitive Dichotomy
to a minimalist transformational framework or Head-driven Phrase Structure Grammar (HPSG). However, it does have a number of consequences that require the rethinking of several hypotheses concerning putative universal relations governing grammatical categories, since in the model proposed above, any such relations cannot be directly stated in the syntax. In the first place, and most importantly, it follows that because there are no universal morphosyntactic FUNCTIONAL categories, it cannot be the case that basic functional structures are universally determined. The E-projections that provide the functional baggage of a lexical term will necessarily vary from language to language, and so the grammatical categories associated with projections (if any) are also likely to vary. Following on from this is the consequence that there can be no theory of linguistic parameters of the sort envisaged in Ouhalla (1991) and other work. Even if there is a set of FUNCTIONAL categories, these cannot have a separate syntactic existence, as argued in the last section. Hence, the I-category of TENSE cannot itself select some other category such as AGR or v, since it can never appear independent of the E-categories that encode it, and these are themselves not in the domain of I-linguistic principles. Parameterization can therefore only be defined over the labels of E-projections, which, as already stated, are likely to vary from language to language, depending on the functional expressions in the language and the set of grammatical categories that they encode. A further consequence is that categorial distinctions not morphologically present in a particular language cannot be used in the analysis of that language. Thus, use of independent agreement categories (either subject or object) is not licensed for descriptions of English, as there is no way that subject agreement can be differentiated from tense and no object agreement is ever manifested.
Tait and Cann (1990) suggest that this restriction follows from a principle they refer to as the PF-Licensing Principle, which requires all nodes in a syntactic tree to have a phonological "signature" of some sort.30 Such a principle follows automatically from the conclusions reached here and implies a very strict constraint on the appearance of empty functional categories within syntax, which would exclude much of Kayne's (1994) account of word order derived from a universal underlying SVO order. The ideas presented above thus clearly present certain difficulties for current transformational syntax, but they embody observations about processing behavior not usually incorporated into linguistic theory. The separation of I-language from E-language, the recognition that properties of the latter may be significant, and the hypothesis that the functional-lexical dichotomy is linguistically significant but separated from universal aspects of language enable the development of a view of grammar that may one day reconcile the apparently conflicting hypotheses about linguistic structure that result from experimental psycholinguistic investigation and from the arguments of theoretical syntacticians.
NOTES

1. This is a revised version of a paper delivered to the International Conference on Syntactic Categories, which is based on a working paper given in the Department of Linguistics at the University of Edinburgh. I am grateful to Bob Borsley, Caroline Heycock, Dick Hudson, Jim Hurford, Jim Miller, Louise Kelly, Ruth Kempson, Simon Kirby, Richard Shillcock, and an anonymous referee for helpful comments and suggestions on earlier drafts of this chapter. I doubt if any of the above would agree wholeheartedly with what follows.
2. But see Chomsky (1995) for a rejection of Agr as an independent functional category.
3. I am grateful to R. A. Hudson for bringing these examples to my attention.
4. In other words, "must be followed by a phrase headed by."
5. This is true for the functional instantiations of expressions that have both functional and lexical uses.
6. Witness the ECP of Chomsky (1981), etc., the Slash Termination Metarule of Gazdar et al. (1985), and Kempson's (1995) type-theoretic analysis of gaps.
7. Witness the current vogue for verbalizing common nouns in American English.
8. In languages exhibiting more inflection than English, the freedom of content words to appear in different domains is limited. However, if one looks at roots, as opposed to stems or words, then it is often found that the same freedom in syntactic category is exhibited (Sasse, 1993:653).
9. Generic pronouns like one in English or Man in German are not counterexamples to this. Such elements remain third-person singular, even when their interpretation may range over all persons and numbers. They could not be described as semantic superordinates of all pronouns.
10. For discussion of other ways that AGR might contribute to semantic interpretation, see Adger (1994) and Cormack (1996).
11. Even the often-cited preposition with in sentences like Kim credited Lou with more intelligence retains a reflex of its comitative uses in that it identifies a property associated with (or predicated of) the direct object by the subject.
12. I ignore the infinitive marker, for convenience, but see Miller (1986) for an argument that this is a reflex of the same element.
13. In most dialects, the lexeme has either become fully grammaticalized as a support verb, with its main verb possessive function being replaced by the periphrastic have got, or developed into a full homonym, with the possessive function failing to display any auxiliary properties at all.
14. The grammatical judgments here are my own and those of other speakers of Standard British English in Scotland. That there is wide variation in the grammatical properties shown by the verb have in its different uses is not of significance here. What is important to note is that expressions in the process of grammaticalization may show a mismatch between their semantic and syntactic development as functional expressions.
15. A theory of the acquisition of verbal morphology along these lines was presented in a talk by the author and M. E. Tait to the Linguistics Association of Great Britain and at a number of places in the UK in 1991.
16. The data from comprehension are more difficult to assess; see Bastiaanse (1995) for some discussion.
17. See also Gerken (1996) and Gerken and McIntosh (1993) for a discussion of morphophonological properties that enable children to acquire this distinction.
18. This may be significant for first language acquisition, if assumptions about learning such as those made in Elman (1993) are valid.
19. Note further that the frames abstract away from allomorphy (i.e., affixes refer to morphemes, not morphs), and so we have s for the plural morpheme, and so on. This is not necessary, but simplifies the exposition.
20. The latter is essentially the approach taken in Generalized Phrase Structure Grammar (GPSG), where different subcategorization environments for functional expressions are labeled uniquely using a number. The functional lexicon then specifies for each functional expression the subcategorization numbers associated with it, as illustrated for all in (1a-d) (cf. Gazdar et al., 1985).

(1) a. N″ → H′, {SUBCAT: 21}       (all boys)
    b. N″ → H[DEF]′, {SUBCAT: 22}  (all the boys)
    c. N″ → H[DEF, of]′, {SUBCAT: 23}  (all of the boys)
    d. N″ → H[SUBCAT: 24]          (all)

21.
Note that such a categorization of lexical categories is independent of semantic or notional considerations and does not permit ideas of greater or lesser prototypicality among members of the major categories like noun and verb (see Newmeyer, this volume, for a critique of approaches to linguistic categorization based on prototypes).
22. Hence, we follow Hudson (this volume) in claiming that the is a noun that appears with another noun, but it is so in a very different sense to the way in which father is a noun that appears with another noun. In the latter case, the transitivity results from the fact that father denotes a two-place relation, whereas the transitivity of the results from its distributional properties.
23. The information about argument structure shown in (16) and below is given in terms of θ-roles. The actual representation is clearly not important, and any of the means of representing this information in different theories could be used (e.g., as types as in Kempson, Meyer-Viol, and Gabbay, this volume, or as the ARG-S list in HPSG, Manning and Sag, 1995). All that is important is that arguments become saturated as syntactic combination proceeds.
24. The fact that prepositions and auxiliaries in English permit long-distance dependencies must result from their double status as both functional and lexical.
25. Alternatively, while expressions (roots or words) in the lexicon may be associated with a single major class (particularly derived expressions, e.g., prevarication (N): *Mary prevaricationed versus prevaricate (V): *the prevaricate of the lecturers was very irritating), it may be the case that this information is accidental in the same way that words like trousers are accidentally syntactically plural and not central to their syntactic definition.
26. Such an approach allows for lexical expressions to be "coerced" into different syntactic categories, since it will be the semantics that mediates the assignment to an E-category, and thus to an I-category.
Assuming that major I-categories are associated with semantic
sorts, where the semantics of an expression is consistent with being expressed by different sorts, the expression may be assigned to a number of major E-categories.
27. A similar story may be put forward for the progressive and gerundive interpretations of verbs with the suffix -ing.
28. The details of the semantic representations below are not important and are for illustrative purposes only. What is significant is that the information provided by the functional expressions does not change the syntactic status (categorial label or argument structure) of the phrase.
29. The picture that emerges here is presaged in the discussion in Chomsky (1995, chapter 4) with regard to the lack of independent AGR nodes.
30. See Tait (1991), Cann (1993), and Cann and Tait (1995) for further discussion, and see Speas (1995) for similar ideas.
REFERENCES

Abney, S. P. (1987). The English noun phrase in its sentential aspect. Unpublished doctoral dissertation, Massachusetts Institute of Technology.
Adger, D. (1994). Functional heads and interpretation. Unpublished Ph.D. dissertation, University of Edinburgh.
Adger, D., and Rhys, C. S. (1994). Argument structure and the English gerund. In C. S. Rhys, D. Adger, and A. von Klopp (Eds.), Edinburgh working papers in cognitive science, Vol. 9: Functional categories, argument structure and parametric variation (27-48). University of Edinburgh, Centre for Cognitive Science.
Anderson, J. M. (1997). A notional theory of syntactic categories. Cambridge: Cambridge University Press.
Austin, P. (1981). A grammar of Diyari, South Australia. Cambridge: Cambridge University Press.
Bastiaanse, R. (1995). Broca's aphasia: A syntactic and/or a morphological disorder? A case study. Brain and Language, 48:1-32.
Benthem, J. van. (1986). Essays in logical semantics. Dordrecht: D. Reidel.
Besner, D. (1988). Visual word recognition: Special purpose mechanisms for the identification of open and closed class items? Bulletin of the Psychonomic Society, 26:91-93.
Bittner, M., and Hale, K. (1996). The structural determination of case and agreement. Linguistic Inquiry, 27:531-604.
Bloom, L. (1970). Language development: Form and function in emerging grammars. Cambridge, MA: MIT Press.
Bloomfield, L. (1933). Language. New York: H. Holt and Co.
Bolinger, D. (1975). Aspects of language (2nd ed.). New York: Harcourt Brace Jovanovich.
Bowerman, M. (1973). Early syntactic development: A cross-linguistic study with special reference to Finnish. Cambridge: Cambridge University Press.
Bradley, D. (1978). Computational distinctions of vocabulary type. Cambridge, MA: MIT Press.
Bradley, D., Garrett, M. F., and Zurif, E. B. (1980). Syntactic deficits in Broca's aphasia. In D. Caplan (Ed.), Biological studies of mental processes. Cambridge, MA: MIT Press.
Cann, R. (1984). Features and morphology in GPSG. Unpublished doctoral dissertation, University of Sussex.
Cann, R. (1993). Patterns of headedness. In G. Corbett, N. M. Fraser, and S. McGlashan (Eds.), Heads in grammatical theory (44-72). Cambridge: Cambridge University Press.
Cann, R. (1999). Specifiers as secondary heads. In D. Adger, S. Pintzuk, B. Plunkett, and G. Tsoulas (Eds.), Specifiers: Minimalist approaches (21-45). Oxford: Oxford University Press.
Cann, R., and Tait, M. E. (1995). Raising morphology. In C. S. Rhys, D. Adger, and A. von Klopp (Eds.), Edinburgh working papers in cognitive science, Vol. 9: Functional categories, argument structure and parametric variation (1-23). University of Edinburgh, Centre for Cognitive Science.
Chomsky, N. (1981). Lectures on government and binding. Dordrecht: Foris Publications.
Chomsky, N. (1986). Knowledge of language. New York: Praeger.
Chomsky, N. (1995). Categories and transformations. In The minimalist program. Cambridge, MA: MIT Press.
Cinque, G. (1998). Adverbs and functional heads: A crosslinguistic perspective. Oxford: Oxford University Press.
Cormack, A. (1996). Without specifiers. In D. Adger et al. (Eds.), Specifiers. Oxford: Oxford University Press.
Cruse, D. (1986). Lexical semantics. Cambridge: Cambridge University Press.
Cutler, A., and Carter, D. M. (1987). The predominance of strong initial syllables in the English vocabulary. Computer Speech and Language, 2:133-142.
Cutler, A., and Norris, D. G. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception and Performance, 14:113-121.
Davidson, D. (1967). The logical form of action sentences. In N. Rescher (Ed.), The logic of decision and action. Pittsburgh: University of Pittsburgh Press.
Demuth, K. (1994). On the underspecification of functional categories in early grammars. In B. Lust, M. Suñer, and J. Whitman (Eds.), Syntactic theory and first language acquisition: Cross-linguistic perspectives. New Jersey: Lawrence Erlbaum.
Elman, J. L. (1993). Learning and development in neural networks: The importance of starting small. Cognition, 48:71-99.
Emonds, J. (1976). A transformational approach to English syntax. New York: Academic Press.
Gabbay, D., and Kempson, R. (1992). Natural language content: A proof-theoretic perspective. In Proceedings of the 8th Amsterdam Semantics Colloquium. Amsterdam.
Garrett, M. (1976). Syntactic processes in sentence production. In R. Wales and E. Walker (Eds.), New approaches to language mechanisms. Amsterdam: North-Holland.
Garrett, M. (1980). Levels of processing in sentence production. In B. Butterworth (Ed.), Language production (Vol. 1). London: Academic Press.
Gazdar, G., Klein, E., Pullum, G. K., and Sag, I. A. (1985). Generalized phrase structure grammar. Oxford: Basil Blackwell.
Gerken, L. (1996). Phonological and distributional information in syntax acquisition. In Morgan and Demuth (1996), 411-425.
Gerken, L., and McIntosh, A. B. J. (1993). The interplay of function morphemes and prosody in early language. Developmental Psychology, 29:448-457.
Goodglass, H. (1976). Agrammatism. In H. Whitaker and H. A. Whitaker (Eds.), Studies in neurolinguistics (Vol. 1). New York: Academic Press.
Gordon, B., and Caramazza, A. (1982). Lexical decision for open- and closed-class words: Failure to replicate differential frequency sensitivity. Brain and Language, 15:143-160.
Gordon, B., and Caramazza, A. (1985). Lexical access and frequency sensitivity: Frequency saturation and open/closed class equivalence. Cognition, 21:95-115.
Grimshaw, J. (1990). Argument structure. Cambridge, MA: MIT Press.
Grimshaw, J. (1991). Extended projection. Unpublished manuscript, Brandeis University.
Harris, Z. (1951). Methods in structural linguistics. Chicago: University of Chicago Press.
Hendrick, R. (1991). The morphosyntax of aspect. Lingua, 85:171-210.
Hjelmslev, L. (1953). Prolegomena to a theory of language. Bloomington, IN: Indiana University Press.
Hockett, C. F. (1954). Two models of grammatical description. Word, 10:210-233.
Hopper, P. J., and Traugott, E. C. (1993). Grammaticalization. Cambridge: Cambridge University Press.
Hudson, R. A. (1990). English word grammar. Oxford: Basil Blackwell.
Hudson, R. A. (1995). Competence without Comp? In B. Aarts and C. Meyer (Eds.), The verb in contemporary English (40-53). Cambridge: Cambridge University Press.
Joshi, A. K. (1985). Processing of sentences with intrasentential code switching. In D. R. Dowty, L. Karttunen, and A. M. Zwicky (Eds.), Natural language parsing. Cambridge: Cambridge University Press.
Kayne, R. (1994). The antisymmetry of syntax. Cambridge, MA: MIT Press.
Kempson, R. (1995). Natural language interpretation as labelled natural deduction. In F. R. Palmer (Ed.), Grammar and meaning. Cambridge: Cambridge University Press.
Kempson, R. (1996). Crossover: A dynamic perspective. In S. Jensen (Ed.), SOAS Working Papers 6.
Kirby, S. (1998). Function, selection and innateness: The emergence of language universals. Oxford: Oxford University Press.
Langacker, R. (1987). Nouns and verbs. Language, 63:53-94.
Lyons, J. (1966). Towards a notional theory of the "parts of speech." Journal of Linguistics, 2:209-236.
Manning, C. D., and Sag, I. A. (1995). Dissociations between argument structure and grammatical relations. Unpublished manuscript, Carnegie Mellon University and Stanford University.
Matthei, E. H., and Kean, M.-L. (1989). Postaccess processes in the open vs. closed class distinction. Brain and Language, 36:163-180.
Miller, J. E. (1986). Semantics and syntax. Cambridge: Cambridge University Press.
Morgan, J. L., and Demuth, K. (Eds.). (1996). Signal to syntax. Mahwah, NJ: Lawrence Erlbaum.
Morgan, J. L., Shi, R., and Allopenna, P. (1996). Perceptual basis of rudimentary grammatical
categories: Toward a broader conceptualization of bootstrapping. In Morgan and Demuth (1996), 263-283.
Ouhalla, J. (1991). Functional categories and parametric variation. London: Routledge.
Pittock, A. G. M. (1992). Knowledge elicitation, semantics and inference. Unpublished Ph.D. dissertation, University of Edinburgh.
Pollock, J.-Y. (1989). Verb movement, universal grammar and the structure of IP. Linguistic Inquiry, 20:365-424.
Pollard, C., and Sag, I. A. (1994). Head-driven phrase structure grammar. Chicago: University of Chicago Press.
Pullum, G. K., and Wilson, D. (1977). Autonomous syntax and the analysis of auxiliaries. Language, 53:741-788.
Pulvermüller, F., and Preissl, H. (1991). A cell assembly model of language. Network, 2:455-468.
Pulvermüller, F., and Schumann, J. H. (1994). Neurobiological mechanisms of language acquisition. Language Learning, 44:681-734.
Quirk, R., Greenbaum, S., Leech, G., and Svartvik, J. (1972). A grammar of contemporary English. London: Longman.
Radford, A. (1990). Syntactic theory and the acquisition of English syntax. Oxford: Basil Blackwell.
Rhys, C. S. (1993). Functional projections and thematic role assignment in Chinese. Unpublished doctoral dissertation, University of Edinburgh.
Ritter, E. (1991). Two functional categories in noun phrases. Syntax and Semantics, 25:37-62.
Saffran, E., Schwartz, M., and Marin, O. (1980). The word order problem in agrammatism: Production. Brain and Language, 10:263-280.
Sasse, H.-J. (1993). Syntactic categories and subcategories. In J. Jacobs (Ed.), Syntax: An international handbook of contemporary research (646-686). Berlin: Walter de Gruyter.
Schwartz, M., Saffran, E., and Marin, O. (1980). The word order problem in agrammatism: Comprehension. Brain and Language, 10:249-262.
Shillcock, R. C., and Bard, E. G. (1993). Modularity and the processing of closed-class words. In G. Altmann and R. Shillcock (Eds.), Cognitive models of speech processing (163-185). Cambridge, MA.
Speas, M. (1995). Economy, agreement and representation of null arguments. Unpublished manuscript, University of Massachusetts.
Stowell, T. (1981). Origins of phrase structure. Unpublished doctoral dissertation, MIT.
Tait, M. E. (1991). The syntactic representation of morphological categories. Unpublished doctoral dissertation, University of Edinburgh.
Tait, M. E., and Cann, R. (1990). On empty subjects. In The Proceedings of the Workshop on Parametric Variation. Centre for Cognitive Science, University of Edinburgh.
Tait, M. E., and Shillcock, R. C. (1992). An annotated corpus of crosslinguistic non-fluent aphasic speech. ESRC report, Centre for Cognitive Science/Dept. of Linguistics, University of Edinburgh.
Tanenhaus, M. K., Leiman, J. M., and Seidenberg, M. S. (1987). Context effects in lexical processing. In U. Frauenfelder and L. Tyler (Eds.), Spoken word recognition.
Taylor, J. R. (1989). Linguistic categorization: Prototypes in linguistic theory. Oxford: Clarendon Press.
Tsimpli, M. I. (1995). Focusing in Modern Greek. In K. É. Kiss (Ed.), Discourse configurational languages (176-206). Oxford: Oxford University Press.
Verkuyl, H. J. (1993). A theory of aspectuality. Cambridge: Cambridge University Press.
Villiers, J. de, and Villiers, P. de. (1978). Language acquisition. Cambridge, MA: Harvard University Press.
FEATURE CHECKING UNDER ADJACENCY AND VSO CLAUSE STRUCTURE

DAVID ADGER
Department of Language and Linguistic Science
University of York
Heslington, York
United Kingdom
1. INTRODUCTION

This chapter1 examines the way that syntactic features, the building blocks of syntactic categories, are dealt with by interface systems. The core proposal is that the syntax-morphology and syntax-semantics interfaces both invoke feature interpretability, but that the configurations in which they do so are different: the LF interface interprets syntactic features that are in spec-head or head-adjoined relations, whereas the interface with morphology, in addition, interprets syntactic features in adjacency relations. The main empirical focus is on the position of subjects in Irish and Scottish Gaelic. There is strong evidence that these VSO languages are SVO underlyingly, with V raising to a higher functional position (McCloskey, 1983, and subsequent work). Further evidence points to this position being the highest functional head below C (McCloskey, 1996a). The descriptive question, then, is whether the subject remains in situ in the verb phrase (VP) or whether it raises to the specifier of a higher functional head. A deeper question is how the subject is licensed: does it simply procrastinate (raising to its licensing position at LF), as argued by Bobaljik and Carnie (1996), or does it raise overtly to a specifier position to check case (McCloskey, 1996b) or EPP requirements? This chapter will show that the solution to these problems bears upon a more fundamental question: what mechanisms are at play in determining the surface
position of DP arguments, and are these mechanisms purely syntactic or do they involve one or other of the interface systems? The standard minimalist line is that the position of an argument is determined by the strength of the features on its case checker (this theoretical viewpoint is taken by both Bobaljik and Carnie, 1996, and by McCloskey, 1996b). In this chapter I return to an older conception of what gives rise to the surface position of at least some arguments and argue that morphological requirements on the interpretability of features play a significant role. More specifically, I will argue that the final position of subjects in Irish and Scottish Gaelic is an adjacency configuration that results from movement motivated by the need to give a morphological interpretation to case features. This position is somewhat of a hybrid between McCloskey (1996b) and McCloskey (1991). In the latter, McCloskey argues that the subject remains inside VP, where it is governed by the V in I; in the former he argues that the subject raises into the specifier of an optional agreement head. The empirical data that shows that the subject raises is extremely strong, but that does not solve the problem of how to characterize the target of the raising or its motivation. I argue here that the subject raises from the VP into a position where it is immediately subjacent to the finite inflection, and that this movement takes place to satisfy morphological interpretability requirements on Case. This proposal, of course, has broader empirical implications, some of which will be briefly explored. The chapter is organized as follows: section 2 outlines a particular way of conceptualizing syntactic feature checking that treats checking and the locality configurations involved in checking as ways of rendering LF uninterpretable features acceptable to the Conceptual-Intentional interface. The idea is that a checking configuration always allows a feature to be interpreted. 
I then show how that same conception of feature checking can be applied to morphologically (un)interpretable features, and propose that a relevant configuration here is one of adjacency. The general consequences of adopting this framework are briefly explored. In section 3 the particular problem posed by subjects in VSO structures is introduced, and a number of analytical and theoretical problems with McCloskey's (1996b) analysis are discussed. The alternative proposal is that the subject raises in VSO in order to set up an adjacency configuration with its case assigner. A range of favorable consequences follow. The chapter concludes with a discussion of the idea that the notion of interpretability of syntactic features should be generalized to both interfaces.
2. FEATURE CHECKING

2.1. Checking Features and the LF Interface

Chomsky (1995, chapter 4) conceives of movement in the following way (see Adger et al., 1999, for discussion):2
(1) A head H in a structure Σ with a feature f attracts a feature f′ on a substructure α of Σ iff:
    a. H c-commands α
    b. f′ is the closest feature to f
    c. f is uninterpretable
    d. the sets of features on H and α are compatible

The attracted feature is then marked as deleted. If this feature is itself uninterpretable, then it is erased from the final representation. (1c) may be too strict, in that the uninterpretable feature may be on the moved element, rather than the head (in which case we have movement, rather than attraction). The structure that results from application of Move/Attract must be a checking relation, where x and y are in a checking relation if x is a specifier of y, adjoined to y, or adjoined to a projection or a specifier of y. The notions of deletion and erasure here are, I think, not entirely helpful. In fact "deleted" features are never actually deleted (i.e., removed from the representation), and only uninterpretable features are ever erased. This suggests that interpretability is the core notion here, and that the mechanism for dealing with uninterpretable features is one that allows them to be interpreted, rather than deletes them. Let us suppose instead that part of the specification of the LF component is a partial function, i, which applies to a syntactic feature F to give an interpretation for F (which we will notate as [[F]], following standard model-theoretic practice). In general, i(F) may be one of three things: it may be zero, it may be some element of the conceptual-intentional system, or it may be undefined. If i(F) is undefined, then we have a violation of Full Interpretation and an unacceptable sentence. Note that this conception of the interface is rather different from Chomsky's, where there is no place for a zero interpretation, but there is a syntactic mechanism that deletes and erases features.
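As an illustration only, the conditions in (1) can be sketched as a small search procedure: a head attracts the closest feature-bearer it c-commands, provided the head's own feature is uninterpretable and the two feature sets are compatible. The tree encoding, the string values "uninterpretable"/"interpretable", and the helper names below are my own invention, not part of the chapter's formalism.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    """A toy syntactic node; features maps a feature name to a status string."""
    label: str
    features: dict
    children: list = field(default_factory=list)
    parent: Optional["Node"] = None

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return self

def c_commanded(head):
    """Yield (node, depth) for everything the head c-commands:
    its sisters and all nodes they dominate -- condition (1a)."""
    if head.parent is None:
        return
    for sister in head.parent.children:
        if sister is head:
            continue
        stack = [(sister, 0)]
        while stack:
            node, depth = stack.pop()
            yield node, depth
            for ch in node.children:
                stack.append((ch, depth + 1))

def attract(head, f):
    """Return the node whose feature f the head attracts, or None."""
    if head.features.get(f) != "uninterpretable":   # condition (1c)
        return None
    bearers = [(depth, node) for node, depth in c_commanded(head)
               if f in node.features]
    if not bearers:
        return None
    _, closest = min(bearers, key=lambda pair: pair[0])  # condition (1b)
    # condition (1d): no conflicting values between the two feature sets
    compatible = all(closest.features.get(g) in (None, val)
                     for g, val in head.features.items() if g != f)
    return closest if compatible else None
```

For instance, a T head bearing an uninterpretable D-feature, with a DP inside its sister VP, attracts that DP; a head lacking an uninterpretable f attracts nothing.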
Allowing zero interpretations is a conceptual simplification here, since we need such interpretations anyway, if the syntax-semantics map is to be compositional (see, for example, Heim and Kratzer, 1998). How does this set of assumptions bear on the question of movement? As in Chomsky (1995), movement is still conceived of as an operation that brings into a local relationship features that are scattered around the structure. The result is that any uninterpretable features are then close enough to matching interpretable ones that the interface system does not balk at their presence. A checking configuration is, then, just a way of allowing an uninterpretable feature to receive an interpretation via the interpretation of a matching interpretable feature. I will be agnostic here as to exactly what the conceptual-intentional (C-I) system is, although I conceive of it as a DRT-like representation (see Adger, 1994, for one way of thinking about this). The interface is structured as follows: (2)
Let i be a partial function from syntactic features to elements of the conceptual-intentional system, and let be a set of syntactic features
82
David Adger
{F1, . . . , Fn, G1, . . . , Gn, . . .} in a syntactic structure, where Fi matches Fi+1; then,

(3) Where Fi ∈ F is in a checking relation with Fj ∈ F:
a. If Fi is an uninterpretable feature, then i(Fi) is dependent on i(Fj).
b. If Fi is an interpretable feature, then i(Fi) = [[Fi]].

(4) Where Fi ∈ F is not in a checking relation with any Fj ∈ F:
a. If Fi is an uninterpretable feature, then i(Fi) is undefined.
b. If Fi is an interpretable feature, then i(Fi) = [[Fi]].

The definition of checking relation is as in Chomsky (1995) (i.e., specifier or adjoined position), matching features are simply features that do not conflict in their specification, and "is dependent on" is a cover term for whatever range of relations the structure of the C-I interface makes available. For example, Adger (1994) argues that the interpretation of a DP argument in the specifier of an agreement head at LF is dependent on the interpretation of agreement as a pronominal element. This means that semantic constraints on pronouns have to be obeyed by the DP, resulting in such DPs being interpreted as discourse familiar, and so on (see also Adger, 1995, 1996c; Meinunger, 1995; Runner, 1995).

In this system a checked uninterpretable feature will be provided with an interpretation as part of its checker; an unchecked uninterpretable feature cannot be provided with an interpretation and will violate Full Interpretation (FI), leading to unacceptability. For example, an uninterpretable D-feature on T (the EPP feature) is checked by the interpretable D-feature on the DP in [Spec, TP]. The D-feature on T is then interpreted as though it were part of the DP chain. If there is no DP in [Spec, TP] by LF, then there is no way of checking the D-feature on T, and the result is an ill-formed representation. In such a system movement always takes place to check uninterpretable features.
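As a concreteness aid, the licensing rules just stated can be sketched in code. This sketch is my own and not part of the chapter: the `Feature` class, the `interpret` function, and the string denotations are all hypothetical names; `None` stands in for an undefined i(F), i.e., a Full Interpretation violation.

```python
# Illustrative sketch only (not the chapter's formalism): a feature is
# either interpretable, with a denotation [[F]], or uninterpretable.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Feature:
    name: str                          # e.g. "D", "Case", "Tense"
    interpretable: bool                # LF-interpretable or not
    denotation: Optional[str] = None   # [[F]] for interpretable features

def interpret(f: Feature, checker: Optional[Feature]):
    """Compute i(f); checker is a matching feature that f stands in a
    checking relation with, or None if f is unchecked."""
    if f.interpretable:
        # (3b)/(4b): interpretable features denote [[F]], checked or not
        return f.denotation
    if checker is not None and checker.interpretable:
        # (3a): a checked uninterpretable feature is interpreted as
        # dependent on the interpretation of its interpretable checker
        return ("dependent-on", checker.denotation)
    # (4a): an unchecked uninterpretable feature receives nothing:
    # i(f) is undefined, violating Full Interpretation
    return None

# The EPP case discussed above: the uninterpretable D-feature on T is
# rescued by the interpretable D-feature of the DP in [Spec, TP].
d_on_T = Feature("D", interpretable=False)
d_on_DP = Feature("D", interpretable=True, denotation="DP-chain")

assert interpret(d_on_T, d_on_DP) == ("dependent-on", "DP-chain")
assert interpret(d_on_T, None) is None  # FI violation: ill-formed
```

The point of the sketch is only that interpretability, not deletion, does the work: nothing is ever removed from the representation; checking merely supplies an interpretation route.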
We can think of this in constructive rather than licensing terms: Move/Attract F applies to form a checking relation whenever i(F) is undefined. Note that this assumes that the calculation of i(F) takes place in parallel with each syntactic operation involving F, and therefore requires some interweaving of syntactic and semantic demands in a way familiar from formal semantics. I have said nothing yet here about "strength," only interpretability. Under Chomsky's system, a feature that is marked as "strong" must be eliminated immediately from the derivation. We will return to this proposal below, and argue that "strength" is just uninterpretability by the morphological component, much as in Chomsky (1993).

2.2. Checking Features and Morphology

Let us assume a postsyntactic morphology situated on the way to PF (Halle and Marantz, 1993). If the PF and LF wings of the grammar are structured similarly, we might expect that M(orphologically)-uninterpretable features need to be checked as well. This is exactly what I propose. The only difference is that there are two types of appropriate checking configuration: one is the familiar specifier/adjoined configuration; the other is adjacency. Full Interpretation holds at both interfaces:

(5) Let m be a partial function from syntactic features to elements of the morphological component, and let F be a set of syntactic features {F1, . . . , Fn, G1, . . . , Gn, . . .} in a syntactic structure, where Fi matches Fi+1; then,

(6) Where Fi ∈ F is in a checking relation with Fj ∈ F:
a. If Fi is an uninterpretable feature, then m(Fi) is dependent on m(Fj).
b. If Fi is an interpretable feature, then m(Fi) = [[Fi]].

(7) Where Fi ∈ F is not in a checking relation with any Fj ∈ F:
a. If Fi is an uninterpretable feature, then m(Fi) is undefined.
b. If Fi is an interpretable feature, then m(Fi) = [[Fi]].

Elements of the morphological component are morphological words. I shall return to exactly how lexical insertion proceeds in this system. Matching features are defined as above. However, the notion of checking relation here is extended. In addition to the standard checking configuration, we will also say that two features are in a checking configuration if the nodes on which those features are specified are in an adjacency relation, and we will define adjacency in purely tree-geometric terms:

(8)
x and y are in an adjacency relation if
a. either the first right-branching node dominating x is the first left-branching node dominating y (dominance to be understood reflexively), or
b. x = y
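Definition (8) can be given an operational rendering: x's lowest ancestor with material to x's right must be y's lowest ancestor with material to y's left. The following sketch is my own, not the chapter's; the `Node` class and function names are hypothetical, and the trees are simplified binary structures.

```python
# Illustrative sketch only: a toy constituent tree with parent pointers.
class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self

def _edge_ancestor(node, rightward):
    """Climb from node (dominance taken reflexively) to the first
    ancestor with material on the given side of node's path: the
    'first right-branching node dominating x' (rightward=True) or the
    'first left-branching node dominating y' (rightward=False)."""
    current = node
    while current.parent is not None:
        siblings = current.parent.children
        i = siblings.index(current)
        if (rightward and i < len(siblings) - 1) or (not rightward and i > 0):
            return current.parent
        current = current.parent
    return None  # node sits on the edge of the whole tree

def adjacent(x, y):
    """x is adjacent to y, per (8): x = y, or the first right-branching
    node dominating x is the first left-branching node dominating y."""
    if x is y:
        return True
    rx = _edge_ancestor(x, rightward=True)
    ly = _edge_ancestor(y, rightward=False)
    return rx is not None and rx is ly

# [TP T [VP DP [V' V Obj]]]: T is adjacent to its sister VP and to the
# specifier DP of its sister, but not to the more deeply embedded V.
T, DP, V, Obj = Node("T"), Node("DP"), Node("V"), Node("Obj")
Vbar = Node("V'", [V, Obj])
VP = Node("VP", [DP, Vbar])
TP = Node("TP", [T, VP])

assert adjacent(T, VP) and adjacent(T, DP)
assert not adjacent(T, V)
assert adjacent(DP, Vbar)  # sisters are trivially adjacent

# Adjoining material to VP introduces branching structure between T and
# DP, disrupting their adjacency.
T2, DP2, V2, Obj2, Adv = Node("T"), Node("DP"), Node("V"), Node("Obj"), Node("Adv")
VP2 = Node("VP", [DP2, Node("V'", [V2, Obj2])])
TP2 = Node("TP", [T2, Node("VP", [Adv, VP2])])
assert not adjacent(T2, DP2)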
Direction of branching here is to be understood relative to the node from which the adjacency relation is calculated. The definition of adjacency here simply says that one node in a tree is adjacent to another only if there is no branching structure in between, and that a node is adjacent to itself.

The system works in a similar way to the LF branch of the grammar. A morphological feature that is uninterpretable will attract/move in order to be checked by a matching interpretable feature. The resulting configuration will allow the M-uninterpretable feature to be interpreted by the morphology as dependent on the interpretation of the M-interpretable feature. Exactly how this dependency is realized will depend upon the theory of morphology: perhaps by associating the former with a particular morphological form dependent on properties of the latter; perhaps via a morphophonological process linking the two; perhaps through some other morphological signal. The intuition is that chunks of morphology show up to express checked syntactic features.

Note that any movement that takes place to check a morphologically uninterpretable feature will not necessarily "extend" the phrase marker, since the resulting structure simply needs to satisfy the adjacency configuration. This seems to be the correct result for head movement, which never extends the phrase marker, but it will also allow movement within a phrase marker as long as this sets up an adjacency relation. A further consequence of the system is that if a feature is M-uninterpretable, then there are two basic configurations in which it might be checked: specifier of/adjoined to its checker, or adjacent to its checker. In many cases these may overlap (since a specifier may be adjacent to its head, and a head adjoined to another head will necessarily be adjacent to it). We will address some of the consequences of this below.

Many syntactic features have both LF- and M-interpretations: Tense, for example, or Mood. Categorial features have LF-interpretations and zero morphological interpretations (at least in English). Note that a zero interpretation is an interpretation and satisfies FI; only unchecked uninterpretable features violate FI. The feature involved in the trace-licensing ability of certain complementizers (what used to be called their status as proper head governors: que/qui in French; ∅/that in English, etc.) has a clear M-interpretation but a zero LF-interpretation. The case-assigning feature on a head has zero LF- and M-interpretations, since case assigners do not vary in morphological form depending on the case they assign (that is, we do not have prepositions varying in form depending on whether they assign accusative or dative), nor is there a particular semantics associated with particular case-assigning heads. Case assignees, however, do vary in form depending on the case they are assigned.
Let us say, then, that Case is an uninterpretable feature to the morphology and that, to satisfy FI, it needs to be checked. When Case is checked, an interpretation can be assigned to it (as a particular morphological form determined by the case assigner). When Case is not checked, no interpretation can be assigned, and the resulting representation violates FI. There are two ways of checking a case feature: it can be checked in a specifier-head configuration, or in an adjacency configuration. In both cases, the movement involved in constructing the checking configuration must be overt, or else the case feature will be unchecked at the point of morphological interpretation. This, then, is the notion of "strength" in this system: M-uninterpretability.

Let us turn now to the question of lexical insertion: we stated above that the morphological interface associates "words," that is, maximal morphological units in the guise of phonological strings, with the units delivered to it by the syntax. Words are morphological exponents of certain syntactic configurations. The standard assumption is that the relevant configuration is the X° syntactic unit, an assumption that leads to problems with clitics, and so on. In LF, the units that are
Feature Checking, Adjacency and VSO
85
relevant to the interface are defined by movement chains and c-command. Let us assume then, in a parallel fashion, that at the morphological interface the adjacency relation defines the relevant syntactic configuration for lexical insertion. Now, since a node is always adjacent to itself, single nodes in the tree may be associated with "words." The phrase-structural status of these nodes is irrelevant, as long as there is a morphological interpretation for the featural information contained in the node. However, the system also predicts that, where there is a morphological interpretation, two adjacent nodes may be associated with a single maximal morphological unit. This system then allows a range of syntax/morphology mismatches but makes the prediction that these should always be constrained by adjacency. Note that this brings the current system very close to Marantz's Morphological Merger (Marantz, 1984, 1988).

2.3. Adjacency and Morphosyntactic Mismatches

Clearly, the notion of Adjacency defined here is a species of locality and, in fact, brings to mind the concept of head-government. It is, however, not exactly the same as the latter. An adjacency relation will hold not only trivially between sisters, and between a head and the specifier of its sister, but also recursively between a head and the specifier of the specifier of its sister, and so on. Moreover, if A is to be adjacent to B, then no element can be adjoined to B, as this will introduce further branching structure. In this section I will tease out a few of the implications of hypothesizing that adjacency relations are involved at the morphological interface.

In Scottish Gaelic certain prepositions appear to agree in definiteness with their complement DP. Thus we have:

(9) le peann vs. leis a' pheann
    with pen      with-def the pen
(10) dha thaigh vs. dhan an taigh
     to a house      to-def the house
Here we see the preposition changes morphological form depending on whether or not its complement has a definite article. However, these prepositions do not agree with other DPs, which one would usually classify as definite: proper names, simple demonstratives, DPs with prenominal possessors, and, most interestingly, construct state nominals: (11)
ri(*s) Dhaibhidh / seo / mo bhrathair / aite mo ghaoil
to(*def) David / this / my brother / the place of my love
Most of these constructions are analyzed as involving an abstract D[+def] in the syntactic representation, and it is curious that the preposition is sensitive only to the overt definite article. Moreover, prepositional definiteness agreement takes
place even when the following DP is not a complement of the preposition, but is more deeply embedded:

(12) Dh'fheuch mi ri [teine a chur air doigh]
     Tried I to fire Agr put-VN on well
     'I tried to set a fire up.'
(13)
Dh'fheuch mi ris [an teine a chur air doigh]
Tried I to-def the fire Agr put-VN on well
'I tried to set the fire up.'
Here an teine 'the fire' is the complement of the verbal noun chur 'put,' rather than a complement of the preposition ri. In fact, there seems to be no grammatical relationship between the preposition and the DP: the DP receives case from Agr and is the complement of the embedded predicate, yet the preposition agrees in definiteness with it. One might attribute this to a head-government type relationship, or perhaps movement of abstract features to an agreement node dominating PP, but one would have to show why the relevant features are not available on constructs, and so on. The simplest generalization seems to be that a preposition is definite when it precedes and is adjacent to the definite article. In terms of the framework outlined in the previous section, the [+def] feature is an M-uninterpretable feature that is optionally added to a preposition. It may receive an interpretation only when checked by a following adjacent definite article, and that interpretation is as -s or -n, depending on the preposition's morphological specification.

Adopting this view makes an immediate prediction. If we adjoin an element to the complement of the preposition in (14), or to the DP itself, then the preposition should show up in its nonagreeing form. It is possible to left-adjoin the focus particle eadhon 'even' to DPs or to the Verbal-Noun constituent, giving the ambiguous sentence (small caps represent focus):

(14) Dh'fheuch e ri(*s) eadhon am ban-righ a mharbhadh
     Tried he to(*def) even the queen Agr kill-VN
     'He tried to kill even the QUEEN.'
     'He tried even to KILL the queen.'

In both cases no definiteness agreement is permitted on the preposition. A further prediction of the adjacency analysis is that movement of the definite DP should affect the appearance of the definiteness agreement on the preposition:

(15)
Se an duine a bha an seo an-de a dh'fheuch mi ri(*s) mharbhadh
It's the man that was here yesterday that tried I to(*def) kill-VN
'It was the man that was here yesterday that I tried to kill.'
A movement analysis of these facts would require stipulations to stop movement over eadhon, and would require one to extrinsically order definiteness agreement after clefting, whereas the adjacency analysis predicts these facts straightforwardly.

There are many other areas where adjacency seems to be important in the grammar: assignment of Case (Stowell, 1981); cliticization (Marantz, 1988); perhaps noun-incorporation (Bok-Bennema and Groos, 1988); preposition-determiner contraction (van Riemsdijk, 1995); verb movement (Haeberli, 1995); do-support (Bobaljik, 1995). I think that the framework outlined here will have some contribution to make in at least some of these cases. For example, the subject and finite verb must be strictly adjacent in French, a fact that can be tied to the way that case is checked in French. Because case is checked via an adjacency relation, and since adjacency relations are potential insertion domains for lexical items, French subject pronouns should behave like clitics, which they do. English subject pronouns, on the other hand, have their case checked in a spec-head relation. Adjacency is not required, and these pronouns should behave like independent words. A similar argument can be made for the clitic-like nature of English object pronouns. In general, the framework presented here ties together clitic-like behavior of pronouns with adjacency phenomena.

In the remainder of this chapter, however, I would like to draw out some consequences of the present framework for the analysis of VSO clause structure.
3. SUBJECT POSITIONS IN IRISH AND SCOTTISH GAELIC

3.1. Preliminaries of VSO Clause Structure

As mentioned in the introduction, Modern Irish (MI) and Scottish Gaelic (SG) are VSO languages with V raising in finite clauses to the topmost functional projection below C (let us assume for concreteness that this is T; nothing will hinge on this assumption), giving examples like:

(16) Bhuail Daibhidh an cat (SG)
     Strike-past David the cat
     'David hit the cat.'

When T is occupied by an auxiliary, or is nonfinite, we find SVO order, with a nominalized version of the verb and the object in genitive case:3

(17) Bha Daibhidh a' bualadh a' chait (SG)
     Be-past David SIMP strike-VN the cat-Gen
     'David was hitting the cat.'
In certain nonfinite constructions we find that the object shifts preverbally:

(18) Feumaidh Daibhidh an leabhar a sgriobhadh (SG)
     Must-pres David the book-Dir Agr write-VN
     'David must write the book.'
This has been analyzed as case-driven movement to Spec, AgrO (Adger, 1996a; Duffield, 1990; Bobaljik and Carnie, 1996; Noonan, 1995, a.o.), or to a position involved in aspectual interpretation (Ramchand, 1993, 1997; Guilfoyle, 1994). I will adopt the position that this object shift takes place in finite clauses as well; see Adger (1997), Bobaljik and Carnie (1996), and Ramchand (1997) for arguments that this is the correct analysis. For concreteness, I will assume that the object DP raises into the inner specifier of a light verb whose outer specifier hosts the subject (see Adger, 1996b, and Guilfoyle, 1994, for motivation that the base position of subjects is the specifier of a light verb in Irish and Scottish Gaelic). Nothing will turn on the precise position of the object, but since the object is not adjacent to its case assigner, it must be in a checking configuration. This presents us with the following clause structure for an example like (19):
3.2. McCloskey's (1996b) Analysis of the Subject Position

We turn now to the position of the subject: assuming that the subject DP has case features that need to be checked, and that T has case features that it needs to assign, just how is the subject licensed? On the face of it there is a simple answer: the subject simply procrastinates until LF, whereupon its formal features raise, adjoin to T, and the necessary checking takes place. We then have a very simple theory of VSO clause structure.
Unfortunately, there are some complications that will require us to say something rather different. The complications come from some intriguing data from Munster Irish recently discussed in McCloskey (1996b). McCloskey examines a class of verbs that take a single argument in the language and shows that the single argument of these verbs may be licensed in one of two ways: either the argument surfaces as a PP in complement position (20), or it surfaces as a DP in subject position (21). He terms the former salient unaccusative constructions, and the latter putative unaccusative constructions:

(20) Neartaigh ar a ghlor (salient)
     Strengthened on his voice
     'His voice strengthened.'

(21) Neartaigh a ghlor (putative)
     Strengthened his voice
     'His voice strengthened.'
McCloskey shows rather convincingly that the PP argument in (20) is not in subject position, whereas the DP argument in (21) is. I will briefly review two of the arguments he puts forward to show the strength of the case. First, there is a generalization that we have already hinted at governing the relationship between finite and nonfinite clauses: the subject of a finite clause appears preverbally in the corresponding nonfinite clause, whereas the complement of a finite clause may appear postverbally in the corresponding nonfinite clause, depending on the construction. Thus, corresponding to the finite (22a), we have the perception verb complement clause in (22b).

(22) a. Bhuail iad an cat (SG)
        Strike-past they the cat
        'They hit the cat.'
     b. Chonnaic mi iad a' bualadh a' chait (SG)
        Saw-past I them SIMP strike-VN the cat-Gen
        'I saw them striking the cat.'
This generalization holds of all dialects, as far as I am aware. Now consider what happens when we examine salient and putative unaccusative constructions in nonfinite clauses.

(23) a. B'fhada ag cailliuint ar a mhisneach
        Cop-long asp lessen-VN on his strength
        'His strength weakened for a long time.'
     b. *B'fhada ar a mhisneach ag cailliuint
        Cop-long on his strength asp lessen-VN
        'His strength weakened for a long time.'
(24) B'fhada a mhisneach ag cailliuint
     Cop-long his strength asp lessen-VN
(25) *B'fhada ag cailliuint a mhisneach
(23a) shows that the PP argument surfaces in postverbal position, and (23b) that it is not the structural subject of the construction; (24) shows that the DP argument, on the other hand, behaves like the subject, appearing in preverbal position. This is all the more curious in that it may not remain in its (presumably base-generated) complement position (25). Recall that DP complements may generally occur in postverbal position in a nonfinite clause (cf. 22b). The question is why (25) is not legitimate.

McCloskey's second argument is to much the same effect. Irish allows a construction whereby a VP-like constituent is fronted. Crucially, this process cannot front the subject along with the VP: (26)
is ag togail tithe a bhi siad
Cop asp build-VN houses-gen that were they
'It was building the houses that they were.'
(27) *is na daoine ag imeacht a bhi
Cop the men asp leave-VN that were
'It was the men leaving that was happening.'
Unsurprisingly, the PP argument of a salient unaccusative behaves as though it were part of the VP, while the DP argument of the putative unaccusative behaves as though it were the subject: (28)
is ag eiri ar an leanbh a bhi
Cop asp raise-VN on the child that was
'The child was growing up.'
(29) is ag eiri a bhi an leanbh
Cop asp raise-VN that was the child
'The child was growing up.'
The generalization that McCloskey draws from this data is that the PP cannot raise but DP must raise into subject position. He draws attention to the systematic absence of a null expletive transmitting case, as has been argued for Italian, and other languages where objects of unaccusatives remain in situ (see, for example, Burzio, 1986): (30)
*V proi DPi
McCloskey points out that this evidence weighs against the EPP applying in Irish. If it did, then we might expect (30) to be a possible structure, and (25) to be
grammatical. Moreover, if there were an EPP, then there should be an expletive subject in the cases with the salient unaccusatives, but this expletive would have to associate with a PP. McCloskey argues that this is theoretically implausible. If there is no EPP, however, how do we force the DP argument to raise into subject position? The most obvious solution would be to say that there is a licensing requirement on the DP that forces it to raise into a position where it can be licensed. However, note that under the clause structure proposed in (19), the structural subject position is not a position where the subject is licensed. The subject appears there because that is its base-generated position, and it only raises to check case features at LF. If this is the case, then why does the DP subject of a putative unaccusative have to raise?

McCloskey's position is that there is a null agreement head between T and VP, and that this head has a strong D feature and checks nominative case. Moreover, this head, Agr[check D, assign Nom], is drawn optionally from the lexicon. If there is a DP in the numeration, then this must raise to [Spec, AgrP] to check the strong D feature of Agr, its nominative case being checked as a free rider; if, however, the numeration contains a P, then the derivation will crash, since P will check the case feature of the DP, and Agr will be left with uninterpretable case features. Note that since Agr may be optionally drawn from the lexicon, if there is no Agr in the numeration, then any DP that enters the derivation will not have its nominative case checked, again causing the derivation to crash. If there is a P in the numeration, but no Agr, then the case features of the DP can be checked by P and the derivation will converge.
Now while this analysis cannot be faulted on empirical grounds, I think that there are four reasons why one might want to reject it: first, it enriches the functional structure of the clause by positing an optional functional head that has no interface motivation; a second, related, point is that Agr only appears in order to force "subject" shift—this is really just a restipulation of the basic properties of the construction and has no independent empirical consequences at all; third, Agr must be the case assigner, rather than T (since a derivation where Agr is omitted from the numeration but the single argument is a DP would converge with LF raising of the formal features of DP to check case with T), which goes against the standard idea that v checks accusative case while T checks nominative. Finally, if Agr is optional but appears below T, then we must have a way of fixing its position in the functional spine of the clause (why does it not occur above T?). If this is done by selection from T, then T must optionally select for Agr or vP purely to allow this construction, which does not seem at all elegant. I would like to suggest that the more obvious, and conservative, position is correct. The subject position in Irish and Scottish Gaelic is a licensing position for DPs, but this position should be defined as the position right adjacent to T. The single DP argument of a putative unaccusative raises into this position so that its case can be checked with T via adjacency.
3.3. Morphologically Motivated Movement

The system of assumptions set up in section 2 provides, of course, a simple way out of McCloskey's puzzle. We have assumed that case is an M-uninterpretable feature and must be checked. There are two ways to check Case: movement into a spec-head checking relation with an M-interpretable feature, and movement into an adjacency relation with an M-interpretable feature. Assuming that accusative case cannot be checked by an unaccusative verb (perhaps because of the absence of "little v"), the only possible way to check the case feature of the single argument DP is with the case-assigning feature of T. Again we have two possibilities: move into an adjacency relation with T, or move into the checking domain of T [these are not, of course, exclusive, since a specifier may be adjacent to a head under the definition provided in (8)]. Movement is necessary in the former case with the single DP argument of unaccusative verbs because the base position of this DP is not adjacent to T. No movement would be necessary for the subject of an unergative or a transitive, whose case features are checked in situ. The data show that the movement of the subject of an unaccusative takes place into a right-adjacent position to T; that is, the DP raises and adjoins to VP. One reason why movement to [Spec, TP] is out may be that adjunction to VP is shorter (crosses fewer intervening nodes). A more intriguing possibility is that VSO languages like SG and MI simply lack specifiers of functional heads. In this case there is no [Spec, TP] position to move into. This conjecture also predicts that we never find filled specifiers of DP (giving rise to at least some of the properties of the construct state constructions found in Celtic), and that wh-movement cannot transit through [Spec, CP].
This latter prediction may seem somewhat off base, but Adger and Ramchand (1997) argue that this is exactly the case, and that wh-movement in SG and MI involves chaining together copular constructions. In this analysis, wh-movement only ever takes place to the specifier of the copular verb, rather than to the specifier of CP. I will tentatively assume that this position is correct, and that the subject does not raise to [Spec, TP] simply because there isn't one. We have, then, the following structure:
The DP must raise to check its M-uninterpretable features. If the single argument is a PP, then the M-uninterpretable features of the DP within this PP are checked
in situ. The PP itself does not need to raise, and so won't. The case features of T receive a zero interpretation by the morphology, as usual. McCloskey's position that there is no EPP in Irish can be maintained without recourse to an optional functional agreement head that has no interface motivation. I take the fact that this movement operation does not extend the phrase marker to simply show that such a constraint is not viable (as head movement and LF-movement also show). That the final position of the subject is an adjoined position (or perhaps an outer specifier position) will in general be irrelevant to its interpretation: what matters is that its case features are checked.

3.4. Consequences of the Proposal

3.4.1. ADJACENCY CONSTRAINTS

There is a well-known adjacency effect in operation in MI and in SG that prohibits any element intervening between the complementizer and the verb, and between the verb and the subject. The following examples are from Ulster Irish (from Duffield, 1995):

(32) a. *go ar ndoigh bhfaca Maire an fear
        that of course saw Mary the man
        '... that Mary, of course, saw the man.'
     b. *Chuala ar ndoigh me an t-amhran sin
        heard of course I the song that
        'I, of course, heard that song.'

Such an adjacency effect might be expected if adverbials in Irish were constrained so that they only right-adjoin to projections.4 However, this cannot be correct, since we find examples like (33):

(33) Mu dheireidh, thainig sinn an sin (SG)
     At last arrive-past we there
     'We arrived there at last.'
If the subject is licensed in an adjacency configuration with T, such data is immediately explained. The same argument can be made with parenthetical expressions, which also cannot intervene between the finite verb and the subject. The following data from SG illustrate: (34)
Dh'fhag Daibhidh, tha mi cinnteach, an-de
left-past David, be-pres I sure, yesterday
'David, I'm sure, left yesterday.'
(35)
*Dh'fhag, tha mi cinnteach, Daibhidh an-de
left-past, be-pres I sure, David yesterday
'David, I'm sure, left yesterday.'
(34) shows that a parenthetical expression may appear between the subject and a time adverbial; (35) shows that it may not appear between the verb and the subject. Assuming that parentheticals disrupt the adjacency relation vital to checking the case feature of the subject, these data follow immediately. That the proposed analysis makes even this one prediction is an improvement over the optional agreement head analysis, which treats the positioning of the subject as a completely independent factor in clausal structure.

3.4.2. AGREEMENT PHENOMENA

Recall that the view of the morphology-syntax interface proposed here predicts that whenever an adjacency relation is set up, if both nodes involved are zero-level projections, the morphology can interpret them as a single morphological unit. This section shows that we find evidence that this is exactly how the morphology of SG treats verb + subject, where the subject is pronominal. In an enlightening discussion of agreement effects in Hebrew and Irish, Doron (1988) argues that pronouns are morphologically analyzed as part of the immediately preceding verb. This is something that is expected within the current framework, since the pronoun has its case features checked by the preceding finite T. Since the T complex and the pronoun are adjacent zero-level elements, lexical insertion may take place, giving a single morphological interpretation for the adjacent syntactic nodes, effectively "merging" the nodes. Agreement is not actually agreement, but just the morphological interpretation of the verb and a following pronoun. This proposal has the effect that if the lexicon contains an idiosyncratic form for the merged element, then this form will not occur with an independent pronoun. This gives rise to the well-known observation that "strong" agreement5 in Irish cannot occur with an independent pronominal (see, for example, McCloskey and Hale, 1984, who provide a different explanation). I illustrate here with examples from SG:

(36) a.
Sgriobhainn an leabhar
Write-cond-1s the book
'I would write the book.'
b. Sgriobhamaid an leabhar
Write-cond-1p the book
'We would write the book.'
c. Sgriobhadh tu an leabhar
Write-cond you the book
'You would write the book.'

Here (a) and (b) are cases where the pronoun has merged with the verb and the lexicon has supplied an apparently idiosyncratic form. In (c) the pronoun has also merged, but the form is predictable from the two inputs. In this particular paradigm of the verb, only the first-person forms have an idiosyncratic morphological shape:

d. Sgriobhadh e/i/sibh/sinn/iad an leabhar
Write-cond he/she/you (pl)/we/they the book
'He/she/you/we/they would write the book.'

Note that, since the pronoun is actually syntactically present, we will never find the "strong" form with an independent pronoun, since this would lead to a theta-criterion violation:

(37) a. *Sgriobhainn mi an leabhar
        Write-cond-1s I the book
        'I would write the book.'
     b. *Sgriobhamaid sinn an leabhar
        Write-cond-1p we the book
        'We would write the book.'

The advantage of a morphological account, rather than a syntactic incorporation account, is that the "strong" forms are found with coordinated subjects, and are here subject to the same constraints (coordinated pronouns in SG and MI must be augmented in all contexts with an emphatic marker, so there is an independent reason why we find the markers fhin and fhein in the example below):

(38) Sgriobhainn fhin is thu-fhein an leabhar seo
     Write-cond-1s Emph and you-Emph the book this
     'You and I would write this book.'

Heads in MI and SG appear to be subject to the Coordinate Structure Constraint (see Adger, 1994, chapter 3), so it would be a mystery, under a syntactic incorporation account, why this is possible with subjects. Another alternative, proposed by McCloskey and Hale (1984), is that there is an S-Structure filter on the licensing of pro, such that it may only occur when identified by agreement, and that there is a separate condition, unique to MI, which prohibits a full pronoun from being governed by agreement. I think that this idea is actually very close to what is going on in agreement in PPs and DPs in these languages, but that what we see with subjects is rather different.
One striking difference between the cases is that agreement in PPs and DPs (and agreement with fronted objects in nonfinite clauses) varies very little across the dialects of MI and SG, whereas agreement with subjects partakes in just the paradigmatic messiness we associate with morphological rather than syntactic processes. Thus, "strong" agreement with subjects in SG is limited to just the first-person conditional cases discussed above, whereas "strong" agreement with prepositions, in possessive constructions, and in fronted object constructions occurs with all persons, numbers, and genders. For the latter three constructions this holds rather generally of all dialects of SG and MI, whereas there is a significant amount of dialectal variation with subjects (see McCloskey and Hale, 1984, for a survey, and also the discussion in chapter 3 of Adger, 1994). Moreover, there are syntactic differences between these two cases of "agreement." Consider fronted objects in nonfinite clauses:

(39) a. Feumaidh mi na leabhraichean a sgriobhadh
Must-pres I the books Agr write-VN
'I must write the books.'

b. Feumaidh mi an sgriobhadh
Must-pres I Agr-3pl write-VN
'I must write them.'

In (39a) the object of the nonfinite verb is fronted to a position before the verb and is followed by a particle. There is a general consensus that such a particle is agreement in MI and SG (see, for example, Adger, 1996a; Duffield, 1990; Duffield, 1995; Bobaljik and Carnie, 1996; Ramchand, 1997, among others6), and it seems to be governed by something very like McCloskey and Hale's (1984) filter. If there is a full DP, then we find a form of the particle neutralized for phi-features (39a), while if we have a null pronominal then we find a strong form, inflecting for person, number, and gender features (39b). The full paradigm is available in the case of "strong" agreement, unlike what typically happens with subjects. Note that if we cleft the object, then we find the "strong" form once again:

(40) a. 's e na leabhraichean sin nach feumaidh sinn an leughadh
Cleft-pres the books Dem that-not must we Agr-pl read
'It's these books that we must not read.'

b. 's e iadsan nach feumaidh sinn an leughadh
Cleft-pres them that-not must we Agr-pl read
'It's them that we must not read.'

This is in contrast to what happens when we cleft a pronominal subject:

(41) a.
'B e mise a thubhairt thu a sgriobhadh an leabhar
Cleft-past I-emph that said you write-cond the book
'It was me that you said would write the book.'

b. *'B e mise a thubhairt thu a sgriobhinn an leabhar
Cleft-past I-emph that said you that write-cond-1s the book
In this case we apparently find the "weak" form of the verb. This is unexpected on a McCloskey/Hale type account, but exactly what we predict if the "strong" form is a result of morphologically merging the verb and pronoun: in the cleft there is only a null element to be merged, resulting in the base conditional form. The view of agreement in Celtic proposed here is that what has been traditionally viewed as a unitary phenomenon is actually the result of two different processes with partly overlapping results. The "strong" forms of verbs that apparently agree with their subjects are actually the result of a process whereby the morphology interprets two adjacent zero-level nodes as a single morphological unit. Since this is a morphological process, we find some lexical variation in the forms that are spelled out, and this variation appears to be conditioned by paradigmatic, rather than syntactic, factors. So, as far as subjects go, we find varying "strong" forms of the verb with apparently no overt pronoun, "weak" forms of the verb with overt pronouns, and "weak" forms also with overt full DPs. In contrast, the "strong" forms of prepositions, possessive pronouns (which I have not discussed here), and object agreement particles occur with pro, whereas the "weak" forms occur with full (nonpronominal) DPs. We never find the weak forms of these elements with overt pronominals.7

3.4.3. MORPHOPHONOLOGICAL PHENOMENA
One final type of evidence that backs up the kind of system proposed here again comes from the pronominal system. Here we see morphophonological phenomena taking place between the verb and a following subject, suggesting that the morphology treats them as a single unit. In SG the second-person singular pronoun is thu, pronounced as /u/ if it is an object. If it is a subject, then it occurs as thu, /u/, in some parts of the paradigm, but as tu, /tu/, in others. This does not seem to have a phonological conditioning, but rather a morphological one. Thus, the passive conditional and the active conditional both terminate in -(e)adh, but the passive infixes -t- between the root and the inflection. Examples might be bhuailteadh thu, 'you would be struck,' but bhuaileadh tu, 'you would strike.' In the passive conditional the pronoun is pronounced /u/, while in the active it is pronounced /tu/ (see above). The conditioning rule appears to be morphological, rather than phonological, in nature, since the immediate phonological environments of the pronouns are the same. The same kind of argument may be made with the future of irregular verbs. Certain irregular verbs end in a high front vowel (e.g., Chi thu, 'you will see') and have /u/. The future of regular verbs also ends in a high front vowel, but gives /tu/ (e.g., Buailidh tu, 'you will strike'). There appears to be no phonological or phonetic difference between the two high vowels; again, the conditioning of the pronoun is most simply stated morphologically. This is a puzzling fact if the V and pronoun are two separate morphological entities, but expected on the account here, since the V+pronoun complex is a single morphological interpretation of two separate syntactic nodes.

4. CONCLUSIONS

This chapter has proposed that the interpretability of a syntactic feature by the interface systems is its paramount property. Checking relations are simply ways to get matching features into local enough configurations that an uninterpretable feature can receive an interpretation via an interpretable one. Given this view, it is natural to consider adjacency to be a local configuration for the checking of morphological features. We defined adjacency as a relation that holds between two nodes in the absence of intervening branching structure. Note that it may be possible to relax what counts as branching structure, with empirical consequences [so perhaps phonologically empty elements do not count; or perhaps adjuncts, since they do not create a separate node but only a segment, also do not count, allowing us to resurrect Chomsky's (1957) analysis of do-support (see Bobaljik, 1995)]. As a general theory of adjacency effects, what is proposed here is in its infancy. Having put this framework on a sound conceptual basis, we then examined the positioning of subjects in Goidelic Celtic VSO structures and argued that subjects were positioned immediately subjacent to the verb because that was the configuration in which nominative case was checked. This view provides a natural explanation of adjacency effects between the tensed verb and the subject, and also accounts for the behavior of subject agreement in MI and SG. Further work is needed to determine whether this type of account can be extended to Brythonic Celtic and Semitic VSO orders.
NOTES

1 This paper has undergone rather radical revision since first presented. Many people have helped shape its current form, but I thank especially Bob Borsley, Liz MacCoy, Bernadette Plunkett, Gillian Ramchand, and Georges Tsoulas. For help with the data, I thank the staff and students of Sabhal Mor Ostaig, Isle of Skye, especially Shona Caimbeul, Morag Chreig, Seona NicFhionghainn, Chaomhin O Donnaile, Christine Primrose, Deirdre Ni Chaomhanaigh, and Mark Wringe. This work was supported by a University of York IPRFC grant. Usual disclaimers apply.

2 Note that Chomsky's extra stipulation that movement targets the root node cannot be correct (since LF movement and head movement do not). The reason that XP movement always extends the structure is that XP movement is movement to a checking position, and these are defined in Chomsky's system as specifier/adjoined positions; the specifier of a head or an adjunct to XP are necessarily "extensions" of the projection of head plus complement. In a system where checking configurations are independently defined, it is redundant (where it is correct) to stipulate that movement always targets the root node. The extension requirement on transformations (so far as it holds) is, then, derivative of the specification of checking relations.

3 This simplifies the situation somewhat. The case of the object in these constructions is variable and depends on grammatical, geographical, and sociological factors such as definiteness, dialect, and register.

4 There are exceptions to the constraint discussed here. Chung and McCloskey (1987) give an example where an initial auxiliary is separated from its subject by a speaker-oriented adverb, and Doherty (1996) provides an example where a full finite verb is separated. All speakers I have consulted judge such examples as severely ungrammatical.
5 The traditional term for the "strong" form is synthetic, while the "weak" forms are termed analytic.

6 There is an interesting question here as to what the relationship between finite and nonfinite clauses is: if the object has shifted in both, why is it that the agreement particle only appears in the nonfinite case? The answer to this question, I think, is that in MI and SG, T is not projected in nonfinite clauses and therefore v cannot be projected, giving rise to an essentially nominal structure. The only way to license the object is via an agreement projection, probably associated with aspect (see Adger, 1996c, for evidence that T is not projected in MI and SG and the consequences this has for nonfinite clause structure).

7 Actually, in many modern dialects of Irish the object agreement particle appears with an overt pronoun in its specifier. This is only possible in SG when the pronoun is augmented by an emphatic particle. It may be that the particle in these dialects of Irish is simply a head with interpretable case features, but no agreement features.
REFERENCES

Adger, D. (1994). Functional heads and interpretation. Ph.D. thesis, University of Edinburgh.
Adger, D. (1995). Meaning and movement. In R. Aranovich et al. (Eds.), Proceedings of WCCFL 13 (451-466). Stanford: CSLI Publications.
Adger, D. (1996a). Agreement, aspect and measure phrases in Scottish Gaelic. In R. Borsley and I. Roberts (Eds.), The syntax of the Celtic languages: A comparative perspective (200-222). Cambridge: Cambridge University Press.
Adger, D. (1996b). Subjects and finiteness in Irish and Scottish Gaelic. Unpublished manuscript, University of York, England.
Adger, D. (1996c). Economy and optionality: Interpretations of subjects in Italian. Probus, 8, 117-135.
Adger, D. (1997). VSO order and weak pronouns in Goidelic Celtic. Canadian Journal of Linguistics, 42, 9-30.
Adger, D., Plunkett, B., Tsoulas, G., and Pintzuk, S. (1999). Specifiers in generative grammar. In D. Adger, S. Pintzuk, B. Plunkett, and G. Tsoulas (Eds.), Specifiers: Minimalist perspectives (1-22). Oxford: Oxford University Press.
Adger, D., and Ramchand, G. (1997). Copular clauses, relative clauses and movement. Presentation at the 2nd Celtic Linguistics Conference, UCD, Dublin.
Bobaljik, J. (1995). Morphosyntax: The syntax of verbal inflection. Ph.D. thesis, MIT, Cambridge, MA.
Bobaljik, J., and Carnie, A. (1996). A Minimalist approach to some problems of Irish word order. In R. Borsley and I. Roberts (Eds.), The syntax of the Celtic languages: A comparative perspective (223-240). Cambridge: Cambridge University Press.
Bok-Bennema, R., and Groos, A. (1988). Adjacency and incorporation. In M. Everaert et al. (Eds.), Morphology and modularity (33-56). Dordrecht: Foris.
Burzio, L. (1986). Italian syntax. Dordrecht: Reidel.
Chomsky, N. (1957). Syntactic structures. The Hague: Mouton.
Chomsky, N. (1993). A minimalist program for linguistic theory. In K. Hale and S. J. Keyser (Eds.), The view from Building 20 (1-52). Cambridge, MA: MIT Press.
Chomsky, N. (1995). The minimalist program. Cambridge, MA: MIT Press.
Chung, S., and McCloskey, J. (1987). Government, barriers, and small clauses in Modern Irish. Linguistic Inquiry, 18, 173-238.
Doherty, C. (1996). Clausal structure and the Modern Irish copula. Natural Language and Linguistic Theory, 14, 1-46.
Doron, E. (1988). On the complementarity of subject and subject-verb agreement. In M. Barlow and C. Ferguson (Eds.), Agreement in natural language: Approaches, theory, description (201-218). Stanford, CA: CSLI Publications.
Duffield, N. (1990). Movement and mutation in Modern Irish. In T. Green and S. Uziel (Eds.), MIT working papers in linguistics (Vol. 12, 31-45). Cambridge, MA: MIT Press.
Duffield, N. (1995). Particles and projections in Irish syntax. Dordrecht: Kluwer.
Guilfoyle, E. (1994). Verbal nouns, finiteness and external arguments. In Proceedings of NELS 24 (141-155). Amherst, MA: GLSA.
Haeberli, E. (1995). Adjuncts in pre-subject position. In M. Starke, E. Haeberli, and C. Laenzlinger (Eds.), GenGenP 3.2 (13-46). University of Geneva.
Halle, M., and Marantz, A. (1993). Distributed morphology and pieces of inflection. In K. Hale and S. Keyser (Eds.), The view from Building 20 (111-176). Cambridge, MA: MIT Press.
Heim, I., and Kratzer, A. (1998). Semantics in generative grammar. London: Blackwell.
Marantz, A. (1984). On the nature of grammatical relations. Cambridge, MA: MIT Press.
Marantz, A. (1988). Clitics, morphological merger and the mapping to phonological structure. In M. Hammond and M. Noonan (Eds.), Theoretical morphology (253-270). New York: Academic Press.
McCloskey, J. (1983). A VP in a VSO language? In G. Gazdar, E. Klein, and G. Pullum (Eds.), Order, concord, and constituency (9-55). Dordrecht: Foris.
McCloskey, J. (1991). Clause structure, ellipsis and proper government in Irish. Lingua, 85, 259-302.
McCloskey, J. (1996a). The scope of verb movement in Irish. Natural Language and Linguistic Theory, 14, 47-104.
McCloskey, J. (1996b). Subjects and subject positions. In R. Borsley and I. Roberts (Eds.), The syntax of the Celtic languages: A comparative perspective (241-283). Cambridge: Cambridge University Press.
McCloskey, J., and Hale, K. (1984). On the syntax of person-number inflection in Modern Irish. Natural Language and Linguistic Theory, 1, 487-533.
Meinunger, A. (1995). Discourse dependent DP-(dis)placement. Ph.D. thesis, University of Potsdam.
Noonan, M. (1995). VP internal and VP external AgrOP. In R. Aranovich et al. (Eds.), Proceedings of WCCFL 13 (318-333). Stanford: CSLI Publications.
Ramchand, G. (1993). Aspect Phrase in Modern Scottish Gaelic. Proceedings of NELS, 23, 415-429.
Ramchand, G. (1997). Aspect and predicational structure: Evidence from Scottish Gaelic. Oxford, UK: Oxford University Press.
Riemsdijk, H. van. (1995). Head movement and adjacency. Unpublished manuscript, University of Tilburg.
Runner, J. (1995). Noun phrase licensing and interpretation. Ph.D. thesis, University of Massachusetts, Amherst.
Stowell, T. (1981). The origins of phrase structure. Ph.D. thesis, MIT, Cambridge, MA.
MIXED EXTENDED PROJECTIONS

ROBERT D. BORSLEY
Linguistics Department
University of Wales
Bangor, Wales

JAKLIN KORNFILT
Syracuse University
Syracuse, New York
1. INTRODUCTION

A notable feature of many languages is the presence of constructions that are basically clausal but also have certain nominal properties. For example, in English, the poss-ing construction is exemplified by the bracketed string in (1), in which the subject has the possessive marking characteristic of possessors within a nominal phrase.

(1) I heard about [his leaving early].

In Turkish, there are nominalized clauses, in which verbs have the same agreement and case morphology as nouns, and there is also a genitive (i.e., possessive) subject, illustrated in (2):

(2) Hasan [uşağ-ın oda-yı temizle-diğ-in-i] söyle-di. (Turkish)
Hasan servant-GEN room-ACC clean-FACT-3SG-ACC say-PAST
'Hasan said that the servant had cleaned the room.'
Syntax and Semantics, Volume 32 The Nature and Function of Syntactic Categories
Copyright © 2000 by Academic Press All rights of reproduction in any form reserved. 0092-4563/99 $30
In Polish, as (3) illustrates, subordinate clauses can be introduced by what looks like a determiner.

(3) Jan oznajmił [to, że Maria zmienia pracę]. (Polish)
Jan announced that COMP Maria is-changing job
'Jan announced that Mary is changing her job.'

In this chapter, we will develop an analysis of these constructions within a version of Principles and Parameters theory (P&P), which, following Grimshaw (1991), recognizes some functional categories as inherently verbal and others as inherently nominal. We will suggest that all such constructions involve what Grimshaw calls a "mixed extended projection," a structure in which a verb is associated with one or more nominal functional categories. We will try to show that this proposal explains both which nominal properties occur and which do not. The chapter is organized as follows. In section 2, we introduce our main proposal. Then, in section 3, we look at a variety of relevant constructions. In section 4, we consider whether there is any alternative to our analyses within P&P assumptions. Next, in section 5, we argue that our proposal correctly predicts that certain phenomena do not occur. In section 6, we will look briefly at some other analyses of some of the phenomena with which we are concerned, and in section 7, we consider a question that arises from our proposal. Finally, in section 8, we summarize the chapter.
2. A PROPOSAL

An important feature of work in P&P beginning with Chomsky (1986) is the distinction between lexical categories like N (noun), V (verb), and A (adjective), and functional categories like C (complementizer), I (inflection), and Det (determiner). Grimshaw (1991) argues that some functional categories, notably C and I, are inherently verbal, whereas others, notably Det, are inherently nominal.2 Building on this work, we want to propose the following:

(4) Clausal constructions with nominal properties are a consequence of the association of a verb with one or more nominal functional categories instead of or in addition to the normal verbal functional categories, appearing above any verbal functional categories.
In other words, we propose that they involve structures of the following form:
(5) [Tree diagram not reproduced: one or more nominal functional projections dominate one or more verbal functional projections, which in turn dominate a VP containing the subject and the object.]
Here, we have a verb phrase (VP) containing both the subject and the object of the verb, following Fukui and Speas (1986), Koopman and Sportiche (1991), and others, and above it first a number of verbal functional categories and then a number of nominal functional categories. We assume that the number of verbal functional categories could be zero, in which case we just have nominal functional categories. We assume that the nominal functional categories might include not just D but also a nominal agreement category (AgrN). We also assume that the verbal categories might include not I but AgrS (verbal subject agreement) and/or T (tense). Within this approach, constructions will differ in what functional categories they contain and in what movement processes apply within them.

Grimshaw in fact proposes that there are no "mixed extended projections," in other words that nominal functional categories cannot be associated with a verbal projection (or verbal functional categories with a nominal projection). If we are right about the analysis of the data that we are concerned with here, this restriction must be relaxed. It may hold in some languages, but not in many others. This conclusion has been reached independently by Zaring and Hirschbühler (1997), who propose a parameter distinguishing languages that do and do not allow mixed extended projections. We will not discuss whether this is the right approach. It is worth noting, however, that there is an alternative for someone who assumes Optimality Theory (OT), as Grimshaw (1997) does. Within OT, one could assume that there is a no-mixed-extended-projections constraint that can be violated in particular languages to satisfy some higher-ranked constraint. An important point about our analysis is that it predicts that not all combinations of nominal and verbal properties are possible in clausal constructions.
It predicts that such constructions may contain those nominal features that are associated with nominal functional categories but not those nominal features associated directly with N or NP. It also predicts that there will be no nominal properties that reflect a nominal functional category located below a verbal functional category. We will argue in section 5 that both these predictions are correct.
3. SOME CONSTRUCTIONS

In this section, we look at a variety of clausal constructions with nominal properties, and argue that they are expected within our proposed analysis. The constructions can be classified according to how much verbal functional structure they possess. We look first at cases with no or few verbal functional categories. We begin with the English poss-ing construction. As noted earlier, the subject has the possessive marking characteristic of possessors within a nominal phrase. Otherwise, it has a verbal internal structure. The object has the same form as with a finite verb and we have adverbs and not adjectives. The following illustrates these features:

(6)
[John's criticizing the book repeatedly] was annoying.
The poss-ing construction contrasts with NPs containing a derived nominal.3 Here, an object is preceded by a preposition and we have adjectives. Thus, (6) contrasts with (7): (7)
[John's repeated criticism of the book] was annoying.
The poss-ing construction also appears in canonical nominal positions, from which that-clauses are excluded, such as prepositional object position, postauxiliary subject position, and subordinate subject position. The examples in (8) contrast with those in (9).

(8) a. We talked about [John's criticizing the book].
b. Is [John's criticizing the book] well known?
c. I think that [John's criticizing the book] is well known.

(9) a. *We talked about that John did it.
b. *Is that John did it well known?
c. *I think that that John did it is well known.

Following in essence Abney (1987), we analyze the poss-ing construction as follows:
(10) [Tree diagram not reproduced: a DP whose specifier is the possessive subject and whose empty D head takes a VP complement containing the verb and its object.]
This analysis predicts the main properties of the construction. First, it predicts that subjects have possessive marking since they are in the specifier position of a nominal functional category, and that objects have the same form as with finite verbs, since they are ordinary verbal objects. Second, it predicts that we have adverbs and not adjectives if we assume that adverbs modify VPs while adjectives modify NPs. Alternatively, if we assume with Cinque (1994) that adjectives occupy the specifier position of certain nominal functional categories, we can assume that these functional categories are absent from the poss-ing construction.4 Finally, the analysis predicts that the construction appears in canonical nominal positions. (10) contains no verbal functional categories. If we adopted Chomsky's (1993) view that objective case is licensed in the specifier position of AgrO, we would have one verbal functional category. As far as we are aware, there is no evidence for any more verbal functional structure in this construction.5 One point that is worth noting here is that it is quite easy within this analysis to account for the fact that determiners do not appear in this construction [i.e., the fact that we do not get examples like (11)].

(11)
*We discussed [the/this/that criticizing the book].
We can simply assume that overt determiners, unlike the empty determiner that takes a DP specifier, do not allow a VP complement. There is one further point that we should note here. As Yoon (1996, fn. 23) points out, the poss-ing construction does not allow an expletive or an idiom chunk as its subject. He gives the following examples:

(12) a. *there's being a spy in the closet
b. *it's being obvious that Bill is a spy
c. *the cat's being out of the bag

Yoon suggests that such examples support an analysis in which the subject is base-generated in Spec DP controlling a PRO subject in Spec VP. It seems to us,
however, that there are objections to such an analysis. First, the subject of the poss-ing construction appears to have the same theta role as the subject of a parallel finite clause. This fact will be unexplained if it originates in a different position. Second, it is generally assumed that PRO and NP-trace appear in different positions. Hence if NP-trace appears in Spec VP, one would not expect PRO to appear there as well. It seems to us, then, that something like the analysis in (10) is preferable. We leave the ungrammaticality of the examples in (12) as a problem for future research.6 Rather like the English poss-ing construction are Turkish nominalized clauses. (2), which contains what is known as a factive nominalization, illustrates this. The following, which contains a so-called action nominalization, provides a further example:

(13) Hasan [uşağ-ın oda-yı dikkat-li-ce temizle-me-sin-i] söyle-di. (Turkish)
Hasan servant-GEN room-ACC care-with-ADV clean-ACT-3SG-ACC say-PAST
'Hasan said that the servant should clean the room carefully.'

As in the English construction, the subject has the same marking (genitive case) as a possessor in an NP. The following shows that possessors have this case:

(14)
uşağ-ın oda-sı (Turkish)
servant-GEN room-3SG
'the servant's room'
Also as in English, the object has the same marking (accusative case) as the object of a finite verb. The following shows that the object of a finite verb has this case: (15)
Uşak oda-yı temizle-di. (Turkish)
servant room-ACC clean-PAST
'The servant cleaned the room.'
Nominalized clauses also appear in canonical nominal positions. One such position is postpositional object position. The following show that a simple nominal phrase and a nominalized clause, but not a nonnominalized clause, can appear in this position:

(16)
[[Uşağ-ın hastalığ-ın-dan] dolayı] misafir-ler aç kal-dı. (Turkish)
servant-GEN illness-3SG-ABL because guest-PL hungry stay-PAST
'Because of the servant's illness, the guests remained hungry.'
(17)
[[Uşağ-ın hastalan-ma-sın-dan] dolayı] misafir-ler aç kal-dı. (Turkish)
servant-GEN fall ill-ACT-3SG-ABL because guest-PL hungry stay-PAST
'Because the servant got sick, the guests remained hungry.'

(18) *[[Uşak hastalan-dı(-dan)] dolayı] misafir-ler aç kal-dı. (Turkish)
servant fall ill-PAST(-ABL) because guest-PL hungry stay-PAST
Notice that (18) is ungrammatical with or without the case marking that the postposition requires. In (16)-(18), a postposition takes the ablative case. The following show that this is exactly the same situation with a postposition that takes the dative case. (19)
Ev uşağ-ın dönüş-ün-e kadar toz-lu kal-acak. (Turkish)
house servant-GEN return-3SG-DAT until dust-with stay-FUT
'The house will stay dusty until the servant's return.'
(20)
Ev [[uşağ-ın iyileş-me-sin-e] kadar] toz-lu kal-acak. (Turkish)
house servant-GEN recover-ACT-3SG-DAT until dust-with stay-FUT
'The house will stay dusty until the servant recovers.'
(21)
*Ev [[uşak iyileş-ecek(-e)] kadar] toz-lu kal-acak. (Turkish)
house servant recover-FUT(-DAT) until dust-with stay-FUT
Another canonical nominal position is the preverbal position. There can be a simple nominal phrase and a nominalized clause in this position, but not normally a nonnominalized clause. We have data such as the following:7

(22) Ben [Hasan-ın hikâye-sin-i] duy-du-m. (Turkish)
I Hasan-GEN story-3SG-ACC hear-PAST-1SG
'I have heard Hasan's story.'

(23)
Ben [siz-in tatil-e çık-tığ-ınız-ı] duy-du-m. (Turkish)
I you-GEN vacation-DAT go out-FACT-2PL-ACC hear-PAST-1SG
'I heard that you had left for vacation.'
(24)
*Ben [siz tatil-e çık-tı-nız] duy-du-m. (Turkish)
I you vacation-DAT go out-PAST-2PL hear-PAST-1SG
'I heard that you had left for vacation.'
Turkish nominalized clauses show certain additional nominal properties. First, the verbs show the same agreement morphology as in NPs. Third-person singular verbal agreement is null, as illustrated by the root clauses in (2) and (13). Third-person singular nominal agreement is -(s)i(n), and this appears in the subordinate clauses in (2) and (13). Second, the verbs are case-marked. There are accusative-marked verbs in the Turkish examples in (13) and (23), an ablative-marked verb in the Turkish example in (17), and a dative-marked verb in the Turkish example in (20). Turkish nominalized clauses also differ from the English poss-ing construction in having a limited manifestation of tense. In particular, factive nominalizations can be either nonfuture, as in (2), or future, as in the following:

(25) Ben [siz-in tatil-e çık-acağ-ınız-ı] duy-du-m. (Turkish)
I you-GEN vacation-DAT go out-FACT.FUT-2PL-ACC hear-PAST-1SG
'I heard that you will leave for vacation.'

Thus, nominalized clauses are somewhat more verbal than the poss-ing construction. Following Kornfilt (1984), we propose that the nominal agreement morphology in nominalized clauses is a nominal functional category, AgrN. If we assume that the elements we have glossed as FACT and ACT are the realization of a nominal mood (MN) category and that case here is the realization of another nominal functional category, which following Lamontagne and Travis (1987) we label K, we can propose the following structure for the subordinate clause in (2):
(26) [Tree diagram not reproduced: a KP dominating AgrNP, dominating MNP, dominating VP; V raises through MN and AgrN to K.]
We assume that V combines with MN, AgrN, and K through head-movement. In other words, we assume that word formation may be syntactic. We will return briefly to this assumption in section 6. This analysis predicts that subjects of nominalized clauses have genitive case like possessors, whereas objects have accusative case like objects of finite verbs. It also predicts that they contain adverbs and not adjectives. Dikkat-li-ce in (13) is an adverb. The related adjective is dikkat-li. Finally, it predicts that nominalized clauses appear in canonical nominal positions. We turn now to a construction that arguably has a little more verbal functional structure. This is the masdar (verb-noun) clause of the Caucasian language Tabasaran. Here, subjects of an intransitive masdar have genitive case, but the masdar shows the same agreement as a finite verb. A relevant example is (27):

(27)
[da§i-jin r-ub-az] kiliyurajcha". (Tabasaran) father-GEN CL1-come-DAT expect-1PL 'We expected that father should come.'
(28) shows that possessors are genitive, and (29) shows that we have the same agreement marking with a finite verb: (28)
da§i-jin gaf-ar (Tabasaran) father-GEN word-PL 'father's words'
(29)
ermi t'i-r-xnuw. (Tabasaran) man.ABS PreV-CL1-fly.PAST 'The man flew.'
We can analyze the masdar clause in (27) as involving something like the following structure:
Here, the subject has moved to Spec AgrSP triggering agreement and then to Spec DP for case-marking. We assume that Tabasaran is head-final. We are taking no stand on the internal structure of AgrS', but we assume it contains a VP within
110
Robert D. Borsley and Jaklin Kornfilt
which both the subject and the verb originate. We also assume that V combines with AgrS and K through head-movement. There is no doubt much more to be said about case and agreement in Tabasaran masdar clauses, but it seems clear that examples like (27) pose no problems for our approach.8 We turn next to examples with a full set of verbal functional categories, but also a nominal functional category. First we consider cases in which the nominal functional category is a determiner. Combinations of a determiner and a clause either nonfinite or finite occur in a variety of languages. We will look first at examples where the clause is nonfinite. We begin with a Spanish construction highlighted in Plann (1981). We have examples like the following: (31) No acepto {el susurrar palabras obscenas}. (Spanish) NEG accept-1SG the whisper words obscene 'I do not accept the whispering of obscene words.' Here, the definite article el is immediately followed by an infinitive. The normal use of the definite article with a noun is illustrated by the following: (32)
el libro (Spanish) the book 'the book'
As we would expect, objects take the same form in this construction as with finite verbs. We also have adverbs and not adjectives. (33) No acepto [el susurrar palabras obscenas NEG accept-1SG the whisper words obscene constantemente] (Spanish) constantly 'I do not accept the constant whispering of obscene words.' In contrast, in nominal phrases containing a derived nominal (which is formally identical to the infinitive), we have objects marked with a preposition and adjectives. (34) No acepto [el constante susurrar de palabras NEG accept-1SG the constant whisper of words obscenas]. (Spanish) obscene 'I do not accept the constant whispering of obscene words.'
A further point to note is that we also have examples with an overt postverbal subject. The following illustrate: (35)
No acepto [el susurrar Maria palabras obscenas NEG accept-1SG the whisper Maria words obscene constantemente]. (Spanish) constantly 'I do not accept Maria's constant whispering of obscene words.'
(36) No puedo aceptar el haber rechazado tú esa NEG can accept the have rejected you-NOM this propuesta. (Spanish) proposal 'I cannot accept that you rejected this proposal.' (36) shows that we have a nominative subject. Given standard P&P assumptions, the obvious assumption is that examples like these involve a determiner with a CP complement, so that (31) has something like the following structure:
Examples like (35) and (36) raise various analytic questions: for example, where is the verb located? why does it precede the subject? why is the subject nominative? We will not try to address these questions. All that matters for us is that we seem to have a definite determiner with a CP complement here. Rather similar to the Spanish construction are so-called nominalized clauses in Basque. We have examples like the following: (38)
[Jon-ek bere hitzak hain ozenki es-te-a-n] denok harritu Jon-ERG his words so loudly say-NR-DET-INESS all surprise ginen. (Basque) AUX 'We were all surprised at John saying his words so loudly.'
(39)
[Zu-k etxea prezio honetan hain errazki sal-tze-a-re-kin] you-ERG house price that-in so easily sell-NR-DET-GEN-WITH ni-k ez dut ezer irabazten. (Basque) I-ERG NEG AUX anything win 'I don't get anything out of your selling the house so easily.'
NR here stands for nominalizing suffix. The presence of this suffix distinguishes the Basque construction from the Spanish construction, and so does word order, but otherwise the constructions seem quite similar. The suffix -a is the definite article. Its use with a noun is illustrated by the following: (40)
liburu-a (Basque) book-the 'the book'
We can propose something like the following structure for the subordinate clause in (38):
The verb will presumably raise through I, NR, and D to K. The analysis predicts the main properties of the construction. In particular, it predicts that all dependents of the verb have their normal case-marking and that the verb is modified by adverbs and not adjectives. Ozenki in (38) and errazki in (39) are both adverbs. The related adjectives are ozen and erraz. We turn now to examples where the clause is finite. We will consider examples from Polish, Greek, and Georgian. One point to note immediately is that Det + finite clause sequences are not necessarily constituents. We have this sequence in English examples like the following if we assume with Abney (1987) that pronouns are determiners:
(42)
They resented it that he was invited.
The evidence suggests, however, that we have two separate constituents here. The it + finite clause sequence cannot be fronted or appear as a sentence fragment. (43)
*It that he was invited, they resented.
(44)
What did they resent? *It that he was invited.
The situation is different with Det + finite clause sequences in various other languages. We can look first at Polish, where we have examples like the following: (45)
Jan oznajmił [to, że Maria zmienia pracę]. Jan announced that COMP Maria is-changing job 'Jan announced that Mary is changing her job.'
(Polish)
Here, we have a determiner + clause sequence that can be fronted and appear as the answer to a question, as the following illustrate: (46)
[To, że Maria zmienia pracę] Jan oznajmił. (Polish) that COMP Maria is-changing job Jan announced 'Jan announced that Mary is changing her job.'
(47) a. Co Jan oznajmił? (Polish) b. To, że Maria zmienia pracę. that COMP Maria is-changing job 'Jan announced that Mary is changing her job.' It is fairly clear, then, that we have a constituent here, and the obvious assumption is that it is a DP in which the D has a CP complement.9 In (45), the determiner introduces an indicative clause introduced by że. It can also introduce a subjunctive clause introduced by żeby, and various kinds of interrogative clauses. The following illustrate: (48)
Jan żądał [(tego), żeby Maria zmieniła pracę]. Jan demanded that COMP Maria changed job 'Jan demanded that Maria change her job.'
(Polish)
(49)
Jan zastanawiał się nad tym, czy kupić nowy samochód. Jan wondered over that whether buy new car 'Jan wondered whether to buy a new car.'
(50)
Jan zastanawiał się nad tym, kiedy kupić nowy samochód. Jan wondered over that when buy new car 'Jan wondered when to buy a new car.'
(Polish)
(Polish)
The determiner is optional with complements of verbs. Thus, as well as (45), we have (51). (51)
Jan oznajmił, [że Maria zmienia pracę]. Jan announced COMP Maria is-changing job 'Jan announced that Mary is changing her job.'
(Polish)
The determiner is normally obligatory with complements of prepositions, as the following illustrate: (52)
Jan wiedział o [*(tym), że Maria wyjechała]. Jan knew about that COMP Maria left 'Jan knew that Maria had left.'
(53)
Jan marzył o [*(tym), żeby Maria wróciła]. Jan dreamt about that COMP Maria returned 'Jan dreamt that Maria had returned.'
(Polish)
(Polish)
Again, then, we have a construction that can appear in a canonical nominal position from which ordinary clauses are generally excluded. For (45), we will have something like the following structure:
We can turn now to Greek. Here, we have examples like the following, discussed in Roussou (1991): (55)
dhen amfisvito [to oti efighe]. (Greek) NEG dispute-1SG the-ACC that left-3SG 'I do not dispute the fact that he left.'
Again, we have a determiner + clause sequence that can be fronted and appear as the answer to a question: (56)
[to oti efighe] dhen to amfisvito. (Greek) the-ACC that left-3SG NEG it-ACC dispute-1SG 'I do not dispute the fact that he left.'
(57) a. ti se stenoxori? what you-ACC upset-3SG 'What upsets you?'
(Greek)
b. to oti efighe. the-ACC that left-3SG 'The fact that he left.' As in Polish, the determiner can introduce what might be regarded as subjunctive clauses and interrogative clauses.10 (58)
[to na ise politikos] dhen egkrino. (Greek) the-NOM PRT be-2SG politician NEG approve-1SG 'I don't approve of you being a politician.'
(59)
[to an tha fighi] dhen gnorizo. (Greek) the-NOM if will leave-3SG NEG know-1SG 'I don't know whether he will leave.'
Also, as in Polish, the determiner is generally obligatory with complements of prepositions. (60) apo [*(to) oti etreme] (Greek) from the-ACC that was shaking-3SG 'from the fact that he was shaking' We assume that the Greek construction involves essentially the same structure as the Polish construction. Finally, we can consider Georgian, where we have examples like the following: (61) vanom gvian gaigo [(is), rom ninom Vano.ERG late 3.3.find.out.AOR it.NOM that Nino.ERG dac'era c'erili]. (Georgian) 3.3.write.AOR letter.NOM 'Vano found out late that Nino had written the letter.' As in Polish and Greek, the determiner + clause sequence can be fronted and appear as the answer to a question: (62)
[is, rom ninom dac'era c'erili] vanom it.NOM that Nino.ERG 3.3.write.AOR letter.NOM Vano.ERG gvian gaigo. (Georgian) late 3.3.find.out.AOR 'Vano found out late that Nino had written the letter.'
(63) a. vanom ra gaigo? (Georgian) vano.ERG what 3.3.find.out.AOR 'What did Vano find out?'
b. is, rom ninom dac'era c'erili. it.NOM that Nino.ERG 3.3.write.AOR letter.NOM 'that Nino had written the letter' We assume, then, that we have the same structure here as in Polish and Greek. We turn now to a rather different construction which appears to involve a full set of verbal functional categories but also a nominal functional category. This is a clause type containing a participle found in the Caucasian language, Kabardian. The participle has absolutive case within a direct object, and ergative case within a subject. The following illustrate: (64)
[a-be tXel psens"ew zer-i-txe-nu-r] he-ERG book quickly PART-SBJ3SG-write-FUT-ABS z-je- ?-a-s'. (Kabardian) PreV-SUBJ3SG-say-PAST-ASSERT 'He said that he will write the book quickly.'
(65)
[jeq°'e-r zere-semaze-m] ane-r POSS3SG-son-ABS PTC-be.ill-ERG mother-ABS je-ye-q°ezave. (Kabardian) SBJ3SG-CAUS-worry.PRS 'It worries mother that her son is ill.'
This is rather like the situation in Turkish nominalized clauses. Unlike the Turkish construction, however, all arguments have ordinary verbal case-marking. The subject of a transitive participle is marked by the ergative case, as (64) illustrates, and the subject of the intransitive participle is absolutive, as (65) illustrates. The following related root clauses show that this is the ordinary verbal case-marking. (66)
a-be txele-r psens''ew je-txe-nu-s'. (Kabardian) he.ERG book.ABS quickly SUBJ3SG-write-FUT-ASSERT 'He will write a book quickly.'
(67)
q°'e-r f-semaze-s'. (Kabardian) son-ABS SUBJ3SG-be.ill.PRS-ASSERT 'The son is ill.'
Note also that we have adverbs and not adjectives in this construction. The adjective related to psens''ew in (64) is psens''. The suffix -w marks adverbs. We would suggest that this construction involves a nominal functional category K with a CP or an IP as its complement but no raising of an argument to the specifier position of a nominal functional head (as in English and Turkish) and hence no nominal case-marking. In other words, we would suggest something like the following structure for the subordinate clause in (64):
We assume that the verb will raise through C to K to pick up the case-morphology.
4. SOME ALTERNATIVE APPROACHES

As we noted in section 2, Grimshaw (1991) proposes that there are no mixed extended projections, so that nominal functional categories cannot be associated with a verbal projection. It is important, therefore, to ask whether there is any way within P&P assumptions of reconciling the data that we have been discussing with the assumption that there are no mixed extended projections. We will consider two approaches and argue that neither is viable. The first approach is one suggested in Grimshaw (1991) in connection with degree words like so. Such words combine with both adjectives and adverbs. The combination of degree word and adjective behaves like an adjective, whereas the combination of degree word and adverb behaves like an adverb. The following illustrate: (69) He was so quick/*quickly. (70)
He ran so quickly/*quick.
If one assumes that degree words are members of a functional category Deg, one will have something like the following structures here:
Given such structures, however, it is not clear why so quick and so quickly should differ in their distribution, since both are DegPs. Grimshaw proposes that Deg is unspecified for whatever feature distinguishes adverbs from adjectives. Within her
framework, this entails that Deg becomes adjectival when it combines with an AP and adverbial when it combines with an AdvP. Thus, we have two different types of DegP here, and the difference in distribution is no problem. Building on this idea, one might propose that apparent nominal functional categories appearing in a clausal construction are in fact unspecified for whatever features distinguish between nouns and verbs. This will allow such categories to combine with a noun and become nominal or to combine with a verb and become verbal. If this approach were viable, we would not in fact have mixed extended projections here. There is, however, a rather obvious objection to such an account. This is that it cannot account for the appearance of these constructions in canonical nominal positions. It cannot account for the English examples in (8), the Turkish examples in (17), (20), and (23), the Polish examples in (52) and (53), and the Greek example in (60). It is fairly clear, then, that this is not a viable way of avoiding a mixed extended projection analysis.11 The second approach that one might suggest is one in which what look like mixed extended projections are in fact two pure extended projections. More precisely, it is one in which there is an empty noun between the verbal functional categories and the nominal functional categories. On this approach, instead of structures of the form in (5), we would have structures of the following form:
The obvious question here is, is there any evidence for an empty noun? In the absence of any evidence, this approach is really just an ad hoc way of avoiding a mixed extended projection. In the case of the poss-ing construction, there is a more serious objection to such an analysis. We noted in section 3 that the modern construction involves
adverbs and not adjectives. This point is emphasized in Pullum (1991), who points to the ungrammaticality of the following: (73) a. *his kind walking me home b. *your predictable not agreeing with what I said As he notes, such examples would be expected if the poss-ing construction contains a nominal element. They will be expected if adjectives are adjoined to some projection of N. They will also be expected if adjectives appear in the specifier position of some nominal functional category, since the functional categories will presumably be available in a structure containing an empty N. Thus, the fact that such examples are ungrammatical provides an important objection to the analysis in (72). There is a further objection to this analysis if poss-ing subjects originate in Spec VP, as we have suggested. If they do, the analysis will involve extraction from the complement of a noun. As Kayne (1981) pointed out, this is not generally possible. Thus, while the sentences in (74) and (75) are grammatical, the bracketed NPs in (76) and (77) are unacceptable. (74)
John appeared to be drunk.
(75)
John continued to snore.
(76)
* [John's appearance to be drunk] surprised us.
(77)
* [John's continuation to snore] annoyed everyone.
Here, then, we may have a second objection to the idea that the poss-ing construction involves an empty N. Some of the constructions that we have looked at here involve what might be called a nominalizing morpheme, a morpheme that identifies verb-forms that can appear in the construction. Both the Turkish and the Basque constructions involve such morphemes. One might suggest that these morphemes are the realization of underlying nouns. We will not discuss the Basque morpheme, but we will argue that such an analysis is untenable for the Turkish morphemes. The Turkish morphemes differ from nouns both semantically and syntactically. Semantically, they are unlike nouns in conveying mood distinctions and also, in the case of factive nominalizations, tense distinctions. Syntactically, they are unlike nouns in taking an obligatory VP complement. Nouns do not usually have obligatory complements, and when they allow a complement it is usually an NP or a full clause. (78) illustrates a noun with a nominalized complement, and (79) illustrates a noun with a finite clausal complement. (78)
[sen-in ev-in-den kaç-tığ-ın] söylenti-si you-GEN home-2SG-ABL escape-MN-2SG rumour-3SG 'the rumor that you ran away from home'
(79)
[ben-i kimse sev-mi-yor] korku-su I-ACC nobody love-NEG-PRES.PROGR fear-3SG 'the fear that nobody loves me'
Thus, the idea that the Turkish morphemes are underlying nouns is not very plausible. It seems, then, that there is little alternative within Grimshaw's version of P&P to an analysis in which these constructions involve mixed extended projections.
5. SOME IMPOSSIBLE STRUCTURES

As we indicated in section 2, our proposal does not allow just any combination of nominal and verbal properties in clausal constructions. It predicts that clausal constructions may only have nominal properties that are associated with nominal functional categories and not nominal properties that are associated directly with N or NP. It also predicts that there are no nominal properties that reflect a nominal functional category located below a verbal functional category. We will now argue that both these predictions are correct. One nominal property that we assume to be directly associated with N is what we can call nominal object marking, that is, objects marked in the same way as objects in nominal phrases (e.g., with a dummy preposition or genitive case). The former is illustrated by the English example in (80) and the latter by the Polish example in (81). (80)
the Romans' destruction of the city
(81)
teoria względności Einsteina (Polish) theory relativity-GEN Einstein-GEN 'Einstein's theory of relativity'
We predict, then, that this property will not appear in clausal constructions. Particularly important in the present context are the English poss-ing construction and the Turkish nominalized clauses. In both, the objects have the same form as they have in finite main clauses. Thus, we have nominal subject marking but ordinary verbal object marking. Noonan (1985:60) notes that it is quite common to have nominal subject marking but verbal object marking. He suggests, however, that nominal object marking is found with so-called verbal-nouns in Irish. If this were the correct interpretation of the data, Irish would constitute a counterexample to our prediction. We will argue, however, that this is not the correct interpretation, and hence that Irish is not a counterexample. It is the Irish progressive construction that is of interest here.12 It is exemplified by the following:
(82)
Tá Cathal [ag moladh an phictiúir]. (Irish) is Cathal [PROG praise the picture-GEN] 'Cathal is praising the picture.'
The important point about this example is that the object has genitive case. This contrasts with the object in a finite main clause, as the following illustrates: (83)
Mhol Cathal an pictiúr. (Irish) praised Cathal the picture-ACC 'Cathal praised the picture.'
On the face of it, however, it resembles the object of a nominal phrase. (84) pictiúr an chapaill (Irish) picture the horse-GEN 'the picture of the horse' One might try to suggest that examples like (82) are irrelevant because they contain a derived nominal. However, the fact that we have adverbs and not adjectives in this construction argues against this. Consider the following: (85)
Tá sé ag ceol *(go) binn. (Irish) is he PROG sing PRT pleasant 'He is singing pleasantly.'
In Irish, adverbs are distinguished from adjectives by the preceding particle go. Here, this particle is obligatory. Hence, we can only have an adverb. It looks, then, as if there is a real problem here. In fact, however, there is no problem because objects in nominal phrases do not have genitive case. This becomes clear as soon as we consider a nominal phrase containing both a possessor or subject and an object. Here we have (86a) and not (86b). (86) a. pictiúr Chathail den chapall picture Cathal-GEN of-the horse 'Cathal's picture of the horse' b. *pictiúr Chathail chapaill picture Cathal-GEN horse-GEN
(Irish)
In other words, the object is marked with a dummy preposition and not with genitive case. We suggest that (84) is essentially a passive nominal, which might be better translated as 'the horse's picture.' Thus, we do not have nominal object marking in examples like (82). Hence, they do not constitute a counterexample to our prediction. We can turn now to the prediction that clausal constructions will not have
nominal properties that stem from a nominal functional category below a verbal functional category. We have stressed above that a variety of languages allow a DP above CP. We assume that no language allows the reverse: a CP above DP. The English poss-ing construction is of interest here. It can never be introduced by a complementizer or by a wh-phrase. We do not have examples like the following: (87)
*We discussed [whether John's criticizing the book].
(88)
*We discussed [which book John's criticizing].
It is fairly clear, then, that the construction does not have a CP above the DP. It seems equally clear that the Turkish nominalized clauses that we considered earlier do not involve a CP above the nominal functional category, AgrNP. Turkish has one element that appears to be a realization of C, namely ki. The following show that ki can appear with a nonnominalized clause but not with a nominalized clause containing a factive nominalization: (89)
Ben duy-du-m ki [siz tatil-e çık-tı-nız]. (Turkish) I hear-PAST-1SG COMP you vacation-DAT go out-PAST-2PL 'I heard that you had left for your vacation.'
(90)
*Ben duy-du-m ki [siz-in tatil-e I hear-PAST-1SG COMP you-GEN vacation-DAT çık-tığ-ınız](-ı). (Turkish) go out-FACT-2PL-ACC
Notice that (90) is ungrammatical with or without the accusative case. Ki is equally impossible with a nominalized subordinate clause containing an action nominalization. Thus, we have the following contrast: (91)
Ben kork-uyor-um ki [siz tatil-e I fear-PRES.PROG-1SG COMP you vacation-DAT çık-acak-sınız]. (Turkish) go out-FUT-2PL 'I am afraid that you shall leave for your vacation.'
(92)
*Ben kork-uyor-um ki [siz-in tatil-e I fear-PRES.PROG-1SG COMP you-GEN vacation-DAT çık-ma-nız](-dan). (Turkish) go out-ACT-2PL (-ABL)
(92) is ungrammatical with or without the ablative case that the verb kork 'fear' selects. Both ungrammatical examples are acceptable if ki is omitted. They simply become "scrambled" versions of the following:
(93)
Ben [siz-in tatil-e çık-tığ-ınız-ı] I you-GEN vacation-DAT go out-FACT-2PL-ACC duy-du-m. (Turkish) hear-PAST-1SG 'I heard that you had left for your vacation.'
(94)
Ben [siz-in tatil-e çık-ma-nız-dan] I you-GEN vacation-DAT go out-ACT-2PL-ABL kork-uyor-um. (Turkish) fear-PRES.PROG-1SG 'I am afraid that you shall leave for your vacation.'
These data suggest quite strongly that a CP is possible with a nonnominalized clause but not with a nominalized clause. Hence, they suggest that Turkish does not allow a CP above an AgrNP. We noted in section 3.2 that the appearance of verbal subject agreement with nominal subject marking, as in Tabasaran, is quite compatible with our assumptions, since it can be analyzed as the reflection of a Det appearing above AgrS. In contrast, the appearance of nominal subject agreement with verbal subject marking would be problematic, since it would suggest an AgrS above a Det. We are not aware of any instances of this pattern of data.
6. SOME OTHER ANALYSES

We have argued in the preceding sections that a version of P&P in which functional categories are inherently nominal or inherently verbal can provide an illuminating account of a variety of clausal constructions with certain nominal properties. In this section we will look briefly at some analyses that have been proposed for the English poss-ing construction in some other frameworks. Our main concern will be whether they can be extended to the other constructions that we have looked at. We will also look briefly at the analysis of mixed constructions developed in Bresnan (1997). A particularly interesting analysis of the poss-ing construction is proposed in Pullum (1991). Pullum assumes the Generalized Phrase Structure Grammar (GPSG) framework of Gazdar, Klein, Pullum, and Sag (1985). An important feature of this framework is that heads and their mothers are not required to share any specific features. Exploiting this, Pullum proposes that the poss-ing construction involves an NP with a VP as its head, licensed by a special immediate dominance rule. He shows how the analysis predicts the main properties of the construction. Lapointe (1993) proposes that the poss-ing construction involves what he calls dual lexical categories. These are lexical categories that project two different
phrasal categories, one immediately dominating the other. He argues that such categories permit a more principled account of the construction than Pullum's. Malouf (this volume) develops an analysis within the Head-driven Phrase Structure Grammar (HPSG) framework of Pollard and Sag (1994) in which gerunds are both nouns and verbals (a type which also includes verbs and adjectives) that are derived from verbs by a lexical rule. They take the same complements as the verbs from which they derive but also allow a possessive specifier. They are modified by adverbs because adverbs modify verbals, whereas adjectives only modify common nouns and not all nouns, but because they are nouns the poss-ing construction has the distribution of an NP. Malouf argues that this analysis is preferable to both Pullum's and Lapointe's. Although these analyses differ in various ways, there is an important similarity. They all in effect treat the poss-ing construction as a mixed simple projection, a mixed projection, that is, with a single head. Could these analyses be extended to the other constructions that we have looked at here? It seems clear that they are only applicable to constructions where it is plausible to assume that there is a single head. We have looked at constructions that contain a determiner in Spanish, Basque, Polish, Greek, and Georgian. If we assume that determiners are heads, then these constructions contain more than one head. Hence, there seems to be little alternative to a mixed extended projection analysis for such constructions. It is logically possible that it is necessary to recognize both mixed extended projections and mixed simple projections. However, it seems to us that it is preferable, other things being equal, to assume that all these constructions are mixed extended projections. Mixed extended projection analyses seem to be viable for the English poss-ing construction and for the rather similar Turkish nominalized clause construction. 
It seems to us, then, that these analyses are preferable to mixed simple projection analyses of the kind that are proposed by Pullum, Lapointe, and Malouf. We want now to look briefly at the analysis of mixed constructions developed in Bresnan (1997). Bresnan in effect shares our assumption that these constructions involve mixed extended projections. There are two main differences between her analysis and ours. We have assumed that both heads and phrases may move and that word formation may apply in the syntax. In contrast, Bresnan assumes that there are no movement processes and that all word formation is in the lexicon. It seems to us that neither of these differences is very important. We could have assumed a version of P&P such as that of Brody (1995), in which there is no movement and chains are base generated. We could also have assumed with minimalism (Chomsky, 1995) that word formation is in the lexicon and that movement is not to pick up morphology but to check features.13 Thus, it is not clear that there are any fundamental differences between the two approaches. They are perhaps best seen as developments within two different frameworks of the same basic idea.
7. A FURTHER ISSUE We have argued in the preceding sections that a variety of constructions in various languages should be analyzed as mixed extended projections, structures in which a verb is associated with one or more nominal functional categories, which give a nominal periphery to a basically verbal construction. An obvious question to ask here is whether we also find the mirror image of this situation, mixed extended projections, in which a noun is associated with one or more verbal functional categories that give a verbal periphery to a basically nominal constructions. In other words, do we find structures of the following form?
In this section, we will look briefly at this question. Structures of this form have been proposed in the literature for certain copula sentences. For example, Ouhalla (1991) proposes that the English copula is a realization of T that may combine with a variety of complements. He implicitly assumes that an example like (96) has something like the structure in (97): (96) Kim is a nuisance.
This is a fairly clear instance of the structure in (95). Thus, if this is the right analysis, we must recognize that mixed extended projections of this kind exist as well as the type that we have concentrated on in this chapter. There is, however, a serious objection to Ouhalla's analysis. A well-known fact about the English copula is that it has nonfinite forms. We have examples like the following: (98) a. Kim may be a nuisance. b. Kim is being a nuisance. c. Kim has been a nuisance recently. The conclusion that is normally drawn from such examples is that the English copula is a verb, which exceptionally moves out of VP. We see no reason to question this conclusion. Hence, examples like (96) do not instantiate the structure in (95). Of course, the fact that (97) is not the right analysis for (96) does not show that structures like (95) do not exist. It is entirely possible that they exist elsewhere. Thus, Morimoto (1998) in effect argues that a structure of this kind is found in Japanese. In the remainder of this section, we will consider what would follow if such structures do in fact exist. If structures like (95) do exist, we will have nominal functional categories taking a verbal complement and verbal functional categories taking a nominal complement. Given this, an obvious question is, how can we allow structures like (5) and structures like (95) without allowing just any combination of nominal and verbal functional categories? One possibility is that lexical categories should incorporate information about the functional categories with which they are associated. This might take the form of a list of features that must enter into a checking relation. Each feature would correspond to some functional category, and the list would determine what functional categories appear in the extended projection of the category. We could then allow a limited range of structures by imposing constraints on these lists. 
Specifically, we could stipulate that they either contain just features of the same type as the lexical category or features of the same type followed by features of one other type. This would allow pure extended projections and mixed extended projections of the form in (5) and (95), but would exclude random combinations of functional categories.
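This constraint lends itself to a simple procedural statement. The following is a minimal illustrative sketch only (the type tags 'V' and 'N' and the predicate name are assumptions made here, not part of the proposal): a checking-feature list is well formed only if it consists of features of the lexical category's own type, optionally followed by features of exactly one other type.

```python
# Illustrative sketch (not part of the proposal): a checking-feature list is
# licit if it is "pure" (all of the lexical category's type) or switches to
# exactly one other type and never switches back.

def well_formed(lex_type, feature_types):
    # Skip the initial run of same-type features.
    i = 0
    while i < len(feature_types) and feature_types[i] == lex_type:
        i += 1
    rest = feature_types[i:]
    # The remainder must consist of features of at most one other type.
    return len(set(rest)) <= 1 and lex_type not in rest

# A pure verbal extended projection:
assert well_formed('V', ['V', 'V'])
# A mixed extended projection as in (5): verbal then nominal features:
assert well_formed('V', ['V', 'N', 'N'])
# Random alternations of functional categories are excluded:
assert not well_formed('V', ['N', 'V', 'N'])
```

On this encoding, pure extended projections and mixed projections of the form in (5) and (95) are admitted, while arbitrary interleavings of nominal and verbal functional categories are ruled out.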
8. CONCLUSIONS

We have argued here that a variety of constructions in a variety of languages, which are basically clausal but also have certain nominal properties, can be analyzed within the version of Principles and Parameters theory (P&P) developed in
Grimshaw (1991) as mixed extended projections, structures in which a verb is associated with one or more nominal functional categories. We have considered constructions from English, Turkish, Tabasaran, Spanish, Basque, Polish, Greek, Georgian, Kabardian, and Irish, and we have classified the various constructions according to how much verbal functional structure they possess. Obviously, however, there is much more research to be done in this area. Relevant constructions are found in many languages, and it would not be surprising to find some that do not fit as easily into our approach as those discussed here.14 It seems to us, however, that the approach shows considerable promise.
NOTES

1. An earlier version of this chapter was presented at the International Conference on Syntactic Categories, University of Wales, Bangor, June 24-26, 1996. The chapter draws on earlier work with Karina Vamling of the University of Lund, carried out as part of the European Science Foundation's Eurotyp project. We are grateful to the following for help with the data: Ewa Jaworska (Polish), Cathair Ó Dochartaigh (Irish), Jon Ortiz de Urbina (Basque), Maria-Luisa Rivero (Spanish), Anna Roussou (Greek), Karina Vamling (Georgian, Kabardian, Tabasaran). We are also grateful to two anonymous referees for a variety of helpful comments. All errors and inadequacies are our responsibility.

2. A very similar conception of functional categories is proposed within Head-driven Phrase Structure Grammar (HPSG) by Netter (1994).

3. As Yoon (1996) emphasizes, English and a number of other languages allow the form that appears in a mixed construction to double as a derived nominal. Examples like the following illustrate this point for English:

(i)
your giving of the book to Bill
However, it is not clear how significant this is.

4. It is worth noting here that earlier forms of English allowed an attributive adjective in the poss-ing construction. Wescoat (1994:588) gives the following examples:

(i) a. the untrewe forgyng and contryvyng certayne testamentys and last wyll (Paston Letters, 15th cent.)
b. my wicked leaving my father's house (Defoe)

Given Cinque's analysis of attributive adjectives, we can assume that such examples instantiate the structure in (5). If one assumes that adjectives modify NPs, such examples will have to contain an NP and hence they will not instantiate the structure in (5).

5. However, if one assumes with Cinque (in press) that adverbs occupy the specifier position of various verbal functional categories, then these functional categories will appear in the poss-ing construction.

6. Yoon also suggests that examples like the following, containing a raising verb, are ungrammatical:
(i) John's seeming to be the right person for the job
It seems to us that there are perfectly acceptable examples with a raising verb in the poss-ing construction, for example the following:

(ii) John's appearing to be intelligent is irrelevant.

Notice also that Yoon's analysis will not rule out such examples. This is because raising verbs can appear in controlled complements, as (iii) illustrates.

(iii) John tried to appear to be intelligent.

7. For the sake of completeness, we should note that there are a few "belief" verbs that allow a preverbal nonnominal complement. (i) illustrates:

(i)
Ben [siz tatil-e çık-tı-nız] san-ıyor-du-m. (Turkish)
I you vacation-DAT go out-PAST-2PL believe-PRES.PROG-PAST-1SG
'I thought that you had left for vacation.'
It is also possible to have preverbal nonnominal complements with the "quotational" form diye 'saying,' as (ii) illustrates.

(ii) Ben [siz tatil-e çık-tı-nız] diye duy-du-m. (Turkish)
I you vacation-DAT go out-PAST-2PL saying hear-PAST-1SG
Diye can only be used with verbs of saying and belief and certain sensory perception verbs.

8. For further discussion see Magometov (1965) and Bogatyrev and Boguslavskaja (1982). Examples are quoted from these works.

9. Zaring and Hirschbühler (1997) use similar data to show that a similar sequence in Old French is a constituent.

10. For reasons that are not clear to us, these examples are only grammatical if the determiner + CP structure is topicalized.

11. Much the same point is made by Zaring and Hirschbühler (1997) in connection with Old French analogues of the Polish and Greek examples.

12. For some general discussion, see McCloskey (1983).

13. It is not clear, however, that all word formation should be handled in the lexicon. In Turkish, there is an interesting contrast between the morphology in nominalized clauses and ordinary derivational morphology, which suggests that although the latter belongs in the lexicon, the former may belong in the syntax. Consider first the following:

(i)
[Ali-nin et-i pişir-ip ye-diğ-in-i] gör-dü-m.
Ali-GEN meat-ACC cook-VBL.CONJ eat-FACT-3.SG-ACC see-PAST-1.SG
'I saw that Ali cooked and ate the meat.'

(ii) [Ali-nin et-i pişir-ip ye-me-sin-i] iste-di-m.
Ali-GEN meat-ACC cook-VBL.CONJ eat-ACT-3.SG-ACC want-PAST-1.SG
'I wanted that Ali should cook and eat the meat.'
Here we have nominalized clauses with conjoined verbal stems, and the morphology that is characteristic of such clauses appears just once. (-ip here is a verbal coordination marker.) This is not possible with ordinary derivational morphology. Turkish has a resultative suffix
used to form derived nominals, which is homophonous with the action suffix -ma. Thus, we have the derived nominals in (iii).

(iii) a. aç-ma open-RES 'pastry'
b. kavur-ma roast-RES 'roasted meat'

It is not possible to have conjoined verbal stems here.

(iv)
*aç-ıp kavur-ma open-VBL.CONJ roast-RES 'pastry and roasted meat'
Turkish also has a productive agentive suffix, -(y)ici, illustrated by the following:

(v) a. dinle-yici listen-AGENT 'listener'
b. oku-yucu read-AGENT 'reader'

Again, it is not possible to have conjoined verbal stems.

(vi)
*dinle-yip oku-yucu listen-VBL.CONJ read-AGENT 'listener and reader'
It is clear, then, that Turkish has two rather different kinds of morphology. One possibility is that one type is the product of word formation in the syntax and the other of word formation in the lexicon.

14. One construction that one might see as somewhat problematic is the English acc-ing construction, exemplified by examples like the following:

(i)
I dislike Brown painting his daughter.
As Malouf (this volume) shows, this construction has a nominal distribution, but its internal structure appears to be entirely verbal. Within the approach developed here, one might propose that such gerunds are complements of an empty D. This, however, is not a very attractive analysis.
REFERENCES

Abney, S. (1987). The English noun phrase in its sentential aspect. Ph.D. dissertation, MIT, Cambridge, MA.
Bogatyrev, K., and Boguslavskaja, O. (1982). Opredelitel'nye konstrukcii v dvux govorax tabasaranskogo jazyka. In V. Zvegincev (Ed.), Tabasaranskie etjudy. Materialy polevyx issledovanij, 15. Moskva: Izd. Moskovskogo Universiteta.
Bresnan, J. (1997). Mixed categories as head-sharing constructions. In M. Butt and T. King (Eds.), Proceedings of the First LFG Conference, Rank Xerox Research Centre, Grenoble, August 26-28, 1996. On-line, CSLI Publications: http://www-csli.stanford.edu/publications/.
Brody, M. (1995). Lexico-Logical Form: A Radically Minimalist Theory. Cambridge, MA: MIT Press.
Chomsky, N. (1986). Barriers. Cambridge, MA: MIT Press.
Chomsky, N. (1993). A minimalist program for linguistic theory. In K. Hale and S. J. Keyser (Eds.), The View from Building 20. Cambridge, MA: MIT Press.
Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: MIT Press.
Cinque, G. (1994). On the evidence for partial N-movement in the Romance DP. In G. Cinque, J. Koster, J.-Y. Pollock, L. Rizzi, and R. Zanuttini (Eds.), Paths towards Universal Grammar: Studies in Honor of Richard S. Kayne. Washington, DC: Georgetown University Press.
Cinque, G. (1999). Adverbs and Functional Heads: A Cross-linguistic Perspective. Oxford: Oxford University Press.
Fukui, N., and Speas, M. (1986). Specifiers and projection. In N. Fukui, T. Rapoport, and E. Sagey (Eds.), Papers in Theoretical Linguistics: MIT Working Papers in Linguistics 8, 128-172. Cambridge, MA: Dept. of Linguistics and Philosophy, MIT.
Gazdar, G., Klein, E., Pullum, G. K., and Sag, I. (1985). Generalized Phrase Structure Grammar. Oxford: Blackwell.
Grimshaw, J. (1991). Extended projection. Unpublished paper, Brandeis University.
Grimshaw, J. (1997). Projection, heads, and optimality. Linguistic Inquiry 28, 373-422.
Kayne, R. (1981). ECP extensions. Linguistic Inquiry 12, 93-133.
Koopman, H., and Sportiche, D. (1991). The position of subjects. Lingua 85, 211-258.
Kornfilt, J. (1984). Case Marking, Agreement and Empty Categories in Turkish. Ph.D. dissertation, Harvard University, Cambridge, MA.
Lamontagne, G., and Travis, L. (1987). The syntax of adjacency. Proceedings of WCCFL 6, 173-186.
Lapointe, S. (1993). Dual lexical categories and the syntax of mixed category phrases. In A. Kathol and M. Bernstein (Eds.), ESCOL '93, 199-210. Columbus, OH: The Ohio State University.
Magometov, A. (1965). Tabasaranskij jazyk. Tbilisi: Mecniereba.
McCloskey, J. (1983). A VP in a VSO language. In G. Gazdar, E. Klein, and G. K. Pullum (Eds.), Order, Concord and Constituency. Dordrecht: Foris.
Morimoto, Y. (1998). A lexical account of phrasal nominalization in Japanese. Unpublished paper, Stanford University, Stanford, CA.
Netter, K. (1994). Towards a theory of functional heads: German nominal phrases. In J. Nerbonne, K. Netter, and C. Pollard (Eds.), German in Head-Driven Phrase Structure Grammar, 297-340. Stanford, CA: CSLI Publications.
Noonan, M. (1985). Complementation. In T. Shopen (Ed.), Language Typology and Syntactic Description II: Complex Constructions. Cambridge: Cambridge University Press.
Ouhalla, J. (1991). Functional Categories and Parametric Variation. London: Croom Helm.
Plann, S. (1981). The two el + infinitive constructions in Spanish. Linguistic Analysis 7, 203-240.
Pollard, C., and Sag, I. A. (1994). Head-Driven Phrase Structure Grammar. Chicago: University of Chicago Press.
Pullum, G. K. (1991). English nominal gerund phrases as noun phrases with verb phrase heads. Linguistics 29, 763-799.
Roussou, A. (1991). Nominalized clauses in the syntax of Modern Greek. UCL Working Papers in Linguistics 3, 77-100. London: University College London.
Wescoat, M. T. (1994). Phrase structure, lexical sharing, partial ordering, and the English gerund. In S. Gahl, A. Dolbey, and C. Johnson (Eds.), Proceedings of the Twentieth Annual Meeting of the Berkeley Linguistics Society, 587-598. Berkeley, CA: BLS.
Yoon, J. H. S. (1996). Nominal gerund phrases in English as phrasal zero derivations. Linguistics 34, 329-356.
Zaring, L., and Hirschbühler, P. (1997). Qu'est-ce que ce que? The diachronic evolution of a French complementizer. In A. van Kemenade and N. Vincent (Eds.), Parameters of Morphosyntactic Change, 351-379. Cambridge: Cambridge University Press.
VERBAL GERUNDS AS MIXED CATEGORIES IN HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR

ROBERT MALOUF
Stanford University, Stanford, California, and University of California, Berkeley
1. INTRODUCTION

Grammatical categories are central to generative theories of grammar. In many ways, the study of syntax really is just the study of grammatical categories. It is typically assumed that there is a small number of primitive, probably universal, probably innate, grammatical categories N, V, A, and P (noun, verb, adjective, preposition), and that furthermore the properties of a phrase are primarily determined by the category of its head. That is, a verb phrase has the properties of a verb phrase by virtue of its being headed by a verb. This view of parts of speech is in large part a legacy of traditional grammar.

Since the advent of generative grammar, linguists have made considerable progress in the understanding of language. Not surprisingly, the traditional inventory of parts of speech has proven to be sufficient for the analysis of most constructions in English and for a broad range of other languages. Problems that have cropped up with the originally proposed parts of speech have been solved by decomposing them into bundles of binary features ±N and ±V, allowing categories to be divided into subcategories and to be grouped into natural classes (Chomsky, 1970).

Despite this success, there remains a class of constructions, known as transcategorial or simply mixed category constructions, which do not fit well with any refinement of the four basic categories. These constructions involve lexical items that seem to be core members of more than one category simultaneously. In this
chapter I will look at a family of constructions, demonstrated in (1), which raises serious problems for this kind of approach to grammatical categories.

(1) a. Everyone was impressed by Pat's artful folding of the napkins.
b. Everyone was impressed by Pat's artfully folding the napkins.
c. Everyone was impressed by Pat artfully folding the napkins.

Each of these examples involves a slightly different use of the nominal verb form folding. The nominal gerund use in (1a) is fully nominal and behaves like any other English common noun. The verbal gerund uses in (1b) and (1c), however, retain some of their verbal nature. These intermediate uses fall between the two categorial poles and show a mix of nominal and verbal properties that provide a challenge to any syntactic framework that assumes a strict version of X' theory.

Several alternatives to the traditional system of parts of speech have been proposed. McCawley (1982) argues for an approach that

avoids the notion of syntactic category as such, operating instead directly in terms of a number of distinct factors that syntactic phenomena can be sensitive to; in this view, syntactic category names will merely be informal abbreviations for combinations of these factors. (185)
A similar approach to categories was taken by Pollard and Sag (1987). In the course of describing Head-driven Phrase Structure Grammar (HPSG), an elaborated theory of syntactic information in terms of feature structures, they observe: "equipped with the notions of head features and subcategorization, we are now in a position to define conventional grammatical symbols such as NP [noun phrase], VP [verb phrase], etc. in terms of feature structures of type sign" (68). They offer the following definition for VP:

(2) [SYN | LOC | HEAD | MAJ  verb
     SYN | LOC | SUBCAT      <NP>]
This decomposition of a syntactic category into features is quite different from the kind found in most statements of X' theory. Rather than making a more fine-grained distinction between categories in a single dimension (say, by adding more head features), (2) defines VP in terms of two independently varying dimensions of syntactic information. VP is distinguished from V directly in terms of selectional saturation rather than indirectly via the interaction of subcategorization, phrase structure rules, and a categorial notion of bar level. And VP is distinguished from NP in terms of lexical category (represented by the feature HEAD). The structure of this chapter is as follows. In the first section, I will discuss the properties of verbal gerunds, with particular attention paid to their status as mixed categories. Next, I will review some of the previous proposals offered to account for verbal gerunds. Finally, I will present an analysis of mixed categories as noncanonical combinations of properties from independent grammatical dimensions.
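The two-dimensional decomposition can be made concrete with a minimal sketch. The encoding below, with HEAD as a string and SUBCAT as a tuple, is an illustrative assumption rather than Pollard and Sag's actual sign types; the point is only that "VP" names a combination of values in two independent dimensions.

```python
# Illustrative sketch: conventional category symbols defined as feature
# bundles in two independent dimensions, HEAD (lexical class) and SUBCAT
# (selectional saturation), following the idea behind (2).
from dataclasses import dataclass

@dataclass(frozen=True)
class Category:
    head: str        # lexical class: 'verb', 'noun', ...
    subcat: tuple    # unsatisfied valence requirements

V  = Category(head='verb', subcat=('NP', 'NP'))  # transitive verb: object and subject outstanding
VP = Category(head='verb', subcat=('NP',))       # verb phrase: only the subject outstanding
NP = Category(head='noun', subcat=())            # saturated nominal phrase

# VP differs from V only in saturation, and from NP only in lexical class:
assert V.head == VP.head and V.subcat != VP.subcat
assert VP.head != NP.head
```

Nothing here distinguishes VP from V by a categorial bar level; the distinction falls out of the SUBCAT dimension alone, which is the point of the decomposition.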
2. PROPERTIES OF VERBAL GERUNDS

2.1. Verbal Gerunds as Nouns

The nominal nature of verbal gerunds is shown most clearly by the external distribution of verbal gerund phrases (VGerPs). VGerPs occur in syntactic positions, such as the complement of a preposition, that generally only admit NPs. VGerPs can also occur as a clause-internal subject:

(3) a. I believe that Pat's/Pat taking a leave of absence bothers you.
b. Why does Pat's/Pat taking a leave of absence bother you?
c. It's Pat's/Pat taking a leave of absence that bothers you.

Unlike VGerPs and NPs, finite clauses are prohibited from appearing sentence internally:

(4) a. *I believe that Pat took a leave of absence bothers you.
b. *Why does that Pat took a leave of absence bother you?
c. *It's that Pat took a leave of absence that bothers you.

However, VGerPs are subject to no such constraint, as we see in (3). These examples show that, at least with respect to the prohibition against sentence-internal clausal arguments, VGerPs behave like NPs and not like Ss.

One thing worth pointing out here is that VGerPs do not have the full distribution of NPs. In particular, as we see in (5), VGers cannot happily occur as possessive specifiers.

(5) a. Pat's leave of absence's bothering you surprises me.
b. *Pat's/Pat taking a leave of absence's bothering you surprises me.
c. *That Pat took a leave of absence's bothering you surprised me.

In this case, VGerPs seem to behave more like Ss and less like NPs. But, as Zwicky and Pullum (1996) observed, only a restricted subclass of what are otherwise clearly NPs can show up as possessives, for example, this Tuesday in (6).

(6) a. This Tuesday is a good day for me.
b. *this Tuesday's being a good day for me

In other contexts, though, these examples do not sound as bad:

(7) a. Did you go to this Tuesday's lecture?
b.
?Pat's taking a leave of absence's impact will be considerable. So, this suggests that VGers, like the other cases described by Zwicky and Pullum, fall into a "functionally restricted" subclass of nouns that only marginally head possessive phrases. As Taylor (1995:193) puts it, "the ease with which nouns can designate a 'possessor' appears to correlate with the closeness to the semantically defined prototype."
On the other side of things, there are contexts which admit VGers but not regular NPs. Jørgensen (1981) and Quirk et al. (1985:1230) discuss a class of predicative heads which select for an expletive subject and a VGer complement, as in (8).

(8) There's no use (you/your) telling him anything.

The fact that the complement's subject can appear in the possessive shows that the complement really is a VGerP and that this is not a case of subject-to-object raising. Examples such as this provide evidence that VGers form a subcategory of noun distinct from common nouns.

2.2. Verbal Gerunds as Verbs

While the external syntax of VGers is much like that of NPs, their internal structure is more like that of VPs. For one, VGers take accusative NP complements, as in (9a), whereas common nouns and nominal gerunds can only take PP complements:

(9) a. (Pat's/Pat) calling (*of) the roll started each day.
b. The calling *(of) the roll started each day.

Another verb property of VGers is that they take adverbial modifiers, as in (10a). In contrast, true nouns take adjectival modifiers:

(10) a. Pat disapproved of (me/my) quietly leaving before anyone noticed.
b. The careful/*carefully restoration of the painting took six months.

Similarly, VGers can be negated with the particle not. But not cannot be used to negate a noun:

(11) a. Pat's not having bathed for a week disturbed the other diners.
b. *The not processing of the election results created a scandal.

These facts have been used to motivate the claim that VGers must be verbs at some level. However, none of the behavior exhibited in (9)-(11) is unique to verbs. Some of the verb-like properties of gerunds are also shared by prepositions, adjectives, and determiners. Verbs, prepositions, adjectives, and VGers take adverbial modifiers, while common nouns take adjectival modifiers:

(12) a. Sandy rarely gets enough sleep.
b. Sandy lives directly beneath a dance studio.
c. Sandy's apartment has an insufficiently thick ceiling.
d. Sandy grumbles about the dancers' nocturnally rehearsing Swan Lake.
Along the same lines, not can be used in some circumstances to negate adverbs, adjectives, prepositions, and determiners:
(13) a. Not surprisingly, the defendant took the Fifth.
b. The conference will be held in Saarbrücken, not far from the French border.
c. Not many people who have gone over Niagara Falls live to tell about it.

These facts about modification and negation do not show that VGers must be verbs. What they show is that VGers, unlike common nouns, are part of a larger class of expressions that includes verbs. The complementation facts also do not constitute a strong argument that VGers must be verbs. Verbs, prepositions, and VGers, unlike common nouns, can take NP complements:

(14) a. Robin sees the house.
b. Robin searched behind the house.
c. Robin's watching the house unnerved the tenants.

On the other hand, some verbs only take PP complements:

(15) a. *The strike extended two weeks.
b. The strike extended through the summer.

What these examples show is that taking adverbial modifiers and NP complements are neither necessary nor sufficient conditions for verbhood. The fact that some VGers take accusative objects is therefore not especially striking. What is important to take away from these examples is that a VGer, unlike a nominal gerund, takes the same complements as the verb from which it is derived:

(16) a. Chris casually put the roast in the oven.
b. Chris's/Chris casually putting the roast in the oven appalled the visiting vegetarians.
c. Chris's casual putting of the roast in the oven appalled the visiting vegetarians.

So, what we can say is that a VGerP headed by the -ing form of a verb has the same internal syntax as a VP headed by a finite form of that same verb.

2.3. Subtypes of Verbal Gerund Phrases

The bottom line of the last two sections is given in (17). VGerPs have four basic properties that need to be accounted for:

(17) a. A VGer takes the same complements as the verb from which it is derived.
b. VGers are modified by adverbs and not by adjectives.
c. The entire VGerP has the external distribution of an NP.
d. The subject of the gerund is optional and, if present, can be either a genitive or an accusative NP.
These properties are shared by accusative subject (ACC-ing), genitive subject (POSS-ing), and subjectless (PRO-ing) VGerPs and are not shared by any other English constructions. The three types of VGers seem to be subtypes of a single common construction type, and any analysis of VGers ought to be able to account for their similarities in a systematic way. It is also important to point out, however, that there are differences among the three types, which also must be accounted for. Of course, the most obvious difference is the definitional one: the case of the subject. POSS-ing VGerPs, like NPs, take a possessive specifier:

(18) a. My hunting snipe is unlikely to be successful.
b. My hunt for snipe is unlikely to be successful.

ACC-ing VGerPs, on the other hand, are more like nonfinite clauses in taking a subject in the accusative case:

(19) a. Pat's plans didn't involve me hunting snipe.
b. Pat didn't expect for me to hunt snipe.

So, in this respect, POSS-ing VGerPs are more like NPs, whereas ACC-ing VGerPs are more like Ss. Another important difference between the two types of VGerPs, pointed out by Abney (1987), is that POSS-ing but not ACC-ing VGerPs with wh-subjects can front under "pied piping" (Ross, 1967) in restrictive relative clauses:

(20) a. The person whose being late every day Pat didn't like got promoted anyway.
b. *The person who(m) being late every day Pat didn't like got promoted anyway.
Again, the same contrast can be seen between NPs and Ss:

(21) a. The person whose chronic lateness Pat didn't like got promoted anyway.
b. *The person (for) who(m) to be late every day Pat didn't like got promoted anyway.

The same generalization holds for wh-questions:

(22) a. I couldn't figure out whose being late every day Pat didn't like.
b. *I couldn't figure out who(m) being late every day Pat didn't like.

(23) a. I couldn't figure out whose lateness Pat didn't like.
b. *I couldn't figure out (for) who(m) to be late every day Pat didn't like.

POSS-ing VGerPs, like NPs, can appear as the leftmost constituent of a wh-question, while ACC-ing VGerPs, like clauses, cannot.
Curiously, Webelhuth (1992:133ff) (drawing on Williams, 1975) reports a different pattern of grammaticality than I am claiming here. He cites the following examples (with the given judgments):

(24) a. The administration objected to Bill's frequent travels to Chicago on financial grounds.
b. The administration objected to Bill's frequently traveling to Chicago on financial grounds.

(25) a. Whose frequent travels to Chicago did the administration object to on financial grounds?
b. *Whose frequently traveling to Chicago did the administration object to on financial grounds?

While judgments differ, I am not sure I would agree that (25b) is, strictly speaking, ungrammatical. While it is certainly less felicitous than (25a), it is unquestionably much better than (26).

(26)
*Who(m) frequently traveling to Chicago did the administration object to on financial grounds?
The question remains, though, why (25a) should be even slightly better than (25b). Part of the reason might be that this use of traveling is partially blocked by the existing and more or less synonymous travel, much as curiosity partially blocks the derivation of ?curiousness (Aronoff, 1976; Briscoe et al., 1995). When we change (25) to use a verb that has no common nominalized form, the contrast weakens even further:

(27) a. Whose frequent absences from Chicago did the administration object to on financial grounds?
b. Whose frequently leaving Chicago did the administration object to on financial grounds?

Why this blocking effect should be felt more strongly in (25) than in (24) is unclear, but it seems at least plausible that the source of the contrast in (25) is not due specifically to a constraint on pied piping. Although some (e.g., Abney, 1987) have taken the contrast in (20) as another piece of evidence that ACC-ing VGerPs are clauses, Portner (1992) argues against this conclusion by pointing out that ACC-ing examples like (20) and (22) without pied piping are just as bad:

(28) a. *The person who(m) Pat didn't like being late every day got promoted anyway.
b. *I wonder who(m) Pat didn't like being late every day?

So, he concludes that the ungrammaticality of (20b) has nothing to do with restrictions on pied piping, but that the best generalization to account for this data is that
"ACC-ing's are generally impossible with subject wh's" (116). However, this generalization falsely predicts that ACC-ing VGerPs with wh-subjects should be ungrammatical even in constructions which allow clauses with wh-subjects. One such construction is the multiple wh-question:

(29) a. Pat wonders who didn't like whose chronic lateness.
b. Pat wonders who didn't like (for) who(m) to be late every day.

ACC-ing VGerPs, like clauses, can in fact occur with wh-subjects in multiple wh-questions:

(30) a. Pat wonders who didn't like whose being late every day.
b. Pat wonders who didn't like who(m) being late every day.

Here again is an instance where POSS-ing VGerPs pattern more like NPs while ACC-ing VGerPs pattern like Ss. However, it is hard to see how this difference can be attributed to a difference in the semantics of the two types of VGerPs. Instead, what this evidence shows is that at some purely syntactic level POSS-ing VGerPs have something in common with NPs, whereas ACC-ing VGerPs have something in common with Ss.
3. PREVIOUS ANALYSES

An ideal analysis of VGers in English would be able to account for their mixed verbal/nominal properties, summarized in (31), without the addition of otherwise unmotivated mechanisms.

(31)
Verbs             Verbal gerunds         Nouns
govern NPs        govern NPs             don't govern NPs
adverbs           adverbs                adjectives
not               not                    *not
subjects          subjects/specifiers    specifiers
S distribution    NP distribution        NP distribution
Pullum (1991:775ff) makes a specific proposal as to what devices ought to be avoided, setting out three "theoretical desiderata" that any analysis of VGers should satisfy: strong lexicalism, endocentricity, and null licensing. Strong lexicalism is the principle that syntactic operations do not affect the internal structure of words and, conversely, that morphological operations do not apply to syntactic structures. Endocentricity is the principle that "EVERY constituent has (at least) one distinguished daughter identified as its head." Null licensing is a principle intended to restrain the proliferation of phonologically null elements. Pullum proposes that "no phonologically zero constituent should be posited that is neither semantically contentful nor syntactically bound" (776). In particular, this principle would rule out phonologically null heads.

Pullum (1991) proposes an analysis of VGers that exploits the flexibility of the Generalized Phrase Structure Grammar (GPSG) Head Feature Convention (HFC) to allow V to project NP under certain circumstances. Pullum starts with the following rule for ordinary possessed NPs:

(32) N[BAR:2] -> N[BAR:2, POSS:+], H[BAR:1]

The head of the phrase is only specified for the feature BAR. The HFC is a default condition that requires that the mother and the head daughter match on all features, so long as they do not conflict with any "absolute condition on feature specifications" (780). So, for instance, for the rule in (32), this will ensure that the head daughter will match the mother in its major category and that the phrase will be headed by an N. Given this background, Pullum observes that POSS-ing VGerPs can be accounted for by introducing a slightly modified version of the previous rule:

(33) N[BAR:2] -> (N[BAR:2, POSS:+],) H[VFORM:prp]
This rule differs from the rule in (32) only in the feature specification on the head daughter: in (33), the head daughter is required to be [VFORM:prp]. An independently motivated Feature Co-occurrence Restriction (FCR), given in (34), requires that any phrase with a VFORM value must be verbal.

(34) [VFORM] ⊃ [V:+, N:-]

This constraint overrides the HFC, so the rule in (33) will only admit phrases with -ing form verb heads. However, the mother is the same as the mother in (32), so (33) will give VGers the following structure:
This reflects the traditional description of VGerPs as "verbal inside, nominal outside" quite literally by giving VGerPs a VP node dominated by an NP node. However, Pullum's (1991) analysis only applies to POSS-ing VGerPs and has nothing to say about ACC-ing VGerPs at all. He suggests that ACC-ing and POSS-ing
VGerPs "must be analyzed quite differently" (766), but by treating them as unrelated constructions, he fails to capture their similarities. This is not merely a shortcoming of the presentation. There does not seem to be any natural way to assimilate ACC-ing VGerPs to Pullum's analysis. The simplest way to extend (33) to cover ACC-ing VGerPs is to add the following rule:

(36) N[BAR:2] -> (N[BAR:2]), H[VFORM:prp]
Since the default case for NPs is accusative, this rule will combine an accusative NP with an -ing form VP. This rule neatly accounts for the similarities between the two types of VGers, but not the differences. Following the direction of Hale and Platero's (1986) proposal for Navajo nominalized clauses, we might try (37) instead. (37)
N[BAR:2]->H[SUBJ:+,VFORM:prp]
The feature SUBJ indicates whether a phrase contains a subject and is used to distinguish VPs from Ss. A VP is V[BAR:2, SUBJ: -], whereas an S is V[BAR:2, SUBJ: +]. So, (37) would assign an ACC-ing VGerP the structure in (38).
It is plausible that this rule might account for some of the differences between the two types of gerund phrases. It is less clear, though, how it can account for the difference in pied piping, since nothing in the GPSG treatment of relative clauses rules out examples like (21b), repeated here (see Pollard and Sag, 1994:214ff): (39)
*The person (for) who(m) to be late every day Pat didn't like got promoted anyway.
This analysis cannot, however, properly account for PRO-ing VGerPs. Since the possessive NP in (33) is optional, it treats PRO-ing VGerPs as a subtype of POSS-ing VGerPs even though, as we have seen, PRO-ing VGerPs have more in common with ACC-ing VGerPs. Furthermore, I do not think it is possible to account for the control properties of PRO-ing VGerPs in this type of analysis. Some subjectless gerund complements, like some subjectless infinitive complements, must be interpreted as if their missing subject were coreferential with an argument of the higher verb:
Verbal Gerunds as Mixed Categories
(40) a. Chris tried to find a Nautilus machine in Paris without success. b. Chris tried finding a Nautilus machine in Paris without success.
In both sentences in (40) the subject of find must be coindexed with the subject of tried, namely Chris. In GPSG, control for infinitive complements is determined by the Control Agreement Principle, which ensures that the AGR value of to in (41) is identified with the AGR value of try (Gazdar et al., 1985:121).
Other constraints identify the AGR value of try with its subject and the AGR value of to with the unexpressed subject of find. Although this works for infinitive complements, it cannot be extended to account for control in gerunds. The agreement FCR in (42) will block projection of the gerund's AGR value to the top-level NP node. (42)
[AGR] ⊃ [V:+, N:-]
Because complement control is mediated by AGR specifications, there will be no way to capture the parallel behavior of subjectless infinitives and gerunds. Finally, structures like (38) raise doubt as to whether the notion of head embodied by the HFC has any content at all. In this case, the only head specification shared by the mother and the head daughter is [BAR:2], and this match comes about not by the HFC but by the accidental cooperation of the rule in (37) with the FCR in (43). (43)
[SUBJ:+] ⊃ [V:+, N:-, BAR:2]
I think it is fair to classify (37) as an exocentric rule. So, the only clear way to extend Pullum's analysis to account for ACC-ing VGerPs violates one of the theoretical desiderata that are the primary motivators for his analysis in the first place.1 Lapointe (1993) observes three problems with Pullum's analysis. The first problem is that, as discussed above, it vitiates the principle of phrasal endocentricity. Lapointe's second objection is that Pullum's proposal is much too general. It has no way of representing the fact that some types of mixed category constructions are much more common than others. Nothing in it prohibits outlandish and presumably nonattested rules such as:
(44) a. VP -> H[NFORM:plur], PP b. N' -> (QP), H°[VFORM:psp] And nothing in it explains why constructions parallel to the English POSS-ing VGer are found in language after language. To avoid these shortcomings of Pullum's analysis, Lapointe proposes a more conservative modification to standard notions of endocentricity. He proposes introducing dual lexical categories (DLCs) of the form ⟨X|Y⟩, where X determines the external syntax of the phrase and Y its internal syntax, licensed by conditions like the following: (45) a. ⟨X|Y⟩[BAR:2] -> ..., H[⟨X|Y⟩; BAR:2], ... b. No ID rule can have the form ⟨X|Y⟩ -> ..., H[F, g], ..., where F implies ⟨X|Y⟩, unless g includes ⟨X|Y⟩. However, Lapointe restricts himself to discussion of genitive subject VGerPs. As a consequence, his analysis suffers from the same problems as Pullum's.2 In addition, since Lapointe's necessarily brief presentation leaves some formal details unspecified, it is not at all clear that a rule like (37) would even be permissible under his system. Wescoat (1994) points out an additional problem with Pullum's analysis: in excluding articles and adjectives from gerunds, Pullum's rules "make no allowance for a variant grammar of English that admits archaic forms like [(47)], attested between the 15th and early 20th centuries" (588). (47) a. the untrewe forgyng and contryvyng certayne testamentys and last wyll [15th cent.] b. my wicked leaving my father's house [17th cent.] c. the being weighted down by the stale and dismal oppression of the rememberance [19th cent.]
Wescoat goes on to note that "such structures coexisted with all modern gerund forms, so it is only plausible that the current and former grammars of gerunds should be largely compatible, in a way that Pullum's approach cannot model" (588). Wescoat proposes to preserve phrasal endocentricity by modifying Kornai and Pullum's (1991) axiomatization of X' syntax to allow a single word to project two different unordered lexical categories and therefore two different maximal phrases. He proposes that VGers have a structure like (48a), parallel to the clause in (48b).
In these trees, the N and I nodes, respectively, are extrasequential. That is to say, they are unordered with respect to their sisters. This structure preserves syntactic projection, but at the cost of greatly complicating the geometry of the required phrase structure representations in ways that do not seem to be independently motivated. Even assuming Wescoat's formal mechanism can be justified, the analysis shown in (48a) runs into problems with POSS-ing VGerPs. In order to account for the nonoccurrence of adjectives and determiners with gerunds in Late Modern English, Wescoat adds a stipulation that the N node associated with a gerund must be extrasequential. Since adjectives and determiners must precede the N they attach to, this stipulation prevents them from occurring with gerunds. But, possessors also have to precede the head noun in their NP, so this stipulation should also prevent gerunds from occurring with possessors. As there is no way an ordering restriction could distinguish between adjectives and determiners on the one hand and possessors on the other, Wescoat has no choice but to treat possessors in POSS-ing VGerPs as subjects with unusual case marking, not as specifiers. In so
doing, he fails to predict that POSS-ing VGerPs, unlike ACC-ing VGerPs, share many properties of head-specifier constructions. For example, POSS-ing gerunds are subject to the same pied-piping constraints as NPs, while ACC-ing gerunds behave more like clauses. On the other hand, Wescoat's approach would extend to cover the ACC-ing VGerPs that are problematic for other analyses. A natural variant of (38) using lexical sharing would be
In this structure, both the N and the I nodes associated with painting are extrasequential. This tree seems to be fully consistent with all of Wescoat's phrase structure tree axioms. But, because it is not clear from his discussion how noncategorial features get projected, it is hard to say whether this kind of analysis could account for the differences between the two types of VGerPs. For instance, the contrast in (21), repeated in (50), is typically attributed to the fact that projection of wh-features is clause-bounded. (50) a. The person whose chronic lateness Pat didn't like got promoted anyway. b. *The person (for) who(m) to be late every day Pat didn't like got promoted anyway. This is what motivates the introduction of an S node in (38). However, it is not obvious that the introduction of an IP in (49) will prevent any features from projecting from the head painting directly to the top-most NP. If the N, I, and V nodes in (49) are really sharing the same lexical token, then the same head features should be projected to the NP, IP, and VP nodes. Otherwise, in what sense are the three leaf nodes "sharing" the same lexical token? Without further development of these issues, it is hard to evaluate Wescoat's analysis. Finally, Wescoat's approach runs into a fatal problem, pointed out by Wescoat (p.c.), when faced with coordinate gerund phrases. Take an example like (51).
Pat's never watching movies or reading books
Since the adverb never is modifying the whole coordinated VP, the only plausible structure Wescoat could assign to this sentence is (52).
But this structure is clearly ruled out by Wescoat's constraints: the mapping from leaf nodes to lexical tokens need not be one-to-one, but it must still be a function. That is, while a lexical token may be linked to more than one leaf node, each leaf node must be linked to one and only one lexical token. Therefore Wescoat's approach cannot account for examples like (51), and there is no obvious way that it could be extended to handle this kind of construction. Despite their technical differences, these approaches share a common underlying motivation. Very similar proposals have been made by Hale and Platero (1986) for Navajo nominalized clauses, by Aoun (1981) for Arabic participles, by van Riemsdijk (1983) for German adjectives, and by Lefebvre and Muysken (1988) for Quechua gerunds. Although these analyses differ greatly in their technical details, they all involve a structure more or less like the tree in (53), and so require weakening the notion of head to allow a single lexical item to head both an NP and a VP simultaneously.
The assumptions underlying (53) are those mentioned above: that the basic categories are N, V, A, and P, and that the properties of a phrase are determined by the lexical category of its head. If we accept X' theory in general, then we would not expect to find an NP projected by a verb, and the "null hypothesis" should be that structures like (53) do not exist. If there were strong evidence that (53) was indeed the structure of an English VGer, then we would have no choice but to reject the hypothesis and revise the principles of X' theory. However, as shown in
section 2.2, there is no clear evidence that VGerPs include a VP. Therefore, an analysis that can account for the properties of VGers without violating the principles of X' theory is preferable a priori to one that posits a structure like (53). Borsley and Kornfilt (this volume) argue that a mixed extended projection similar to the structure in (53) provides insight into the cross-linguistic distribution of gerund-like elements, whereas the analysis presented here does not. However, it should be noted that the present analysis is compatible with Croft's (1991) functional explanation for the observed cross-linguistic patterns (see Malouf, 1998). In the next sections I will explore an analysis of VGers that takes into account the varying sources of syntactic information by exploiting HPSG's fine-grained categorial representations and thus calls into question the assumption underlying analyses involving categorial changeover.
4. THEORETICAL PRELIMINARIES

Recent work in Construction Grammar (Fillmore and Kay, in press; Goldberg, 1995) and HPSG (Pollard and Sag, 1994) provides the basis for an analysis of mixed categories that can account for their hybrid properties without the addition of otherwise unmotivated mechanisms. In this section, I will outline the theoretical devices that will play a role in the analysis. The basic unit of linguistic structure in HPSG is the sign. Signs are "structured complexes of phonological, syntactic, semantic, discourse, and phrase structural information" (Pollard and Sag, 1994:15) represented formally by typed feature structures (TFSs), as in (54):
This TFS represents part of the lexical entry for the common noun book. A sign consists of a PHON value and a SYNSEM value, a structured complex of syntactic and semantic information.
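For readers who find code helpful, the shape of a TFS like (54) can be approximated with nested dictionaries. This is a sketch, not part of HPSG's formalism: the feature geometry shown here is simplified and partly invented for illustration.

```python
# A rough sketch of a typed feature structure as nested dictionaries,
# with a "type" key standing in for the TFS type. Feature names (PHON,
# SYNSEM, HEAD, CONT) follow the text; the exact geometry is illustrative.

book_sign = {
    "type": "word",
    "PHON": ["book"],
    "SYNSEM": {
        "type": "synsem",
        "LOCAL": {
            "type": "local",
            "CAT": {
                "type": "cat",
                "HEAD": {"type": "noun", "CASE": "case"},  # CASE underspecified
                "SPR": ["DetP"],   # selects a determiner specifier
                "COMPS": [],       # no complements
            },
            "CONT": {"type": "nom-obj", "INDEX": "i", "RESTR": ["book(i)"]},
        },
    },
}

# Paths into the structure pick out values, as in attribute-value matrices:
head = book_sign["SYNSEM"]["LOCAL"]["CAT"]["HEAD"]
print(head["type"])  # noun
```

A path such as SYNSEM|LOCAL|CAT|HEAD is just a chain of dictionary lookups in this encoding.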
Every linguistic object is represented as a TFS of some type, so linguistic constraints can be represented as constraints on TFSs of a certain type. The grammar of a language is represented as a set of constraints on types of signs. In order to allow generalizations to be stated concisely, linguistic types are arranged into a multiple-inheritance hierarchy. Each type inherits all the constraints associated with its supertypes, with the exception that default information from higher types can be overridden by conflicting information from a more specific type.3 In addition to allowing generalizations to be expressed, the type hierarchy also provides a natural characterization of motivation, in the Saussurean sense discussed above. In Construction Grammar, default inheritance is used to give a formal characterization of such system-internal motivation: "A given construction is motivated to the degree that its structure is inherited from other constructions in the language. ... An optimal system is a system that maximizes motivation" (Goldberg, 1995:70). Thus, the type hierarchy reflects the way in which constructions are influenced by their relationships with other constructions within the language and allows what Lakoff (1987) calls the "ecological niche" of a construction within a language to be captured as part of the formal system. Considerable work in HPSG has focused on examining the hierarchical structure of the lexicon (e.g., Flickinger, 1987; Riehemann, 1993). More recent research has investigated applying the same methods of hierarchical classification to types of phrasal signs. Expanding on the traditional X' theory presented in Pollard and Sag (1994), Sag (1997) develops an analysis of English relative clauses based on a multiple-inheritance hierarchy of construction types, where a construction is some form-meaning pair whose properties are not predictable either from its component parts or from other constructions in the language.
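The default-inheritance mechanism just described can be sketched in a few lines. The type names and constraints below are invented for illustration; they are not the chapter's actual hierarchy.

```python
# A minimal sketch of default inheritance over a type hierarchy.
# Each type contributes local (default) constraints; a subtype inherits
# everything from its supertypes but may override conflicting defaults.

HIERARCHY = {            # type -> list of immediate supertypes
    "sign": [],
    "phrase": ["sign"],
    "headed": ["phrase"],
    "head-nexus": ["headed"],
}

DEFAULTS = {             # type -> locally declared constraints
    "sign": {"PHON": "list"},
    "phrase": {"DTRS": "list-of-signs"},
    "headed": {"HEAD": "inherited-from-head-dtr"},
    "head-nexus": {"WH": "inherited-from-head-dtr"},
}

def constraints(t):
    """Collect constraints for type t, supertypes first, so that more
    specific types override conflicting defaults from higher types."""
    out = {}
    for sup in HIERARCHY[t]:
        out.update(constraints(sup))
    out.update(DEFAULTS.get(t, {}))
    return out

print(sorted(constraints("head-nexus")))
```

Because the update from the local type happens last, a specific type's declaration wins over an inherited default, which is the behavior the text describes.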
A relevant part of the basic classification of constructions is given in (55).
Phrases can be divided into two types: endocentric headed phrases and exocentric nonheaded phrases. Since syntactic constraints are stated as constraints on
particular types of signs, the Head Feature Principle can be represented as (56), a constraint on all signs of the type headed. (56)
headed →
Headed phrases are also subject to the following constraint on valence features: (57)
headed
This constraint ensures that undischarged valence requirements get propagated from the head of a phrase. In the case of, say, a head/modifier phrase, the nonhead daughter [2] will not be a member of the SUBJ, SPR, or COMPS value of the head, and so the valence values will be passed up unchanged. In the case of, say, a head/complement phrase, [2] will be on the head's COMPS list, so the mother's COMPS value is the head's COMPS value minus the discharged complement. Headed phrases are further divided into head-adjunct phrases and head-nexus phrases. Head-nexus phrases are phrases that discharge some grammatical dependency, either a subcategorization requirement (valence) or the SLASH value of an unbounded dependency construction (head-filler). Finally, valence phrases can be subtyped according to the kind of subcategorization dependency they discharge: subject, specifier, or complement. For example, head/specifier constructions obey the constraint in (58). (58)
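The valence bookkeeping in (57) amounts to list subtraction, which can be sketched as follows (a toy model: categories are plain strings, not synsem objects).

```python
# Sketch of the Valence Principle (57): the mother's valence value is the
# head daughter's valence list minus whatever the nonhead daughters discharge.

def mother_valence(head_valence, discharged):
    """Remove the discharged elements from the head's valence list;
    anything not discharged is passed up unchanged."""
    remaining = list(head_valence)
    for d in discharged:
        remaining.remove(d)
    return remaining

# head/complement phrase: the complement NP is discharged
print(mother_valence(["NP"], ["NP"]))   # []
# head/modifier phrase: the modifier discharges nothing
print(mother_valence(["NP"], []))       # ['NP']
```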
head-spr
In addition, constructions inherit constraints from the cross-cutting classification of phrases into either clauses or nonclauses. Among other things, clauses are subject to the following constraint (further constraints on clauses will be discussed in section 5.2):
(59)
clause
This constraint states that the SUBJ list of a clause must be a list of zero or more PRO objects. This ensures that either the clause contains an overt subject (and so the SUBJ list is empty) or the unexpressed subject (e.g., in control constructions) is PRO, a special type of SYNSEM object that at minimum specifies accusative case and pronominal semantics (either ppro or refl). Note that this PRO is quite unlike the homonymous empty category of Chomsky and Lasnik (1977). Its purpose is only to put constraints on the argument structure of a verb in a control structure, and it does not correspond to a phonologically unrealized position in the phrase structure. In addition, the constraint in (59) restricts the semantic type of a clause's content: the CONT value of a clause must be a psoa object (i.e., a proposition). These two hierarchies define a set of constraints on phrasal signs. A syntactic construction is a meaningful recurrent bundle of such constraints. One way to think of constructions is as the syntactic equivalent of what in the lexical domain would be called morphemes. In terms of the theory of phrasal types presented here, a construction is a phrasal sign type that inherits from both the phrase hierarchy and the clause hierarchy. Because a construction licenses a type of complex sign, it must include information about how both the form and the meaning are assembled from the form and the meaning of its component parts. A construction may inherit some aspects of its meaning from its supertypes. In contrast to the strictly head-driven view of semantics presented by Pollard and Sag (1994), a construction may also have idiosyncratic meaning associated with it. Some of the basic constructions of English are shown in Figure 1. The fin-head-subj-cx and the nonfin-head-subj-cx constructions combine a subcategorized-for subject with a finite and nonfinite head, respectively. The finite version, for normal English sentences like They walk, requires a nominative subject.
The nonfinite version, for "minor" sentence types like absolutives or Mad magazine sentences (McCawley, 1988), requires an accusative subject. The noun-poss-cx construction combines a noun head with a determiner or possessive specifier to form a phrase with a nom-obj (i.e., an index-bearing unit) as the CONT value. To be more precise, the construction type noun-poss-cx is subject to the following constraint: (60)
noun-poss-cx —>
Figure 1. English construction types.
Here, for convenience, I assume that the English genitive case marker 's is an edge inflection (see Zwicky, 1987; Miller, 1992; Halpern, 1995). The two head/complement constructions both combine a head with its selected-for complements, but differ as to whether the resulting phrase can function as a clause and is subject to the constraint in (59).
5. A MIXED LEXICAL CATEGORY ANALYSIS OF VERBAL GERUNDS

Words in HPSG select for arguments of a particular category. Therefore, categorial information, projected from the lexical head following the Head Feature Principle, determines the external distribution of a phrase. Selectional information, from a lexical head's valence features, determines what kinds of other phrases can occur in construction with that head. Finally, constructional information, represented as constraints on particular constructions, controls the combination of syntactic units. Within each of these three domains, VGerPs show fairly consistent behavior. What is unusual about VGers is their combination of noun-like categorial properties with verb-like selectional properties. Given the theoretical background of the previous section, we can account for the mixed nominal and verbal properties of VGers that seem puzzling given many standard assumptions about syntactic structure. The categorial properties of VGers are determined by their lexically specified head value. Like all other linguistic objects, types of head values can be arranged into a multiple inheritance type hierarchy expressing the generalizations across categories. The distribution of VGers can be accounted for by the (partial) hierarchy of head values in (61).
Since gerund is a subtype of noun, a phrase projected by a gerund will be able to occur anywhere an NP is selected for. Thus, phrases projected by VGers will have the external distribution of NPs. Adverbs modify objects of category verbal, which includes verbs, adjectives, and VGers, among other things. Since adjectives only modify c(ommon)-nouns, VGerPs will contain adverbial rather than adjectival modifiers. As a subtype of noun, gerunds will have a value for the feature CASE (although in English this is never reflected morphologically), but since gerunds are not a subtype of verb, they will not have a value for VFORM. The cross-classification in (61) directly reflects the traditional view of gerunds as intermediate between nouns and verbs. In this respect, it is nothing new: in the second century B.C. Dionysius Thrax analyzed the Greek participle as a "separate part of speech which '. . . partakes of the nature of verbs and nouns'" (Michael, 1970:75). But, by formalizing this intuitive view as a cross-classification of HEAD values, we can localize the idiosyncratic behavior of VGers to the lexicon. The position of gerund in this hierarchy of head values provides an immediate account of the facts in (17b) and (17c). The remaining two gerund properties in (17) can be accounted for most simply by the lexical rule in (62).
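The cross-classification of head values just described maps naturally onto multiple inheritance. The following sketch is illustrative only; it simplifies the supertypes of noun and verbal, and the class attributes merely stand in for appropriateness of the CASE and VFORM features.

```python
# Sketch of a head-value hierarchy like (61) via Python multiple inheritance.
# gerund is a subtype of noun AND of verbal, but not of verb.

class Head: pass

class Noun(Head):
    CASE = "unmarked"            # nouns are appropriate for CASE

class Verbal(Head): pass         # the things adverbs can modify

class Verb(Verbal):
    VFORM = "base"               # only verbs are appropriate for VFORM

class Adjective(Verbal): pass
class CommonNoun(Noun): pass     # what adjectives modify
class Gerund(Noun, Verbal): pass

g = Gerund()
print(isinstance(g, Noun))       # True: distributes like an NP
print(isinstance(g, Verbal))     # True: takes adverbial modifiers
print(isinstance(g, Verb))       # False: not a subtype of verb
print(hasattr(g, "CASE"))        # True: bears CASE
print(hasattr(g, "VFORM"))       # False: no VFORM value
```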
This rule produces a lexical entry for a VGer from the present participle form of the verb. The VGer differs syntactically from the participle in two ways: it is of category gerund and it selects for both a specifier and a subject. Because a VGer selects for the same complements as the verb it is derived from, the phrase formed by a VGer and its complements will look like a VP. And, since a gerund selects for both a subject and a specifier, it will be eligible to head either a nonfin-head-subj-cx construction, which combines a head with an accusative NP subject, or a noun-poss-cx construction, which combines a head with a genitive NP specifier. Here I assume that the gerund's external argument is lexically unmarked for case and that it is assigned either accusative or genitive case by the appropriate
construction. POSS-ing VGerPs will inherit all the constraints that apply to possessive constructions in general, for example, the restrictions on the specifier NP and on pied piping. Because the subject and specifier are identified with each other, no VGer will be able to combine with both a subject and a specifier. The combination of properties created by the lexical rule is unusual for English, but the properties themselves are all inherited from more basic types. This mixture of verbal and nominal characteristics reflects the VGer's intermediate position between nouns and verbs in the hierarchy of categories.

5.1. Some Examples

To see how these constraints interact to account for the syntax of VGers, it will be useful to consider an example of each type. First, consider the (partial) lexical entry for the present participle of the verb fold, in (63).
From this lexical entry, the Verbal Gerund Lexical Rule produces the corresponding entry in (64).
The two entries differ only in the shaded features. The output of the lexical rule is of category gerund, rather than verb, and the gerund selects for both a subject and
a specifier. All other information about the verb gets carried over from the input to the lexical rule. Now we turn to the constructions that this gerund is eligible to head. We will look at two cases: POSS-ing VGerPs and ACC-ing VGerPs. First we will look at the structure of the phrase Pat's folding the napkins, shown in Figure 2. The head of this phrase, folding, is a VGer formed by the lexical rule in (62). It combines with its complement NP (marked [3]) via the head-comp-cx construction. It then combines with a genitive specifier to form a noun-poss-cx construction. Note that the formulation of the Valence Principle in (57) allows Pat's to satisfy both the subject and the specifier requirement of the gerund simultaneously. However, since the construction this phrase is an instance of is a subtype of head-spr, Pat's will only have the properties of a specifier.
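The effect of the Verbal Gerund Lexical Rule on an entry like (63) can be sketched as a function from entries to entries. This is a simplification: entries are flat dictionaries, and categories and valence values are plain strings and lists rather than typed feature structures.

```python
# Sketch of the lexical rule in (62): from a present-participle entry,
# produce a gerund entry that keeps the verb's complements but adds a
# specifier requirement identified with the subject.

def verbal_gerund_lexical_rule(entry):
    assert entry["HEAD"] == "verb" and entry["VFORM"] == "prp"
    gerund = dict(entry)
    gerund["HEAD"] = "gerund"     # category changes from verb to gerund
    del gerund["VFORM"]           # gerunds bear no VFORM value
    # subject and specifier are identified: one NP can satisfy both,
    # so no VGer combines with both a subject and a specifier
    gerund["SPR"] = gerund["SUBJ"]
    return gerund

folding = {"PHON": "folding", "HEAD": "verb", "VFORM": "prp",
           "SUBJ": ["NP"], "COMPS": ["NP"]}
print(verbal_gerund_lexical_rule(folding))
```

Because SPR and SUBJ end up as the same object, whichever construction the gerund heads (noun-poss-cx or nonfin-head-subj-cx) discharges both requirements at once, mirroring the Figure 2 discussion.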
Figure 2. Pat's folding the napkins.
Figure 3. Pat folding the napkins.
An equivalent example with an accusative subject is given in Figure 3, for the phrase Pat folding the napkins. This example differs from the previous example only in the way the subject combines with the head. The nonfin-head-subj-cx combines a nonfinite head with an accusative subject. As before, Pat cancels both the subject and the specifier requirement of the head, but in this case it will have only subject properties.

5.2. Pied Piping

The pied-piping contrast between ACC-ing and POSS-ing VGerPs follows from the fact that the former are clauses while the latter are not. To show how this
result is achieved, I will first sketch the HPSG treatment of pied piping developed by Pollard and Sag (1994), Sag (1997), and Ginzburg and Sag (1998). The basic fact that needs to be accounted for is shown in (65). (65) a. Who failed the exam? b. Whose roommate's brother's neighbor failed the exam? In a wh-question, the leftmost constituent must contain a wh-word, but that wh-word can be embedded arbitrarily deeply. This dependency is encoded by the nonlocal feature WH. Question words are marked with a nonempty value for WH, whose value is a set of interrogative parameters. Wh-words also introduce an interrogative parameter in the STORE of the verb that selects them. All parameters and quantifiers introduced into the STORE then must be retrieved somewhere in the sentence and assigned a scope by a constraint-based version of Cooper storage (Cooper, 1983; Pollard and Sag, 1994). Take a sentence like (65a). This is an instance of the construction wh-subj-inter-cl, which combines a subject and a head to form an interrogative clause. This construction is subject to the constraint in (66).4
wh-inter-cl
This constraint requires that the subject have somewhere inside it a wh-word that contributes an interrogative parameter. The presence of a wh-word is indicated by the phrase's nonempty WH-value. The position of the wh-word is not fixed: it can be embedded arbitrarily deeply within the subject, so long as its WH-value is passed up to the top of the phrase. In addition, this interrogative parameter must be a member of the PARAMS value of the interrogative clause, and all of the members of the clause's PARAMS value are removed from the store and given scope over the clause. As first proposed by Ginzburg (1992), the interrogative word who is optionally specified for a nonempty WH value, as in (67).
(67)
who
In addition, the lexical entries for all lexical heads obey Sag's (1997) WH Amalgamation Principle in (68).
(68)
word
This constraint ensures that the WH-value of a head is the union of the WH values of its arguments. These lexical constraints force the head of any phrase that contains a governed wh-word to have a nonempty WH-value reflecting that fact. Next, the WH Inheritance Constraint, in (69), ensures that the value of WH gets passed from the head daughter to the mother. (69)
head-nexus
Similar constraints amalgamate the STORE value of a word's arguments and pass up the STORE value of a phrase from its head daughter. Finally, to guarantee that only questions contain interrogative words, clauses are subject to the constraint in (70). (70)
clause → [NONLOCAL | WH { }]
This requires all clauses to have an empty WH-value. This means that any WH-value introduced by the lexical entry of an interrogative word must be bound off by an appropriate interrogative construction, ruling out declarative sentences like Chris flunked which student. These constraints provide a completely general, head-driven account of pied piping in both relative clauses and questions. Consider first the nongerund examples in (71). (71) a. Whose failure was expected? b. *For whom to fail was expected? In (71a), failure will take on the nonempty WH-value of its specifier whose. The constraint in (69) passes the WH-value of failure (that is, an interrogative parameter whose) up to the entire phrase whose failure. The wh-subject interrogative clause construction forms the interrogative clause whose failure was expected? A similar chain of identities passes up the WH-value of whom in (71b) to the clause for whom to fail. But, this violates (70), and the example is ruled out. Now it should be clear how this theory of pied piping carries over to the VGer examples in (72).
(72) a. I wonder whose failing the exam surprised the instructor. b. *I wonder who(m) failing the exam surprised the instructor. The structure of these examples is given in Figures 4 and 5. In (72a), failing picks up the WH-value of whose and passes it up to the phrase whose failing the exam. Since this is an example of a POSS-ing VGerP, a type of noun-poss-cx construction, it is not subject to (70). In (72b), though, the subject of the question is a nonfinite head-subject clause, which by (70) must have an empty WH-value. This conflicts both with the constraints on WH-percolation and with (66), and the sentence is ungrammatical. The difference between POSS-ing and ACC-ing VGerPs with respect to pied piping follows directly from independently motivated constraints on construction types. Any analysis that treats the subject case alternation
Figure 4.
Whose failing the exam surprised the instructor?
Figure 5.
*Whom failing the exam surprised the instructor?
as essentially free variation would be hard-pressed to account for this difference without further stipulations. By adapting Ginzburg's (1992) theory of interrogatives to Sag's (1997) analysis of pied piping, we can also account for the behavior of VGerPs in multiple wh-questions. For multiple wh-questions, Ginzburg (1992:331) suggests "the need for syntactic distinctions between forms that are, intuitively, interrogative syntactically and semantically and forms that are declarative syntactically, but have interrogative contents." In a multiple wh-question, wh-words that are leftmost in their clause have both interrogative syntax and interrogative semantics. They pass up a nonempty WH-value in exactly the same way as in ordinary wh-questions. Wh-words that are not clause initial, on the other hand, have only interrogative semantics. While they introduce an interrogative parameter into the store, they have an empty WH-value. This is what accounts for the noncontrast in (73). (73) a. I wonder who was surprised by whose failing the exam. b. I wonder who was surprised by who failing the exam. The structures of (73) are given in Figure 6. Note that I have assumed Pollard and Yoo's (1998) head-driven STORE collection here, but I have crucially not adopted their analysis of multiple wh-questions. Unlike (72b), (73b) does not run afoul of (70), the constraint requiring clauses to have empty WH-values. Since the ACC-ing VGerP who failing the exam appears in situ, it is only interrogative semantically and its WH-value is empty. No constraints prohibit an interrogative parameter from being passed up via the storage mechanism, and so both (73a) and (73b) are grammatical. The constraints outlined in this section also apply to restrictive relative clauses, and so predict the contrasts in (20) and (21). Similarly, under this analysis, PRO-ing VGerPs are instances of clauses. As Wasow and Roeper (1972) observe, PRO-ing VGerPs are parallel to subjectless infinitives in Equi constructions: (74) a. Lee hates loud singing. b. Lee hates singing loudly. c. Lee hates to sing loudly. In both (74b) and (74c), the understood subject of the embedded verb must be Lee. In (74a), though, the understood subject of the nominal gerund singing can be anyone. Since, by the constraint on clauses in (59), the unexpressed subject of a PRO-ing gerund phrase is a PRO, it will be governed by Pollard and Sag's (1994) semantic theory of complement control just like the unexpressed subjects of infinitive complements (see Malouf, 1998). Furthermore, since subjectless gerunds are clausal, pied piping out of a PRO-ing VGerP is also predicted to be ungrammatical: (75) a. Pat invited no one who(m) Chris hates talking to. b. *Pat invited no one talking to who(m) Chris hates.
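The WH bookkeeping behind these predictions, amalgamation at heads, inheritance by head-nexus phrases, and the empty-WH requirement on clauses, can be sketched as follows. This is a toy model: WH values are sets of strings, and clausehood is passed in as a flag rather than read off a type hierarchy.

```python
# Sketch of the WH mechanism: (68) a word's WH value is the union of its
# arguments' WH values; (69) a head-nexus phrase inherits WH from its head
# daughter; (70) clauses must have an empty WH value.

def amalgamate_wh(arg_wh_values):
    """(68): union the WH values contributed by a head's arguments."""
    out = set()
    for wh in arg_wh_values:
        out |= wh
    return out

def licensed(phrase_wh, is_clause):
    """(70): a clause must have an empty WH value; nonclauses are exempt."""
    return (not is_clause) or not phrase_wh

# 'whose failing the exam' -- a POSS-ing VGerP is a noun-poss-cx, not a clause:
wh_poss = amalgamate_wh([{"whose"}, set()])   # head picks up WH of 'whose'
print(licensed(wh_poss, is_clause=False))     # True: pied piping allowed

# '*whom failing the exam' -- an ACC-ing VGerP is a nonfinite clause:
wh_acc = amalgamate_wh([{"whom"}, set()])
print(licensed(wh_acc, is_clause=True))       # False: ruled out by (70)
```

In-situ wh-words, which contribute only to STORE and have empty WH values, would correspond here to arguments passing in the empty set, so an ACC-ing VGerP in a multiple wh-question comes out licensed.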
One final point is that the constraints discussed here apply only to questions and restrictive relative clauses. As Levine (p.c.) observes, these predictions do not hold for nonrestrictive relatives or "pseudorelatives" (see McCawley, 1988):

(76) a. Sandy, even talking about whom Chris hates, won't be invited.
b. Robin is one person even talking about whom gets my blood boiling.

These constructions place weaker constraints on pied piping than restrictive relatives do, and they generally allow pied piping of clauses:
162
Robert Malouf
Figure 6. Who was surprised by whose/whom failing the exam?
(77) a. Sandy, for someone to even talk about whom Chris hates, won't be invited.
b. Robin is one person for someone to even talk about whom gets my blood boiling.

So, what these examples show is that pied piping in nonrestrictive relatives is not mediated by the feature WH and so is not subject to the constraint in (70).
Verbal Gerunds as Mixed Categories
163
6. CONCLUSION

The constructions that combine a VGer with its complements and its subject or specifier are the same constructions used for building NPs, VPs, and clauses. This reflects the traditional view that VGerPs are built out of pieces of syntax "reused" from other parts of the grammar. In one sense, under this analysis a VGer together with its complements really is like a V. Both are instances of the same construction type and both are subject to any constraints associated with that construction. In the same way, a VGer plus an accusative subject really does form a clause, while a VGer plus a genitive subject really does form an NP. So, these two types of VGerPs inherit the constraints on semantic type and pied piping associated with the construction type of which they are an instance. However, in a more important sense, a VGer plus its complements forms a VGer′, which combines with an accusative or genitive subject to form a VGerP. The analysis presented here allows this similarity to be captured without weakening HPSG's strong notion of endocentricity. By exploiting HPSG's hierarchical classification of category types and its inventory of elaborated phrase structure rules, we are able to account for the mixed behavior of English VGers without adding any additional theoretical mechanisms or weakening any basic assumptions. The analysis presented here does not require syntactic word formation and thus preserves lexical integrity. It also does not require any phonologically null elements or abstract structure, and it allows us to maintain the strong notion of endocentricity embodied by the HPSG Head Feature Principle. Finally, by making crucial reference to syntactic constructions, this analysis allows us to capture on the one hand the similarities among the subtypes of VGerPs and on the other their similarities to other English phrase types.
ACKNOWLEDGMENTS

I would like to thank Farrell Ackerman, Bob Borsley, Bob Levine, Carl Pollard, Ivan Sag, Gert Webelhuth, Michael Wescoat, and an anonymous reviewer for their helpful comments. This research was conducted in part in connection with the Linguistic Grammars Online (LiNGO) project at the Center for the Study of Language and Information, Stanford University.
NOTES

1. In addition, there are quite general formal problems with the default nature of the GPSG Head Feature Convention (McCawley, 1988; Shieber, 1986).
2. Similarly, under the approach taken by Borsley and Kornfilt (this volume) it is difficult to account for the similarities between ACC-ing and POSS-ing VGerPs.
3. The details of default inheritance are not relevant to this chapter, but Lascarides and Copestake (1999) suggest how such a system might be formalized.
4. The contained set difference of two sets (X − Y) is the ordinary set difference as long as Y ⊆ X. Otherwise it is undefined. Likewise, the disjoint set union of two sets (X ⊎ Y) is the ordinary set union as long as the two sets are disjoint.
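The two partial operations in note 4 can be sketched directly; this is a minimal illustration, and the function names are mine, not the chapter's:

```python
# Sketch of the two partial set operations of note 4: both are the ordinary
# operation when defined, and undefined (here: an error) otherwise.

def contained_difference(x: set, y: set) -> set:
    """Ordinary set difference X - Y, defined only when Y is a subset of X."""
    if not y <= x:
        raise ValueError("contained difference undefined: Y is not a subset of X")
    return x - y

def disjoint_union(x: set, y: set) -> set:
    """Ordinary union of X and Y, defined only when X and Y are disjoint."""
    if x & y:
        raise ValueError("disjoint union undefined: X and Y overlap")
    return x | y
```

The partiality is the point: in a constraint-based grammar an undefined value signals that a description is unsatisfiable, rather than silently yielding a set.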
REFERENCES

Abney, S. P. (1987). The English noun phrase in its sentential aspect. Ph.D. thesis, MIT. URL http://www.sfs.nphil.uni-tuebingen.de/~abney/Abney_87a.ps.gz.
Aoun, Y. (1981). Parts of speech: A case of redistribution. In A. Belletti, L. Brandi, and L. Rizzi (Eds.), Theory of markedness in generative grammar (pp. 3-24). Pisa: Scuola Normale Superiore di Pisa.
Aronoff, M. (1976). Word formation in generative grammar. Cambridge, MA: MIT Press.
Bouma, G. (1993). Nonmonotonicity and categorial unification grammar. Ph.D. thesis, Rijksuniversiteit Groningen.
Briscoe, T., Copestake, A., and Lascarides, A. (1995). Blocking. In P. St. Dizier and E. Viegas (Eds.), Computational lexical semantics. Cambridge: Cambridge University Press. URL http://www.cl.cam.ac.uk/ftp/papers/acquilex/acq2wp2.ps.Z.
Chomsky, N. (1970). Remarks on nominalizations. In R. Jacobs and P. Rosenbaum (Eds.), Readings in English transformational grammar (pp. 184-221). Waltham, MA: Ginn.
Chomsky, N., and Lasnik, H. (1977). Filters and control. Linguistic Inquiry, 8, 425-504.
Cooper, R. (1983). Quantification and syntactic theory. Dordrecht: Reidel.
Croft, W. (1991). Syntactic categories and grammatical relations. Chicago: University of Chicago Press.
Fillmore, C. J., and Kay, P. (in press). Construction grammar. Stanford: CSLI Publications. URL http://www.icsi.berkeley.edu/~kay/bcg/ConGram.html.
Flickinger, D. (1987). Lexical rules in the hierarchical lexicon. Ph.D. thesis, Stanford University.
Gazdar, G., Klein, E., Pullum, G., and Sag, I. (1985). Generalized phrase structure grammar. Cambridge, MA: Harvard University Press.
Ginzburg, J. (1992). Questions, queries, and facts: A semantics and pragmatics for interrogatives. Ph.D. thesis, Stanford University.
Ginzburg, J., and Sag, I. A. (1998). English interrogative constructions. Unpublished manuscript, Hebrew University and Stanford University.
Goldberg, A. (1995). Constructions: A construction grammar approach to argument structure. Chicago: University of Chicago Press.
Hale, K., and Platero, P. (1986). Parts of speech. In P. Muysken and H. van Riemsdijk (Eds.), Features and projections (pp. 31-40). Dordrecht: Foris.
Halpern, A. (1995). On the placement and morphology of clitics. Stanford: CSLI Publications.
Jørgensen, E. (1981). Gerund and to-infinitives after 'it is (of) no use', 'it is no good', and 'it is useless'. English Studies, 62, 156-163.
Kornai, A., and Pullum, G. K. (1990). The X-bar theory of phrase structure. Language, 66, 24-50.
Lakoff, G. (1987). Women, fire, and dangerous things. Chicago: University of Chicago Press.
Lambrecht, K. (1990). 'What me worry?' Mad magazine sentences revisited. In Proceedings of the Berkeley Linguistics Society (vol. 16, pp. 215-228).
Lapointe, S. G. (1993). Dual lexical categories and the syntax of mixed category phrases. In A. Kathol and M. Bernstein (Eds.), Proceedings of the Eastern States Conference on Linguistics (pp. 199-210).
Lascarides, A., and Copestake, A. (1999). Default representation in constraint-based frameworks. Computational Linguistics, 25, 55-105.
Lefebvre, C., and Muysken, P. (1988). Mixed categories. Dordrecht: Kluwer.
Malouf, R. (1998). Mixed categories in the hierarchical lexicon. Ph.D. thesis, Stanford University. URL http://hpsg.stanford.edu/rob/papers/diss.ps.gz.
McCawley, J. D. (1982). The nonexistence of syntactic categories. In Thirty million theories of grammar. Chicago: University of Chicago Press.
McCawley, J. D. (1988). The syntactic phenomena of English. Chicago: University of Chicago Press.
Michael, I. (1970). English grammatical categories and the tradition to 1800. Cambridge: Cambridge University Press.
Miller, P. (1992). Clitics and constituents in phrase structure grammar. New York: Garland.
Pollard, C., and Sag, I. A. (1987). Information-based syntax and semantics. Stanford: CSLI Publications.
Pollard, C., and Sag, I. A. (1994). Head-driven phrase structure grammar. Chicago: University of Chicago Press, and Stanford: CSLI Publications.
Pollard, C., and Yoo, E. J. (1998). Quantifiers, wh-phrases, and a theory of argument selection. Journal of Linguistics, 34, 415-446.
Portner, P. H. (1992). Situation theory and the semantics of propositional expressions. Ph.D. thesis, University of Massachusetts, Amherst. Distributed by the University of Massachusetts Graduate Linguistic Student Association.
Pullum, G. K. (1991). English nominal gerund phrases as noun phrases with verb-phrase heads. Linguistics, 29, 763-799.
Quirk, R., Greenbaum, S., Leech, G., and Svartvik, J. (1985). A comprehensive grammar of the English language. London: Longman.
Riehemann, S. (1993). Word formation in lexical type hierarchies. Master's thesis, Universität Tübingen. URL ftp://ftp-csli.stanford.edu/linguistics/sfsreport.ps.gz.
Ross, J. R. (1967). Constraints on variables in syntax. Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, MA.
Sag, I. A. (1997). English relative clause constructions. Journal of Linguistics, 33, 431-484.
Shieber, S. M. (1986). A simple reconstruction of GPSG. In Proceedings of the eleventh International Conference on Computational Linguistics (COLING-86) (pp. 211-215). Bonn, Germany.
Taylor, J. R. (1995). Linguistic categorization (2nd ed.). Oxford: Oxford University Press.
van Riemsdijk, H. (1983). A note on German adjectives. In F. Heny and B. Richards (Eds.),
Linguistic categories: Auxiliaries and related puzzles (pp. 223-252). Dordrecht: Reidel.
Wasow, T., and Roeper, T. (1972). On the subject of gerunds. Foundations of Language, 8, 44-61.
Webelhuth, G. (1992). Principles and parameters of syntactic saturation. New York: Oxford University Press.
Wescoat, M. T. (1994). Phrase structure, lexical sharing, partial ordering, and the English gerund. In S. Gahl, A. Dolbey, and C. Johnson (Eds.), Proceedings of the Berkeley Linguistics Society (vol. 20, pp. 587-598).
Williams, E. (1975). Small clauses in English. In J. Kimball (Ed.), Syntax and semantics (vol. 4, pp. 249-273). New York: Academic Press.
Zwicky, A. M. (1987). Suppressing the Z's. Journal of Linguistics, 23, 133-148.
Zwicky, A. M., and Pullum, G. K. (1996). Functional restriction: English possessives. Paper presented at the 1996 Linguistic Society of America meeting.
ENGLISH AUXILIARIES WITHOUT LEXICAL RULES

ANTHONY WARNER
Department of Language and Linguistic Science
University of York
Heslington, York
United Kingdom
1. INTRODUCTION

English auxiliaries show a complex but systematic set of interrelationships between their characteristic construction types. It has often been assumed that, for its proper description, this requires the resources of movement defined over structures (as in the tradition stretching from Chomsky, 1957, to Pollock, 1989, and onwards). An alternative, within Phrase Structure Grammar, has been to appeal to the resources of lexical rules (Flickinger, 1987; Pollard and Sag, 1987) or their antecedent metarules. In this chapter I will give an account of the grammar of these characteristic constructions in Head-driven Phrase Structure Grammar (HPSG), without using lexical rules or movements interrelating structures, but relying solely on the organization of information within an inheritance hierarchy to make relevant generalizations. The demonstration that lexical rules are not required in this area substantially enhances the possibility that lexical rules could be banished from the armory of HPSG in favor of mechanisms of lexical inheritance (extending the kind of approach to valence alternations developed in Kathol, 1994, and Bouma, 1997, and to inflectional and derivational relationships in, for example, Krieger and Nerbonne, 1993; Riehemann, 1994). The demonstration is the
more convincing because of the complexity of the interrelationships between auxiliary constructions, and of their interface with negation, which at first sight seems to require a more powerful device than simple inheritance. This chapter is a development of the lexicalist analysis of auxiliaries given in Warner (1993a). The structures posited and much of the argumentation for them are essentially carried over. But the desire to avoid the device of lexical rules, which played a major role in Warner (1993a), has led to an entirely new analysis. The interrelationships proposed between structures are radically different since they are constrained by the need to state them within a hierarchy of unifiable information, whereas lexical rules permit what looks to the practicing grammarian like a more potent ability to manipulate relationships between feature structures.1
2. AUXILIARY CONSTRUCTIONS IN HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR

Head-driven Phrase Structure Grammar characterizes linguistic information (lexical or phrasal signs and their components) in terms of feature structures and constraints on those feature structures, where a constraint is, in effect, a partial description.2 Feature structures (or attribute value matrices) are themselves defined within a hierarchy of types. Appropriate features are defined for each type, and appropriate values for each feature. Thus the type category will be defined as having values for attributes HEAD and VALENCE. The values of HEAD correspond broadly to part of speech, and one of the subtypes involved here is verb. This is defined as having attributes AUX (with Boolean values {+, −}) and VFORM, with values corresponding to the major morphosyntactic subcategories of verbs: {fin, bse, etc.} (finite, base infinitive, etc.). In parallel fashion, VALENCE will be defined as having attributes SUBJ (subject), SPR (specifier), and COMPS (complements), which have as their values lists of synsem objects (that is, of feature structures which characterize syntactic and semantic information) corresponding to the subject, specifier, and complements of the category in question. Within this framework, I assume that auxiliaries (modals, BE, and appropriate instances of DO and HAVE) share a type verb with nonauxiliary verbs, being distinguished from nonauxiliary verbs as [+AUX] versus [−AUX], that they occur in structures which are like (1) for the reasons argued in Gazdar, Pullum, and Sag (1982) and Warner (1993a), and that they head their phrase.
English Auxiliaries without Lexical Rules
169
Then modals are subcategorized for a plain infinitive phrase; BE (when used as a copular verb) is subcategorized for a "predicative" phrase; DO is subcategorized for a plain infinitive phrase which cannot be headed by an auxiliary, and so on. Most English auxiliaries are "raising" verbs, requiring identity between their subject and the subject of their complement. This holds not only for verbal complements, but for predicative complements after BE. It is dealt with as token identity (that is, structure sharing) between two feature structures within the higher verb's VALENCE: the value of its attribute SUBJ and the value of the attribute SUBJ within the sole member of its COMPS list. This token identity is shown by tagging the feature structure's occurrence with a boxed numeral. The lexicon will then include such basic valence information as that given in (2), where PRD is a Boolean-valued feature which is positive in predicative phrases, and list values appear in angle brackets. (2)
Auxiliary category and subcategorization information in the lexicon

can, could, etc. (finite): selects a phrase headed by a plain infinitive
VALENCE: [SUBJ <[1]>, COMPS <[PRD −, SUBJ <[1]>, VFORM bse]>]

is (finite): selects a phrase headed by a noninfinitive predicative
VALENCE: [SUBJ <[1]>, COMPS <[PRD +, SUBJ <[1]>]>]

do (finite): selects a phrase headed by a nonauxiliary plain infinitive
VALENCE: [SUBJ <[1]>, COMPS <[PRD −, SUBJ <[1]>, AUX −, VFORM bse]>]
Together with some further distinctions and conventions which will be discussed immediately below, the feature structure for modal should in affirmative declaratives will include the information in (3). I omit the specification of a value for PHON, which will characterize its phonology, and for NONLOCAL, which will state the properties needed to deal with unbounded dependencies, such as the relationship between a fronted wh-word and a corresponding gap within the constituent headed by should. The empty list is designated "elist," and the complement's CONTENT value is abbreviated as a boxed integer after a colon (in accordance with the normal convention: Pollard and Sag, 1994:28).
170
Anthony Warner
The combination of this lexical information into phrases depends on schemata of immediate dominance which define local structures (themselves specified in terms of attribute value matrices in recent work), and principles of linear order. Beyond this we need to appeal to three particular principles. One, the Head Feature Principle of (4a), requires the value of a mother's HEAD feature to be identical to that of its head daughter, thus ensuring, for example, that a finite verb occurs within a finite verb phrase (VP), and that a finite VP occurs within a further finite verbal projection. The second, the Valence Principle of (4b), licenses a mismatch between mother and head daughter within the list-valued features in VALENCE, provided that the relevant synsem objects occur as sisters to the daughter: the effect is one of cancellation by combination, as illustrated in (5). Note that a hierarchy of levels is partly established by the occurrence of different values of SUBJ. At the highest level its list is empty; at lower levels it has a nonempty list. The third principle, the Semantics Principle (Pollard and Sag, 1994:48, 323), has the effect that in headed structures the mother and head daughter have the same value for CONTENT in cases where neither adjuncts nor quantification are involved. So in (5) the CONTENT of the whole clause is token identical with that of its VP, and this is token identical with that of its head; this shared token is tagged with a boxed numeral in (5). Should is treated here as a raising auxiliary, whose content is a semantic
English Auxiliaries without Lexical Rules
171
relation which takes as its argument the content of its complement, here tagged [3], as also in (3).5 The principles in (4) are cited from Miller and Sag, 1997:583. (4) a. Head Feature Principle: A head-daughter's HEAD value is identical to that of its mother. b. Valence Principle: If a phrase consists of a head daughter and one or more arguments (complement(s), subject, or specifier), then its value for the relevant VALENCE feature F (COMPS, SUBJ, or SPR) is the head daughter's F value minus the elements corresponding to the synsem(s) of the nonhead daughter(s). Otherwise, a phrase's F value is identical to that of its head daughter.
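The cancellation effect of the Valence Principle in (4b) can be sketched in toy form. The encoding below, with plain strings standing in for synsem objects, is my own simplification and not the HPSG formalism; it only illustrates the "head daughter's F value minus the realized arguments" computation:

```python
# Toy sketch of the Valence Principle (4b): the mother's value for each
# valence feature is the head daughter's list minus the synsems realized
# as nonhead daughters; unrealized requirements are passed up unchanged.

def project_mother(head_valence: dict, realized: dict) -> dict:
    """Compute the mother's valence lists by cancelling realized arguments."""
    mother = {}
    for feature, required in head_valence.items():
        remaining = list(required)
        for synsem in realized.get(feature, []):
            # every realized argument must be on the daughter's list;
            # .remove raises ValueError for an unsubcategorized sister
            remaining.remove(synsem)
        mother[feature] = remaining
    return mother

# 'should' combining with its base-form VP complement, as in (5):
# COMPS is cancelled, SUBJ is passed up to the VP node.
should = {"SUBJ": ["NP[1]"], "COMPS": ["VP[bse]"]}
vp = project_mother(should, {"COMPS": ["VP[bse]"]})
# vp: SUBJ still <NP[1]>, COMPS now the empty list
```

Combining the resulting VP with its subject then empties SUBJ as well, giving the saturated clause at the top of the tree.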
English auxiliaries differ from nonauxiliary verbs in that they occur not only in structures like (1) but also in the distinctive negated, inverted, and elliptical structures associated with (6b, c, d).
172
(6) a. John should go there. He loves opera.
b. John should not go there. *He loves not opera.
c. Should John go there? *Loves John opera?
d. If John wants to go, he should, [sc. go] *If he can go, he intends, [sc. to go] (OK with intends to, but to is an auxiliary.)
I will interpret these constructions in terms of modifications of the basic VALENCE of the auxiliaries involved, as in (7b, c, d).

(7) a. John should go there.
should = [SUBJ <[1]NP>, COMPS <[VFORM bse, SUBJ <[1]>]>]
b. John should not go there.
should = [SUBJ <[1]NP>, COMPS <not, [VFORM bse, SUBJ <[1]>]>]
c. Should John go there?
should = [SUBJ elist, COMPS <[1]NP, [VFORM bse, SUBJ <[1]>]>]
d. If John wants to go there, he should. [sc. go there]
should = [SUBJ <[1]NP>, COMPS elist]

The interrelationships between these auxiliary constructions can be dealt with by lexical rules, as in Warner (1993a), where one rule maps the lexical information on should in (7a) into that on should in (7b), another into that on should in (7c), and a third into that on should in (7d). But some papers have suggested that particular lexical rules may be avoided within HPSG, and that the interrelationships they encode should be reinterpreted within an inheritance hierarchy. Thus Riehemann (1994) shows how morphologically complex words which contain different layers of structure can be appropriately related to other lexical items within an inheritance framework which represents this structuring. This approach is not suitable for the present case, where there is no internal structuring to support the valence alternations. But Kathol (1994) pointed out that lexical rule interrelationships can be integrated into an inheritance hierarchy by the use of ad hoc features (his "proto features") which encode the shared information. A minor extension of this might be to propose that in each of (7a-d) the attribute value matrix for should contained a feature PROTOCOMPS <[VFORM bse, SUBJ <[ ]>]>, whose role was to provide a common reference point permitting the local definition of the actual subcategorization of each construction, by giving that definition access to the "basic" subcategorization. On the face of it, this approach requires an unattractive proliferation of features.
But Manning and Sag (1997) have established the position of an attribute "ARG-ST" (for "argument structure"), which lists the arguments associated with a lexical head, alongside the "valence features" SUBJ, SPR, and COMPS. ARG-ST corresponds to the SUBCAT attribute of Pollard and Sag (1994). One might have expected that the adoption of the valence features (for reasons given in Pollard and Sag, 1994: Chapter 9, following Borsley, 1987) would make this further list redundant: after all, its value is (canonically at least) simply the concatenation of the valence feature lists. But Manning and Sag provide a series of arguments in favor of the retention of such a list. They note that in passives the Russian reflexive anaphor sebe may corefer not only with the surface subject but alternatively with the "demoted" active subject, instead of being restricted to coreference with the surface subject as in actives; and that in Japanese, derived causative verbs allow adverbs and quantifiers to scope between the causative predicate and the predicate of the stem, the complex internal structure posited for this allowing additional binding possibilities. They argue from such facts, and from the distribution of binding possibilities in ergative languages, that lexical heads must carry a syntactic specification of arguments which is potentially distinct from the simple append of valence features: hence they propose the attribute ARG-ST, which supplies the information necessary to distinguish the properties of passives and causatives, and that required for the statement of binding relationships. Given the independent need for this attribute, it is clearly possible to use its values as the "reference" feature which provides the common information base from which the values of the other attributes SUBJ, SPR, and COMPS are derived by inheritance. This line of thinking has been developed in Bouma (1997) and Bouma, Malouf, and Sag (1997), who deal with the mapping relationship between an item's argument structure and its valence features by means of constraints rather than by using lexical rules, so that extraction and adjunction involve permitted mismatches in this mapping. Miller and Sag (1997) similarly use constraints to state the valence alternations which underlie their account of the distribution of French clitics.
So I shall discuss relevant aspects of an inheritance hierarchy for lexical information in which ARG-ST plays this reference role.6 The cross-cutting valence possibilities of auxiliaries will be defined in two types: aux lex and finite aux lex. Aux lex is [+AUX]. It has the subtypes finite aux lex and nonfinite aux lex. Finite aux lex is specified [VFORM fin] and it has complex subtypes which are divided into two "partitions," one for auxiliaries in inverted structures, and one for negation. These partitions form dimensions of choice which any member of the superordinate type is required to exercise. (On partitions see Pollard and Sag, 1987, and the discussion of Carpenter, 1992.) Thus any item of type finite aux lex must also belong to one of the subtypes in each of the partitions NEGATION and INVERSION, and setting up the hierarchy as in (8) automatically produces a series of subordinate types for each finite aux lex (i.e., negated and inverted, not negated and inverted, negated and not inverted, and not negated and not inverted). These will further unify with information stated elsewhere in the hierarchy which defines either ellipsis or its absence to characterize a full range of constructions.
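The way the partitions multiply out can be illustrated with a small sketch. The list encoding below is mine; in the grammar itself the combinations arise by type unification within the hierarchy in (8), not by enumeration:

```python
from itertools import product

# Sketch of the partition idea: every item of type finite aux lex must
# choose one subtype from each partition, so the cross-product of the
# NEGATION and INVERSION partitions (plus the ellipsis choice stated
# elsewhere in the hierarchy) enumerates the fully resolved types.

NEGATION = ["negated", "not-negated"]
INVERSION = ["inverted", "not-inverted"]
ELLIPSIS = ["elliptical", "not-elliptical"]

finite_aux_subtypes = list(product(NEGATION, INVERSION, ELLIPSIS))
# 2 x 2 x 2 = 8 fully resolved construction types, covering (7a-d) and (9a-d)
```

No rule has to state the eight combinations individually; setting up the hierarchy as in (8) produces them automatically.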
(8) Part of the lexical inheritance hierarchy for auxiliaries.
Thus we can account for not just the valences given in (7), but for the set of combinations whose other members are illustrated in (9).

(9) a. Should John not go there? (inverted + negated + not elliptical)
should = [SUBJ elist, COMPS <[1]NP, not, [VFORM bse, SUBJ <[1]>]>]
b. Well, should he? [sc. go there] (inverted + not negated + elliptical)
should = [SUBJ elist, COMPS <[1]NP>]
c. Even though John wants to go there, he should not. [sc. go there] (not inverted + negated + elliptical)
should = [SUBJ <[1]NP>, COMPS <not>]
d. Should he not? [sc. go there] (inverted + negated + elliptical)
should = [SUBJ elist, COMPS <[1]NP, not>]

We must add to this the possibility of the different realizations of auxiliary negation as not or the contracted -n't, and an account of the different scope relations between the negative and the head auxiliary. These are dealt with in further partitions within negated. Notice that this approach differs from an account in terms of lexical rules in that there is no notion that one type is derived from another: there is merely differential sharing of information. While one might feel happy enough within a lexical rule framework to see as basic sentences which are not elliptical, not negated, and not inverted, so that (7a) would be basic and the negated or inverted or elliptical variants (7b, c, d) would be derived, it is not clear, when we consider the derivation of the more complex combinations in (9), that this difference of status or the notion of derivational ordering imposed is anything more than an artificial construct of analysis, even where the ordering is determined by the formulation of the rules as in Warner (1993a). In the information-based approach, however, there is no sense in which the type of (9c) (not inverted + negated + elliptical) is "derived from" the type of (7d) (not inverted + not negated + elliptical) or, alternatively, from the type of (7b) (not inverted + negated + not elliptical): rather each of the types negated, inverted, not negated, not inverted, etc.
represents some definable information which is unified into the more complex types.7 This is a conceptual difference, and an advantage, of dispensing with lexical rules. A standard set of features will be taken for granted here, with the minor change that the type auxiliary, which is a subtype of verb, substantive, and head, has a subtype finite auxiliary with the Boolean attribute INV (where +INV characterizes inverted clauses). So this attribute will not appear on nonfinite auxiliaries or on nonauxiliary verbs. Possibly unfamiliar features used in the discussion of negation will be explained at the appropriate point. I will also make use of defaults following Sag (1997), employing the approach of Lascarides et al. (1996) in which default values are identified not by some general principle but by specifying the individual values which may be replaced under unification with a distinct nondefault value. This avoids the problems of indeterminacy of definition for earlier approaches discussed in Carpenter (1993), Copestake (1993), and elsewhere. What is involved is, however, a strictly limited departure from monotonicity, and in most cases I have provided an alternative monotonic formulation.
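The valence patterns in (7) and (9) fall out from three binary choices. The function below is a hypothetical constraint-style sketch of that regularity, not Warner's formalization; the string labels for the subject and complement synsems are my own shorthand:

```python
# Sketch: given the "basic" arguments of a finite auxiliary like 'should',
# compute the surface SUBJ and COMPS lists for each choice from the
# inversion, negation, and ellipsis partitions, reproducing (7) and (9).

SUBJ_NP = "NP[1]"
VP_COMP = "VP[bse, SUBJ <[1]>]"

def aux_valence(inverted: bool, negated: bool, elliptical: bool) -> dict:
    comps = []
    if inverted:
        comps.append(SUBJ_NP)   # inverted: subject realized as first complement
    if negated:
        comps.append("not")     # sentential 'not' as sister of the auxiliary
    if not elliptical:
        comps.append(VP_COMP)   # elliptical variants lack the VP complement
    return {"SUBJ": [] if inverted else [SUBJ_NP], "COMPS": comps}
```

For example, the choice inverted + negated + not elliptical yields the COMPS list of (9a), subject first, then not, then the base VP, while not inverted + negated + elliptical yields the bare <not> of (9c).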
3. NEGATION

3.1. The Distribution of Not

The distribution of not is unique, and it has two components.8 One is its occurrence with finite auxiliaries: it follows the auxiliary (or its subject when inverted). This use of not corresponds to Klima's (1964) "sentential negation," and it will be introduced as an element on the ARG-ST (and COMPS) lists of auxiliaries. It will therefore be a sister of the finite auxiliary, not forming a constituent with either the auxiliary's complement or with the auxiliary, but having the structure of (10) in sentences like (6b). (Here and below I abbreviate feature structures in a familiar manner, using VP for the relevant nonsaturated phrase which is [HEAD verb], and using fin, bse, etc., for VFORM fin, VFORM bse, etc.)
I will argue that negation in the structure of (10) may have either wide scope (including the semantics of the auxiliary) or narrow scope (excluding it), depending on the particular auxiliary involved: for example, with should it has narrow scope, with could it has wide scope, and with may it has either, depending on the meaning of may. (See Quirk et al., 1985: §§10.67f. for a survey.)

(11) a. Paul could not have worked as hard, could he? No, he could not. —Wide scope of negation: not (possible)
b. Paul should not have been drinking, should he? No, he should not. —Narrow scope of negation: obligation (not)
c. Paul may not drink alcohol, and neither may his younger sister. Indeed he may not. —Neutral intonation. Wide scope of negation. 'Paul is not permitted to drink alcohol.'
d. Paul may not be at home, and neither may his younger sister. Truly, he may not. —Neutral intonation. Narrow scope of negation. 'It is possible that Paul is not at home.'

The other use of not involves syntactic "constituent negation"; this may precede a wide range of phrases, and can be introduced as their initial modifier, forming a constituent with them. This will include occurrences of not in [vp not VP], where VP is nonfinite. The fact that some instances of not VP may be coordinated and may be subject to ellipsis, as in the examples of (12a, b), implies that they are constituents; this analysis is also appropriate in other cases, such as the double negation of (12c). In (12) constituent negation is in italics; other instances of not have the structural type of (10).

(12) a. May we either [not go] or leave early?
b. Please may we [not go]? —You [may] [not]! [sc. not go]
c. I [will] [not] [permit you to escape without blame]! You [may] [not] [not own up].
d. Paul may have been [not drinking].
Since Paul may not drink could in principle be analyzed with either structure, it would seem reasonable a priori to suggest that the interpretation which has narrow scope of negation should correspond to syntactic constituent negation, and that the interpretation which has wide scope of negation should correspond to the structure of (10). Gazdar, Pullum, and Sag (1982:604f.) do indeed make essentially this proposal, which results in the following analyses for (11c, d). (I postpone for the moment the interpretation 'Paul is permitted to not drink' which has a distinctive intonation pattern.) (13) a. Paul [[may] [not] [drink]]. Structure of (10). Wide scope of negation, deontic. 'Paul is not permitted to drink.'
b. Paul [may [not be at home]]. Syntactic constituent negation. Narrow scope of negation, epistemic. 'It is possible that Paul is not at home.'

But Klima's (1964) sentential negation is defined by a range of formal tests, including the possibility of reversed polarity tags, and and neither tags as illustrated in (11), and these tests hold whether the scope of negation is wide, including the auxiliary as in (11a, c), or narrow, excluding it as in (11b, d).9 This is consistent with the adoption of a single structure for sentential negation with finite auxiliaries, that of (10), even if the corresponding semantic scope of not may include or exclude the auxiliary. This decoupling of syntactic and semantic structure is supported by the fact that forms with contracted (inflectional) -n't also correspond to sentential negation and may show wide or narrow scope of negation, as illustrated in (14), and by the existence of other instances where ambiguities of scope do not correspond to a structural ambiguity, as in (15). The facts of ellipsis also support this interpretation: finite auxiliary + not readily allows ellipsis, whether the scope of negation is wide or narrow [cf. (11a, b)], but nonfinite + not is impossible or very restricted.10 This follows, since the structure of (10) is restricted to finite auxiliary heads, whereas not after a nonfinite auxiliary always forms a constituent with the auxiliary's complement, so that the entire complement including not is absent in ellipsis. If, however, narrow scope not were instead only generated in [vp not VP] even after finites, as by Gazdar, Pullum, and Sag (1982), then the fact that not survives in ellipsis after finites would imply that [not e] is a possible structure, as illustrated in (16). We would then most naturally predict the occurrence of [not e] after auxiliaries in general, including nonfinites. But this is incorrect; see (17).

(14) a. Paul shouldn't have been drinking, should he? No, he shouldn't.
—Narrow scope of negation: obligation (not).
b. Paul couldn't have worked so hard, could he? No, he couldn't. —Wide scope of negation: not (possible)
(15) a. He may have talked to nobody. —Narrow scope of negation: possible (not)
b. You may talk to nobody. —Wide scope of negation: not (permitted) as well as narrow (permitted not, possible that not).
(16)
Paul [may [vp not e]]. —Narrow scope of negation: 'It is possible that Paul does not [drink]'
(17) a. Paul has not been; The book may not be; Mary might not have. b. ?*Paul has been not; ?*The book may be not; ?*Mary might have not.

A further minor argument for this decoupling of syntactic and semantic structure hangs on inversion. It will be argued below that inverted structures are "flat," so
Anthony Warner
that the subject and complement of the auxiliary are sisters. But then in the type of (18) (which, though formal, is perfectly grammatical), narrow scope not does not form a constituent with the item (or items) over which it has scope. (18)
Which foodstuffs [[should] [not] [my ageing father] [be allowed to eat]]? —Narrow scope of negation: obligation (not)
So not after finite auxiliaries in the structure of (10) may have either wide or narrow scope. But sentences which carry a tone movement on the finite auxiliary and have an intonational break before not VP seem reasonably characterized as showing syntactic constituent negation with narrow scope, so that syntactic constituent negation apparently occurs after at least some finite auxiliaries. These sentences are distinct from the sentences with neutral intonation and narrow scope of negation given above (11a, d).11 The contrast is that between (19a) with neutral intonation and (19b, c) with special intonation. In (19a) ellipsis of the complement must strand not, and, in general, the not of Klima's sentential negation may not be removed in ellipsis; there is no contrast between wide and narrow scope not in this respect. This is entirely consistent with the analysis of not as sister to the finite auxiliary, if we suppose that neither ellipsis nor Complement Extraction may affect nonphrasal constituents. In contrast, in (19b, c) with special intonation, ellipsis of the complement does also affect not, which is not readily stranded.12 This is consistent with an analysis in which not forms a constituent with the following VP. But if narrow scope not ordinarily formed a constituent with the following VP, the prediction should be that it might suffer ellipsis along with that VP, and this would reverse the judgments of (19a).

(19) a. They [must] [not] [arrive late]! —Neutral intonation, narrow scope of negation —Indeed they must not. —*Indeed they must. b. You [could (always) [not go to the party]], couldn't you? —Special intonation, narrow scope of negation —Yes, I suppose I could, [sc. not go to the party] c. You [may [not join us for lunch]]. —Special intonation, narrow scope of negation 'you are permitted not to join us for lunch' —?I repeat: you may not. —I repeat: you may. [sc. not join us for lunch] d. You [may] [not] [join us for lunch]. 
—Neutral intonation, wide scope of negation 'you are not permitted to join us for lunch' —I repeat: you may not. —*I repeat: you may. A further possibility is that not forms a constituent with the auxiliary. This is suggested by the existence of forms in -n't, and by the fact that not is a modifier
elsewhere. But not may be separated from the verb by a variety of intermediate constituents, in particular by subjects and by conjuncts and disjuncts, which seem unlikely to intervene in structures of modification. The positive tags in (20b, d) confirm that these are examples of sentential negation.

(20) a. Why should John and Paul not ask me for a favor? b. They should, though, I suggest, not expect me to grant it immediately, should they? c. Why could John and Paul not have acted more positively? d. They could, frankly, however, not have anticipated the problems that would arise, could they?

There are excellent reasons, then, for supposing that the distribution of not is as follows:

1. It occurs, as in (10), as the sister of a finite auxiliary head. This structure corresponds to wide or narrow scope of negation depending on the auxiliary involved and its meaning. The structure is not found with a nonfinite auxiliary head.

2. It forms a constituent with a following phrase. This is found after a nonfinite auxiliary head. It may also occur after a finite auxiliary head when there is "special intonation" to mark the constituency. It corresponds to narrow scope of negation, and gives English the opportunity for instances of double negation like (11a).

3.2. Accounting for Not

We might suppose that the lexicon will supply information of the kind shown in (21) for auxiliaries in basic positive sentences. Since the semantics of the complement VP includes that of its subject (a fuller specification of VP in ARG-ST would include SUBJ<[1]>), this partial feature matrix for should appropriately assigns should a single sentential argument.13 (21)
A typical modal: should in John should leave
It seems reasonable to treat the not of sentential negation as a member of the ARG-ST and COMPS lists, that is, essentially as a complement of auxiliaries, as
in Warner (1993a); for further detailed arguments for the plausibility of this position see Kim (1995), Kim and Sag (1996a).14 The question then arises of how to treat the scope relations of auxiliaries and not. It is clearly desirable to account for these in a compositional fashion which will make it possible to unify information about the auxiliaries and not within a lexical hierarchy. An interesting general framework for the treatment of scope in HPSG has been developed in the Minimal Recursion Semantics outlined in Copestake, Flickinger, and Sag (1997) and applied in Bouma, Malouf, and Sag (1997), and it is this framework which I shall use to give a compositional account of the interaction of negation with auxiliaries, revising the feature structure of content assumed so far. In Minimal Recursion Semantics, the individual relations which are to be combined within the semantics of a phrase or clause are placed as members of a special kind of list, and the interrelationships between them are mediated by "handles," which identify relations or the argument features of relations. The semantic organization is not represented in terms of bracketed structuring, but in terms of constraints on these handles, where possible constraints involve the coindexation of handles or a statement of their scope relationship. A consequence of this approach is that relevant interrelationships can be underspecified, so that there can be a single semantic representation for a sentence such as Every child read some book, which subsumes both the reading in which every has scope over some, and the reading in which some has scope over every.15 On this approach CONTENT is specified for attributes which include LISZT, KEY, and CONDS. 
The value of LISZT is the relevant list of relations; that of KEY is a particular designated relation within LISZT: normally (and everywhere in this chapter) it is a relation which a phrase shares with its head; and the value of CONDS is a set of restrictions on handles (h1, h2, etc.) (Bouma, Malouf, and Sag, 1997:8-9). Within this framework the feature structure for could in John could leave will include the information in (22). (22)
A typical modal: could in John could leave
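The figure in (22) is a feature matrix not reproduced here, but its essentials can be sketched in Python. The class names and the tuple encoding of scope conditions below are my own illustrative devices, not part of the HPSG formalism; the relation and handle names (psbl_rel, h1, h2, h3) follow the text.

```python
from dataclasses import dataclass, field

@dataclass
class Rel:
    """An elementary relation on a LISZT, identified by a handle."""
    name: str          # e.g. "psbl_rel", "leave_rel"
    handle: str        # the relation's own handle, e.g. "h2"
    arg: str = ""      # handle of its propositional argument, if any

@dataclass
class Content:
    liszt: list                              # list of Rel
    key: Rel                                 # designated relation, shared with the head
    conds: set = field(default_factory=set)  # pairs (hi, hj): hi outscopes or equals hj

# (22): could in "John could leave". h1 is the handle of the complement VP's
# KEY (leave). The single condition h3 > h1 lets h3 be identified with h1
# ("John could leave") or properly outscope it, as when a quantifier
# intervenes ("John could find an island").
could_rel = Rel("psbl_rel", handle="h2", arg="h3")
could = Content(liszt=[could_rel], key=could_rel, conds={("h3", "h1")})
```

The underspecification of scope in the framework is just the fact that such a CONDS set constrains, without fully ordering, the handles.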
In (22) the value of LISZT is a single-member list which specifies the semantic relation of could, though here for the sake of simplicity only the more general relation appropriate to modals of possibility (psbl_rel) is given. This relation is also the token identical value of KEY. The value of its argument is not simply given as that of the CONTENT of the complement of could, because of the possibility that some other element (a modifier or quantifier) may scope between could and its subordinate verb, as in the reading of John could find an island, which corresponds to the scope assignments: (poss (one (find))). Instead, the value of the argument is a handle, h3. The other relevant handle is that of the relation which gives the semantics of the head verb of the complement of could (i.e., the VP complement's CONTENT | KEY), which I have identified as [h1] following the colon after VP (I shall henceforth consistently use this convention for the CONTENT | KEY | HANDEL value of a phrase). These handles are constrained by the condition CONDS {h3 > h1}, which requires h3 to be identified with h1, as in John could leave, or to "outscope" it as in the example considered above.16

In adjunct structures which consist of a modifier and a syntactic head, such as truly exciting or leave quickly, the head is selected by the modifier by means of an attribute MOD on the modifier (Pollard and Sag, 1994:55ff.). In constituent negation, as in the not go of you can't not go, not occurs as a modifier within an adjunct structure. In the feature structure for not, which is partially specified in (23) where all syntactic information is omitted, MOD | CONTENT | KEY | HANDEL defines the handle value of the KEY of the syntactic head of such an adjunct structure, and a condition in CONDS states that the argument of not either has a handle identical to that of the modified element or one that outscopes it: CONDS {h6>h4}.
(23)
Not
Now consider adding not as modifier of the auxiliary head to the ARG-ST list of that head, as in (24), which modifies the feature structure for could in (22) in just this respect. Here, not [MOD | CONTENT | KEY | HANDEL h2] (abbreviated not [MOD | KEY h2]) identifies the HANDEL value within KEY in the modified
category, could. This not is simply the normal modifier in an abnormal position, being distinct in that it does not make a constituent with the phrase it modifies. When not is added to the ARG-ST list, it will also appear on the COMPS list: this follows from the Argument Realization constraint (discussed below) which states that ARG-ST is the append of the valence lists. At the phrasal level, the Semantics Principle of Copestake, Flickinger, and Sag (1997) will specify values on the mother as follows: the value of LISZT will be the append of the LISZT values for could and its sisters, the value of CONDS will be the union of the daughters' values, and the value of KEY will be identical with that of the head daughter, as in (25). For convenience the indices of h correspond in these and subsequent examples, except that the h4 of not's feature structure in (23) is replaced by the relevant token identical handle (h2 in (25)). (24) Wide scope of negation: could in John could not leave
(25) Wide scope at phrase level: could not leave
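The phrase-level values assigned by the Semantics Principle (LISZT values appended, CONDS values unioned, KEY inherited from the head daughter) can be sketched as follows. The function and the dictionary/tuple notation are my own illustrative devices; the handles follow the text, with not's own handle taken to be h5 and its argument handle h6.

```python
def semantics_principle(head, *sisters):
    """Mother's LISZT is the append of the daughters' LISZTs, its CONDS the
    union of their CONDS, and its KEY that of the head daughter (after
    Copestake, Flickinger & Sag 1997, as summarized in the text)."""
    liszt = list(head["liszt"])
    conds = set(head["conds"])
    for d in sisters:
        liszt += d["liszt"]
        conds |= d["conds"]
    return {"liszt": liszt, "conds": conds, "key": head["key"]}

# (24)/(25): wide scope "could not leave". not modifies the auxiliary head,
# so its MOD | KEY | HANDEL is could's handle h2, giving the condition h6 > h2.
could = {"liszt": [("psbl_rel", "h2")],  "conds": {("h3", "h1")}, "key": ("psbl_rel", "h2")}
nt    = {"liszt": [("not_rel", "h5")],   "conds": {("h6", "h2")}, "key": ("not_rel", "h5")}
leave = {"liszt": [("leave_rel", "h1")], "conds": set(),          "key": ("leave_rel", "h1")}

phrase = semantics_principle(could, nt, leave)
# Unified CONDS {h6 > h2, h3 > h1}: the scope order not > could > leave.
```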
The order of scopes imposed at phrase level by the unified set of conditions in CONDS is not > could > leave. So there is a straightforward integration of the
semantics of not, with appropriate results, where not is specified as the modifier of the auxiliary head. Now let us turn to narrow scope. Here we add not as modifier of the nonsaturated complement to the ARG-ST list of the auxiliary, as in (26), hence also to the COMPS list. At the phrasal level the Semantics Principle will (as before) integrate the values of LISZT and CONDS appropriately, as in (27). Indices again correspond for convenience, except that the MOD | CONTENT | KEY | HANDEL of not is h1, and the condition it supplies therefore h6 > h1. (26) Narrow scope of negation: should in John should not leave
(27) Narrow scope at phrase level: should not leave
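In the same toy notation (mine, not the formalism's), the narrow-scope combination in (26)/(27) can be sketched as follows; here not's MOD handle is leave's h1, and should carries the further condition h3 > h5 discussed below, h5 being taken as not_rel's own handle.

```python
# Narrow scope "should not leave". not modifies the nonsaturated complement,
# so its condition is h6 > h1; should adds the extra condition h3 > h5 so
# that the modal outscopes negation.
should = {"liszt": [("oblig_rel", "h2")],
          "conds": {("h3", "h1"), ("h3", "h5")},
          "key": ("oblig_rel", "h2")}
nt     = {"liszt": [("not_rel", "h5")],
          "conds": {("h6", "h1")},
          "key": ("not_rel", "h5")}
leave  = {"liszt": [("leave_rel", "h1")],
          "conds": set(),
          "key": ("leave_rel", "h1")}

# Semantics Principle, as before: append LISZTs, union CONDS, inherit KEY.
phrase = {"liszt": should["liszt"] + nt["liszt"] + leave["liszt"],
          "conds": should["conds"] | nt["conds"] | leave["conds"],
          "key": should["key"]}
# Scope order should > not > leave, with room for a quantifier to scope
# between the modal and negation (as in "You must not eat one cake").
```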
Here, though, the feature structure for should parallel to (22) has been modified not just by the addition of not to its ARG-ST list, but by a further condition. The order of scopes imposed by the set of conditions derived from should in positive sentences and from not as modifier of the nonsaturated complement would be should > leave (h3>h1) and not > leave (h6>h1). In order to impose the required should > not, the further constraint CONDS {h3>h5} has been added, stated on
should. So, here, too, there is a straightforward integration of the semantics of not, with appropriate results, provided that not is specified as the modifier of the auxiliary's nonsaturated complement, and a further scope condition is added. Note that the scope conditions here permit a quantifier to intervene between the modal and negation, as in the second of the three readings of You must not eat one cake: (must (not (one))) 'You must eat none'; (must (one (not))) 'There must be one that you don't eat'; (one (must (not))) 'There is (a particular) one that you must not eat.' See Copestake, Flickinger, and Sag (1997) for details of the assignment of quantifier scope within the HPSG implementation of Minimal Recursion Semantics.

3.3. Integration into the Lexical Hierarchy

How will the information in these lexical entries be organized in the lexical hierarchy? Auxiliaries are for the most part raising predicates, and they will inherit information from this general constraint. This includes predicative BE, which has a nonsaturated complement whose SUBJ value is token identical to that of BE. Davis (1996) discusses semantic regularities across the valences of different lexical items, seen as constraints on the relationship between values of CONTENT and those of the ARG-ST list (see also Wechsler, 1995). In the light of Davis's discussion it seems reasonable to suggest that a "linking type" which encodes the relevant constraints on this relationship for raising to subject categories will be roughly as follows: (28)
Linking type raising to subject
where the whole synsem is +AUX if the second member of ARG-ST is word. Here the relational predicate rel has an attribute ARG whose propositional argument's handle is identical to or outscopes that of the nonsaturated complement's KEY. The possibility of a not on the list in second position is provided for by the optional member word, with the constraint that this is only present if the whole synsem is +AUX.17

Now for the specification of wide scope negation all that is needed is the unification of this type word on the ARG-ST list with not, where not modifies the head auxiliary, i.e., unification with the following information, which places not in second position on an auxiliary's ARG-ST list, and identifies the auxiliary's KEY | HANDEL value with that of not's MOD feature. The rest of
the necessary information, including the condition specifying scope, is part of the lexical entry for not and need not be specified here. (29)
Wide scope auxiliary negation
This will also undergo unification with the types specified by Davis as underlying transitive verbs. These place the subject in initial position on the ARG-ST list, but allow the direct object to occur at some later point on that list, thus accounting for the typical intervention of the indirect object when both are NP. As in the raising to subject type, not will supply scope information, resulting in wide scope negation. So unification with a single statement for wide scope negation with not will account for all such negation in auxiliaries, including transitive auxiliaries (possessive HAVE and identificational BE) as well as raising auxiliaries. Narrow scope negation is only found with raising auxiliaries, not with transitives. I accounted for it above by placing not on the ARG-ST list as modifier of the nonsaturated complement, and adding to CONDS the condition h3>h5, which requires the modal to outscope negation but leaves open the possibility of a quantifier scoping between them. The relevant information to be unified with the linking type constraint for raising to subject is as follows. The other information required belongs to the lexical entry for not. (30)
Narrow scope auxiliary negation
Thus far we have a notably simple analysis, which makes full use of the lexical entry for not, and depends on general principles of combination by unification. We are in a position to set up a type negated, with a partition of subtypes: wide scope and narrow scope. Negated would establish the basic constraint (that not is added to the ARG-ST list of a finite auxiliary); unification with wide scope or
narrow scope would add information about what not modifies. Individual auxiliaries in particular senses would inherit either from wide scope or narrow scope. Both of these types would have a unification with raising to subject; the first also with transitive. Negated would also be a subtype of finite aux lex which restricts the partition to finite auxiliaries. But there are some considerations which imply a somewhat more complex hierarchy to permit an appropriate treatment of scope, and this will imply a modification of our treatment of not.

3.4. Other Considerations

3.4.1. SCOPE RESTRICTIONS ON INDIVIDUAL AUXILIARIES

Individual auxiliaries are restricted in their occurrence with wide and narrow scope negation. In general, each individual lexeme retains the same scope of negation whether it corresponds to epistemic, dynamic, or deontic modality (here using the distinctions of Palmer, 1979), though may and might are obvious exceptions.18 I shall therefore suppose that this is fundamentally an area of lexical idiosyncrasy, in which only partial generalizations are to be expected. The identification of appropriate scopes is not, however, straightforward, being dependent on the interpretation assigned to the auxiliary in question, itself dependent on the type of account being offered.19 I shall adopt an interpretation based essentially on writings on English grammar, principally Quirk et al. (1985) and Palmer (1979, 1988). I shall also suppose that the auxiliaries do, perfect have, and predicative be occur with wide-scope negation; predicative be includes so-called "progressive" and "passive" be, in which I argue that it is the complement of be that has "progressive" and "passive" properties, not be itself (Warner, 1993a). The major uses of the modals themselves can reasonably be subdivided as follows (see Quirk et al., 1985: §§10.67ff.):

1. wide scope of negation: can, could, may (deontic), need, dare, will, would.

2. 
narrow scope of negation: may (epistemic), might (epistemic), must, ought, shall, should.

Will and would are particularly difficult to interpret. Quirk et al. say that the distinction between wide and narrow scope is neutralized; I shall in the first instance interpret negation with will and would as having wide scope, taking epistemic uses such as there won't be a problem as (not (future)) rather than (predictable (not)). Similarly "subject-oriented" instances such as She won't behave, I will not surrender will be interpreted as "not willing to" rather than "intend not to," "be disposed not to" (cf. Perkins, 1983:47ff. for a different view). But in note 23 I allow for an account which assigns epistemic will and would either scope, and an account which assigns epistemic will and would narrow scope can readily be constructed. So little hangs on the assignment of particular scopes to these words.
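The subdivision just given can be restated as a small lookup table; the dictionary below is merely my own restatement of the two lists above (with the epistemic and deontic uses of may and might kept apart, as in the text), not part of the analysis itself.

```python
# Scope of sentential negation per modal, following the subdivision in the
# text (based on Quirk et al. 1985 and Palmer 1979, 1988).
NEG_SCOPE = {
    # 1. wide scope of negation: not > modal
    "can": "wide", "could": "wide", "may (deontic)": "wide", "need": "wide",
    "dare": "wide", "will": "wide", "would": "wide",
    # 2. narrow scope of negation: modal > not
    "may (epistemic)": "narrow", "might (epistemic)": "narrow",
    "must": "narrow", "ought": "narrow", "shall": "narrow", "should": "narrow",
}

assert NEG_SCOPE["should"] == "narrow"   # shouldn't = obligation(not)
assert NEG_SCOPE["could"] == "wide"      # couldn't  = not(possible)
```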
The best overall account will be the one which most appropriately reduces the amount of lexical idiosyncrasy. From (2) above it is clear that narrow scope negation holds broadly of modals of necessity and obligation, and one might add two complementary constraints to the lexical hierarchy, representing a partial generalization across auxiliaries. The first ties together narrow scope and necessity/obligation, supposing that a more abstract relation oblig_rel will underlie the modals of necessity and obligation (need, must, ought, shall, should). The second associates wide scope with other relations. This substantially reduces the amount of lexical idiosyncrasy.20 The exceptions are epistemic may and might, which have narrow scope negation, and need, which has wide scope negation (and perhaps also epistemic will and would if these may have narrow scope negation). But given the amount of apparent lexical variation in this area, any account must expect a small number of exceptions. I shall suppose that modals inherit from a set of relations which specifies them as epistemic or root (dynamic or deontic), so that epistemic may and might can be specified by reference to a relation may-epistemic_rel. Remember that what I have called syntactic constituent negation after a modal (as in (19b, c)) does not belong here. It has narrow scope because of the presence of not within a complement. This is not a property of the modal, but is a separate issue, which will be predicted from the syntactic combination of the modal with a constituent which happens to be negated.

3.4.2. INFLECTIONAL FORMS IN -N'T

A second set of facts which needs to be integrated concerns the forms in -n't. These are open to analysis as a series of negative forms (as traditionally), or, more recently, as carrying a negative inflection (Zwicky and Pullum, 1983), since only a proportion of -n't forms is phonologically predictable as the addition of a "cliticized" -n't to the positive. 
They show both wide and narrow scope of negation. (31) illustrates a typical wide-scope negation. The information associated with -n't is the relation not_rel in LISZT, and the condition CONDS {h6>h2}.

(31) Wide scope: couldn't
(32) illustrates a typical narrow scope negation. Here the information associated with -n't is again the relation not_rel in LISZT, and the condition CONDS {h6>h1}, if we suppose that {h3>h5} is rather a property of should. (32)
Narrow scope: shouldn't
These forms also imply that scope facts should be treated separately from the realization of negation, because shouldn't is a word whose semantics includes negation, whereas the should in should not does not include negation. One combination is within the word, the other is syntactic.

3.4.3. GENERAL NATURE OF LEXICALLY RESTRICTED SCOPE RESTRICTIONS

Finally, it seems that a preference for wide or narrow scope is characteristic of each individual auxiliary (or sense of an auxiliary) whether negation is realized with not or with inflected -n't or with some other negative (see Palmer, 1979:26). So there is a relationship between the individual auxiliary and negation, not one set of distinct relationships for not and another for -n't. This also implies that scope should not be tied closely to the realization of negation. But here I will deal only with not and -n't, leaving for future work the wider implications of the scope facts with never and other negatives.

(33) a. Paul shouldn't /should not /should never eat anything uncooked. Paul should eat nothing uncooked. (narrow scope negation) b. Paul couldn't /could not /could never eat anything uncooked. Paul could eat nothing uncooked. (wide scope negation)

3.5. The Inheritance Hierarchy

The considerations of section 3.4 imply that it makes sense to distinguish constraints determining scope from constraints which establish the valence alternations of auxiliaries. So I will analyze the facts of negation by setting up two partitions within the lexical hierarchy in a type finite aux lex from which all finite auxiliaries will inherit:
(i) NEG FORM, a partition of negative types which are basically (but not solely) concerned with valence facts. I have included semantic information about -n't in this hierarchy, though it might better be placed in a separate part of the hierarchy concerned with inflected forms.

(ii) NEG SCOPE, a partition with subtypes wide neg scope and narrow neg scope. All finite auxiliaries will also inherit from this, and I therefore place it alongside NEG FORM, recognizing that this location also may need revision in a more comprehensive treatment of scope.

The separation of scope facts from the realization of negation means that it is no longer appropriate to treat not as a modifier. In effect, the scope constraints are being treated as a property of the auxiliary head of the construction. Where not is concerned, this can be stated as the selection by the auxiliary head of a particular value of MOD | CONTENT | KEY | HANDEL within not. But as soon as scope is treated more generally, this becomes redundant. It can be stated, but at the cost of some further complexity. So I shall suppose that it is no longer necessary, and that the lexical entry for not should be revised by adding the possibility that it may be [MOD none, CONDS eset]. This attributes to not a syntactic property which is typical of members of the ARG-ST list, and this underscores its status as a complement of the verb.

Now in order to make the separation of scope and valence effective, so that there is a unitary statement of scope conditions, I need to set up a list-valued feature NEG, defining it as one of the attributes of content. Its possible values will be elist and a singleton list containing the not_rel relation, which will be identical with one on a relevant LISZT. Its default value is elist. It is parallel to KEY, in that it identifies a specific relation within LISZT. 
The point of this feature is that, like KEY, it permits reference to feature values within a specific relation, and it is necessary to do this in order to provide a unitary statement of the common scope properties of auxiliaries which have negation in different locations: within the head when inflected with -n't, or within the head's sister not. The partition for negation can then be set up as in (34). In NEGATION, the type negated contains CONTENT | NEG <[not_rel]>, which is common both to auxiliaries which combine with not and to their -n't forms. Not negated specifies that the second member of its ARG-ST list is a phrase: this prevents unification with the partition of raising to subject which has word in this position. In NEG FORM not arg places not in the second position of the ARG-ST list and identifies its CONTENT | KEY not_rel relation as the value of the CONTENT | NEG of the auxiliary head. The type -n't form states the basic semantics of negation by placing not_rel on the auxiliary's LISZT value, where this not_rel is token identical to the value in the NEG list. Here I have assumed that the value of KEY is that of the first member of the LISZT list, and that a high-level default states that it is the only member; this default is superseded by the constraint in -n't form.21 An appropriate statement about the morphology of the form will also be needed.
(34)
Part of the inheritance hierarchy for finite auxiliaries
(35)
The types within the partition NEGATION
(36)
The types within the partition NEG FORM 22
Finally in (37) I give the constraints on scope. NEG SCOPE is the partition wide neg scope and narrow neg scope. In these, two relations within the finite auxiliary's CONTENT are isolated: CONTENT | KEY is the relation which corresponds to the auxiliary's meaning, and CONTENT | NEG contains not_rel. Then in (a) wide scope conditions are established for relations which are not oblig_rel, except for epistemic may and might; and in (b) narrow scope conditions are established for relations which are oblig_rel and for epistemic may and might. The (a) group covers all of the wide scope modals listed above: can, could, may (deontic), dare, will, would, but it omits need. The (b) group covers all the narrow scope modals listed above: may (epistemic), might (epistemic), must, ought, shall, should. It also includes need, but this actually has wide scope and needs separate statement as an exception. Narrow neg scope, which holds only for auxiliaries which are raising predicates, and which requires reference to the CONTENT | KEY handle of the nonsaturated complement for the statement of scope, refers additionally (as needed) to ARG-ST.23 (37)
The types within the partition NEG SCOPE
... and CONTENT | KEY ¬[oblig_rel] and ¬[may-epistemic_rel]
Members of type (a) are can, could, may,24 dare, will, would, epistemic uses of (can), could,25 will, would.
... and CONTENT | KEY [oblig_rel] or [may-epistemic_rel]
Members of type (b) are must, ought, shall, should (and need), and epistemic uses of may, might. Need requires a further exception statement. There is one default statement here, in the CONDS of (37b), where the backslash indicates a value which may be overridden. This permits the formulation of an exception statement for need, the only exception to the partition as formulated. Need is [oblig_rel] with wide scope negation.26 The lexical entry for need will allow for two possibilities: one with [NEG elist] will have no unification with negated; the other will have [NEG ([not_rel])] and a statement which will both affirm the wide scope condition (defining h2 as in 34a) and negate the relevant narrow scope condition, thereby taking precedence over the default in (37b): CONDS {h6>h2} and ¬CONDS {h3>h5}.27 The condition {h6>h1} need not be negated; it is consistent with wide scope.

There is one final (but major) wrinkle. In negated yes-no questions, the normal (perhaps invariable) scope of sentential negation with not/-n't is wide.28

(38) a. Should you not keep sober for once? Shouldn't you keep sober for once? ('Is it not the case that you should keep sober for once?') b. Might there not have been a problem over his drinking? ('Is it not possible that there was a problem over his drinking?') c. Won't there have been a problem over his passport? ('Is it not predictable that there was a problem over his passport?')

This looks like a different order of fact from the interaction between lexical item and scope considered here, and I will suppose that a subtype of a type yes-no-interrogative-clause will contain a constraint imposing wide scope negation on auxiliaries. This will unify straightforwardly with the constraint of (37a). 
If it also negates the specification of narrow scope negation (as in the lexical entry for need just discussed), it will have a unification with the information in (37b), taking priority over the default, and thereby assigning wide scope of negation to should, must, etc. as appropriate. The information in the different partitions of this hierarchy unifies straightforwardly to give the feature structures which permit us to characterize such categories as "epistemic modal with wide scope negation with not," and so on. To exemplify this, here in (39) are two particular results of the unification of the information in the hierarchy of (34)-(37), with raising to subject (28) and aux lex (which supplies only +AUX). The first structure, (39a), corresponds to a wide scope use of not, the second, (39b), to narrow scope -n't.
(39) a. Information resulting from the unification of the types: raising to subject, aux lex, finite aux lex, negated, not arg, wide neg scope
... and CONTENT | KEY ¬[oblig_rel] and ¬[may-epistemic_rel]
b. Information resulting from the unification of the types: raising to subject, aux lex, finite aux lex, negated, -n't form, narrow neg scope, and the information on type word given in note 21.
... and CONTENT | KEY [oblig_rel] or [may-epistemic_rel]
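The way need (and the yes-no interrogative constraint) override the defeasible narrow-scope condition of (37b) can be sketched with a toy default-unification function. The function and the string-keyed encoding of CONDS are my own illustration, not the formalism's actual default logic.

```python
def unify_with_defaults(specific: dict, defaults: dict) -> dict:
    """Defeasible constraints (the back-slashed value in (37b)) hold unless
    a more specific statement supplies a conflicting value; non-default
    information always takes precedence. Toy sketch only."""
    result = dict(defaults)
    result.update(specific)  # specific (non-default) information wins
    return result

# (37b) default for oblig_rel modals: the modal outscopes negation (h3 > h5).
narrow_defaults = {"h3>h5": True}

# should: inherits the default unchanged -- narrow scope negation.
should = unify_with_defaults({}, narrow_defaults)

# need: [oblig_rel] but an exception; its entry affirms the wide scope
# condition and negates the narrow scope one, overriding the default.
need = unify_with_defaults({"h6>h2": True, "h3>h5": False}, narrow_defaults)
```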
3.6. Conclusion

In effect I have presented two accounts of negation here. In the first, not is a modifier, and scope differences depend on the fact that what it modifies is selected by its head auxiliary. The second account makes a general statement about scope which covers -n't negatives as well as not. Under this account, the conditions which establish scope restrictions are stated directly as a property of the auxiliary head, and it is no longer necessary to treat not as a modifier. Both of these accounts of negation depend on the integration of simple properties: those of the semantics and syntax of auxiliaries, those of not and -n't, and those of statements of scope. There is some complexity in identifying the relevant scope for individual modals, but this is only to be expected when dealing with the lexical idiosyncrasies of this intricate group. The accounts are compositional in the very straightforward sense that when not is on the ARG-ST list, the semantics of not is unified into the analysis. This is a desirable treatment, which argues for the appropriateness of these analyses. The second analysis accounts not only for both wide and narrow scope negation, but also for the partially systematic distribution of auxiliaries with these different scopes, and can even accommodate the different behavior of yes-no questions. It therefore goes beyond the earlier analyses of Warner (1993a), Kim and Sag (1996a), and Kim (1995) in several respects. But what is most striking is the fact that these accounts depend almost solely on the simple addition or unification of information. Not even the restricted manipulations of lexical rules are required. They therefore provide a strong argument against the use of lexical rules in HPSG. 
Moreover, the fact that the unification of information can smoothly account for so much strengthens the position developed in Kim and Sag (1996b), which argues against the use of a more complex structure containing functional categories and employing movement to capture generalizations.
4. SUBJECT-AUXILIARY INVERSION

Inversion of subject and finite auxiliary occurs in main clause interrogatives, in tag questions, after a fronted negative with scope over the auxiliary, in and neither and and so tags, and restrictedly in conditionals and comparatives (Quirk et al., 1985: §§18.24, 15.36).

(40) a. Could you see the horizon?
b. At no point could I see the horizon.
c. I could see the horizon, and so could Harry.

Such clauses are best analyzed within the framework adopted here as having a "flat" structure in that the finite auxiliary, the subject phrase, and the complement phrase are all sisters, as in (41), though the feature content of V here is a matter for further discussion.
English Auxiliaries without Lexical Rules
An alternative which has been proposed (e.g., by Gazdar, Pullum, and Sag, 1982) takes the subject and complement to form a constituent (as in: could [you see the horizon]?). But the arguments for this structure are unsatisfactory, and it involves difficulties, so it is better rejected. Thus the possibility of coordinations of the sequence subject + complement as in (42a, b) does not necessarily imply the constituency of subject + complement in (40), given that the possibility of coordination in ditransitive structures, such as (43), implies that we need a more general account of these phenomena. The same line of thinking leads to the rejection of instances of Right Node Raising of the subject + complement sequence as an argument for constituency (suggested in Borsley, 1989), since Right Node Raising is not an infallible test for constituency (Abbott, 1976).

(42) a. Will Paul sing and Lee dance? (Gazdar, Pullum, and Sag, 1982:612 [49a])
b. Is Paul beautiful and Lee a monster? (Gazdar, Pullum, and Sag, 1982:612 [49g])

(43) John gave a record to Mary and a book to Harry.

There would be evident difficulties, too, for dealing in a natural and motivated way with the facts of subject-verb agreement and nominative case assignment if the sequence subject + complement were a constituent, since we would naturally expect the mother of this "small clause" to be [SUBJ elist], and this would imply that the subject should be oblique and that the auxiliary should lack agreement. Moreover, a further argument against this constituency can be constructed from the survival of the subject in ellipsis. Suppose that the sequence subject + complement is indeed a constituent in inverted clauses. Then the inverted auxiliary must have the category of this "small clause" on its COMPS list. But ellipsis of phrasal complements after auxiliaries is free, and the clause complement of an
auxiliary may undergo ellipsis, as in (44).29 So why should not the possibility of ellipsis be generalized to inverted instances? This would predict the occurrence of the auxiliary without subject or complement as an instance of elliptical inversion, as in (45), and this is not well formed. (44)
Would they rather Paul came on Tuesday? —Yes, they would rather.
(45) Can we go to Disneyland? *Please can? [sc. we go to Disneyland]

So it is best to analyze these inverted clauses as "flat" in structure: the finite auxiliary, the subject phrase, and the complement phrase are all sisters. The finite head auxiliary will also carry the head feature [+INV], which is justified on both syntactic and morphological grounds: syntactically by the restricted distribution of inverted clauses and morphologically by the uniqueness of aren't in aren't I?30 The most satisfactory way of generating this flat structure is to change the valence list membership of its head by placing the first member of the ARG-ST list on the COMPS list [so that the head auxiliary has its subject as the first item on its COMPS list, and is consequently SUBJ elist, as in (41)], and to use Schema 2, the HEAD-COMPLEMENT SCHEMA, which is also used to specify the structure of VPs. The case of the subject and subject-verb agreement will be specified by reference to the initial member of the auxiliary's ARG-ST list. The information which will appear in a type inverted, a subtype of finite aux lex, will be as follows:

(46) inverted
here ⊕ is append, [1] is a synsem, and [2], [3] are lists of synsem (which may be empty). This immediately raises two questions. The first concerns the use of Schema 2; the second the details of this formulation. Pollard and Sag discuss two ways in which the flat structure might be generated (1987, 1994: §§1.5, 9.6).31 One is to specify a change of valence and use Schema 2, as just suggested.32 The other is to retain the same values of SUBJ and COMPS as in a noninverted structure, and to use Schema 3, the HEAD-SUBJECT-COMPLEMENT SCHEMA, in which both the subject and the complements of the lexical head are its sisters. This schema is apparently required for other languages; see Borsley (1995) for the suggestion that universal grammar must provide for Schema 3 to cope with the facts of Syrian Arabic (alongside Schema 2, which is required for Welsh). On this account, the types inverted and not inverted will differ
only in that the former is [+INV], the latter [-INV], though it will be necessary to constrain the values of INV which occur with the structural schemata.33 Consequently, the demonstration that it is possible to account for constructions with auxiliaries within an inheritance hierarchy (which forms my more general concern in this chapter) goes through directly for this construction, since the feature content of these inverted and uninverted subtypes differs only in this particular respect. I prefer, however, to reject this account, on the ground that this would constitute the only use in English of Schema 3, and such an isolated use is unattractive unless there is a clearer rationale for its adoption.34 But the matter remains uncertain, though the adoption of Schema 2 is probably currently better motivated, and my general point can also be demonstrated with respect to this schema. Proceeding then on the assumption that Schema 2 is appropriate, the other question is essentially as follows. Given that we need to ensure that the first member of the ARG-ST list appears on the COMPS list, what is the best way of doing this? The definition of inverted in (46) identifies the initial members of the ARG-ST and COMPS lists, and leaves the specification of SUBJ as elist to the Argument Realization constraint of (47), which defines the ARG-ST list as the append of the valence lists.

(47)
Argument Realization
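The effect of the Argument Realization constraint, which defines ARG-ST as the append of the valence lists, can be sketched as follows. This is a hypothetical Python encoding for illustration only, not the chapter's typed-feature-structure formalism; the attribute names and category labels are informal stand-ins.

```python
# Hedged sketch: Argument Realization as list append, with the inverted
# type of (46) realizing the first ARG-ST member on COMPS rather than SUBJ.

def arg_st(subj, spr, comps):
    """ARG-ST is the append of the SUBJ, SPR, and COMPS lists."""
    return subj + spr + comps

# Non-inverted finite auxiliary: subject realized on SUBJ.
noninv = {"SUBJ": ["NP[nom]"], "SPR": [], "COMPS": ["VP[bse]"]}

# Inverted: SUBJ is the empty list and the subject heads the COMPS list.
inv = {"SUBJ": [], "SPR": [], "COMPS": ["NP[nom]", "VP[bse]"]}

# Both valence patterns realize the same ARG-ST <NP[nom], VP[bse]>.
for entry in (noninv, inv):
    assert arg_st(entry["SUBJ"], entry["SPR"], entry["COMPS"]) == ["NP[nom]", "VP[bse]"]
```

The point of the sketch is simply that the two valence patterns of a finite auxiliary differ in how one and the same ARG-ST list is distributed over the valence attributes.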
But why not reverse this? Why not define the type inverted by constraining the value of SUBJ (and SPR) to be elist, thus forcing the subject onto the COMPS list via the Argument Realization constraint? This apparent point of detail matters because the formulations have different consequences. If the subject is forced onto the COMPS list via the Argument Realization constraint, then it is difficult to see why it should not behave like other members of COMPS and be subject to ellipsis, so that examples like the Please can? in (45) would be predicted grammatical, or extracted, so that sentences like John must leave would be assigned an additional filler-gap structure (John [+INV must leave]) alongside the subject-predicate structure.35 Given the formulation of (46), however, the initial member of the COMPS list cannot be subject to ellipsis under the account developed below, since this would require unification with a constraint requiring its absence from the COMPS list, and (46) specifies precisely that it is present. Nor can it be subject to extraction if we have an account in which complement extraction is defined by unification with a constraint permitting an appropriate mismatch between the COMPS list
and the ARG-ST list of some lexeme (in the spirit of Bouma, Malouf, and Sag, 1997), and for the same reason: its presence on the COMPS list is specified in inverted, and there can be no unification of this with a constraint which specifies its absence from the list. Given the important consequences of this minor difference of formulation, some justification for the adoption of (46) over the alternative is clearly in order. Two considerations are relevant. The first involves the role of specifiers, which have not so far been discussed, but which figure in the statement of Argument Realization. It seems reasonable to suggest that it would in fact be necessary to specify inverted as [+INV, SUBJ elist, SPR elist] to ensure that the subject was forced onto the COMPS list.36 This is not as intuitively appropriate as (46), and it is not more economical, since both constrain the values of three attributes. So at least the alternative carries no advantage. The rationale underlying this position is that it seems likely that auxiliaries may have specifiers, and that they are not restricted to being lexical items. Candidates for such status are floated quantifiers, which could reasonably be analyzed (at least in part) as specifiers of verbs and predicative items. The distribution of all, both, and nearly all (for example) is to a large extent consistent with this, since these readily precede overt VP, as illustrated in (48), but do not precede an ellipsis site, or follow VP. If they are treated as specifiers of V, this gives a better motivated account of these positional restrictions than does analyzing them as adverbs (as in Kim and Sag, 1996b).

(48) The old men (_) would (_) have (_) liked (_) to (_) fly (*_). All may appear in any of the positions indicated except the last.
Second, the acquisition of the formulation of (46) is in accord with the thinking behind the Subset Principle (Berwick, 1985), which posits a preference for the most restrictive hypothesis as a factor in learning, thus requiring a grammar generating the smallest set of structures. This principle holds for very precisely defined cases. In the present instance we have two output languages, one of which is a subset of the other, since the formulations differ only in that one generates additional instances with an empty subject position in inverted structures. We might suppose that a learner is moving from an antecedent analysis in which inverted structures are fully specified. Of the two alternatives, the unmarked analysis (to be preferred in the absence of contrary evidence) is the conservative one, which retains the output of the antecedent analysis, whereas the marked analysis, requiring positive evidence for its adoption, goes beyond this in adding the possibility that the subject may be absent. This is exactly the kind of situation to which the Subset Principle is relevant. This is, moreover, surely an area with plentiful data: children will presumably be exposed to ample evidence of the subject's presence in tag questions and inverted elliptical structures, to set alongside the complete absence of data such as the Please can of (45). So this analysis is learnable in principle. On both of these fronts, then, it seems reasonable to adopt the formulation of (46).

On this account, finite auxiliaries must meet one of the two constraints inverted and not inverted. Inverted assigns [+INV], and it identifies the first item on the ARG-ST list with the first item on the COMPS list. Unification with Argument Realization will specify the value of SUBJ as elist. Not inverted assigns [-INV, SUBJ <[]>], and auxiliaries of this type will be SUBJ <[1]>, ARG-ST <[1], . . .> in accordance with the Argument Realization constraint. Both constraints will be subsorts of finite aux lex within the lexical hierarchy.

(49) a. finite aux lex: [HEAD [fin]]
b. inverted
c. not inverted
So we can set up the partition of (50):
Here the types within NEGATION are as in (35)-(37). Any member of the type finite aux lex must inherit from both of the partitions NEGATION and INVERSION, just as any member of negated must inherit from both of NEG FORM and NEG SCOPE. This results in the pair-wise unification of the subordinate types in each partition into more complex types as follows:
wide neg scope + inverted + not arg
wide neg scope + not inverted + not arg
wide neg scope + inverted + -n't form
wide neg scope + not inverted + -n't form
narrow neg scope + inverted + not arg
narrow neg scope + not inverted + not arg
narrow neg scope + inverted + -n't form
narrow neg scope + not inverted + -n't form
not negated + inverted
not negated + not inverted
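The pairwise unification of the two partitions is simply the cross product of their subordinate types. A hedged Python sketch (the type names follow the text; the string encoding and ordering of type names within each combination are hypothetical):

```python
from itertools import product

# Subordinate types of the NEGATION partition (the four negated subtypes
# from the NEG FORM x NEG SCOPE combinations, plus not negated).
NEGATION = [
    "wide neg scope + not arg", "wide neg scope + -n't form",
    "narrow neg scope + not arg", "narrow neg scope + -n't form",
    "not negated",
]
# Subordinate types of the INVERSION partition.
INVERSION = ["inverted", "not inverted"]

# Every finite aux lex inherits from both partitions, giving the
# pairwise unifications listed above.
combined = [f"{n} + {i}" for n, i in product(NEGATION, INVERSION)]
assert len(combined) == 10
```

The ten members of `combined` correspond one-to-one to the ten unified types listed in the text.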
These unifications contain the information which characterizes the associated constructions. Thus the information in the partition of (50) will be unified as follows in the case of an inverted structure containing not and showing wide scope negation. This unifies the information in inverted with that of (39a). I omit the further specification of ARG-ST and COMPS, which depends on the specific auxiliary selected, but unify in Argument Realization. (51)
Unification of the types raising to subject, aux lex, finite aux lex, negated, not arg, wide neg scope, inverted with Argument Realization.
and CONTENT | KEY [oblig_rel] or [may-epistemic_rel]
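The repeated appeal to unification can be made concrete with a toy model: feature structures as nested dictionaries, unification as a recursive merge that fails on conflicting atomic values. This is an illustrative stand-in, not the chapter's typed feature structure system; the attribute names and values are hypothetical.

```python
# Hedged sketch: unification as recursive merge of nested dicts,
# raising an error on conflicting atomic values.

def unify(a, b):
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for attr, val in b.items():
            out[attr] = unify(out[attr], val) if attr in out else val
        return out
    if a == b:
        return a
    raise ValueError(f"unification failure: {a!r} vs {b!r}")

# Two compatible constraint fragments (invented values for illustration).
inverted = {"HEAD": {"INV": "+", "VFORM": "fin"}, "AUX": "+"}
negated = {"HEAD": {"VFORM": "fin"}, "POL": "neg"}

merged = unify(inverted, negated)
assert merged["HEAD"] == {"INV": "+", "VFORM": "fin"}
assert merged["POL"] == "neg"
```

Unifying an incompatible pair (say, [INV +] with [INV -]) would raise an error, mirroring the fact that conflicting types cannot be combined in the hierarchy.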
In the case of can, this will result in (52), which has wide scope conditions as appropriate for a yes-no question. I also include the specification of LISZT, and the further specification of the complement as VP. (52)
Can in inverted sentences with not of sentential negation; as in Can you not go and help her?
Thus the various unifications supply the information required for all the finite nonelliptical auxiliary constructions listed in section 2, given the relevant structural schemata and constraints on linear precedence.
5. LINEAR PRECEDENCE

The proposal that inverted clauses with auxiliaries have a flat structure raises the question of how appropriate linear ordering is established. In HPSG generalizations about the left-right order of sisters are made by Linear Precedence Constraints. Linear Precedence Constraint 1 (LP1) says simply "lexical heads in English are phrase-initial" (Pollard and Sag, 1987:172).37

(53) LP1: HEAD [word] < [ ]
This ensures that the auxiliary is first in its VP, and that it is initial in "flat" inverted constructions. Linear Precedence Constraint 2 refers to the "obliqueness hierarchy," which for complements is established by the order of members within ARG-ST, from left to right, taking the subject as the least oblique member. But the Linear Precedence Constraint does not simply impose the ordering which corresponds to the obliqueness hierarchy, because particles show a limited departure from it. The correct generalization is that "complements must precede more oblique phrasal
complements" (Pollard and Sag, 1987:176). For inverted structures this establishes the ordering subject < other phrasal complement (where the "other phrasal complement" may be a VP, or XP[+PRD] or NP). Since not is subcategorized for, and occurs in second position in ARG-ST, it too is part of the obliqueness hierarchy, and it will precede those phrasal complements that the subject must also precede. But if it is of type word, the subject (although less oblique) need not precede it in inverted structures: the order subject + not and the order not + subject both comply with Linear Precedence Constraint 2. Hence the principles set up for English in Pollard and Sag (1987), together with the interpretation of not as a complement of type word in second position on the ARG-ST list, account immediately for the variation in position of not found with NP subjects in inverted structures, and provide some further evidence for its analysis as a complement.38

(54) a. Will not this hypothesis be upheld?
HEAD [word] - not - subject - VP
b. Will this hypothesis not be upheld?
HEAD [word] - subject - not - VP
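The interaction of the two constraints can be sketched as a pair of order checks over the flat list of sisters. This is a hypothetical Python encoding for illustration; the tuple representation of daughters and the numeric obliqueness values are invented.

```python
# Hedged sketch: LP1 and LP2 as checks over a flat list of sister
# daughters, each encoded as (label, is_word, obliqueness).

def lp_ok(daughters):
    # LP1: the lexical head is phrase-initial.
    if not daughters or daughters[0][0] != "head":
        return False
    comps = daughters[1:]
    # LP2: complements must precede more oblique *phrasal* complements,
    # i.e. a phrasal complement may not precede a less oblique one.
    for i in range(len(comps)):
        for j in range(i + 1, len(comps)):
            phrasal_i = not comps[i][1]
            if phrasal_i and comps[i][2] > comps[j][2]:
                return False
    return True

head = ("head", True, 0)
subj = ("subject", False, 1)  # phrasal, least oblique
not_ = ("not", True, 2)       # complement of type word
vp = ("VP", False, 3)         # phrasal complement

assert lp_ok([head, subj, not_, vp])      # Will this hypothesis not be upheld?
assert lp_ok([head, not_, subj, vp])      # Will not this hypothesis be upheld?
assert not lp_ok([head, vp, subj, not_])  # phrasal VP may not precede the subject
```

Because not is of type word, neither order of subject and not violates LP2, which is exactly the variation seen in (54).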
6. POSTAUXILIARY ELLIPSIS

6.1. A Feature Mismatch Account

Now that I have shown that it is possible to state the constructional differences of negated/not negated and inverted/not inverted auxiliary structures within a lexical inheritance hierarchy, I will turn to the final topic, ellipsis. Auxiliaries both finite and nonfinite may appear in elliptical constructions without their normal complement, where the sense of the complement is to be retrieved from the linguistic context. This context need not be within the same sentence or uttered by the same speaker. What is characteristic of this construction is the presence of an auxiliary before the gap, and the fact that a subcategorized phrase is missing.39

(55) a. John may come on Tuesday, but I don't think Paul will. [sc. come on Tuesday]
b. John may come on Tuesday. —Well, I don't think Paul will. [sc. come on Tuesday]
c. Mary is happy to eat meat or fish. —Is she? Well Paul never has been, and John certainly won't be. [sc. happy to eat meat or fish]
d. Go to bed! —I am! [sc. going to bed]
e. Are you going to bed? —Not yet, but I will [sc. go to bed]

It seems most appropriate to deal with Postauxiliary Ellipsis in terms of a mismatch between the COMPS list and the ARG-ST list: the subcategorized phrase
is absent from the COMPS list, but remains on the ARG-ST list. Several considerations are consistent with this approach. (i) What is retrieved is constrained by the context of ellipsis. In (55d), for example, the progressive going, not the imperative go, is what is needed; in (55e) we must retrieve the infinitive which is required after will, not the sense of the progressive antecedent. This syntactic information will be present in the ARG-ST list, where the complement of be is XP[+PRD], matching the [+PRD] of the progressive; the complement of will is VP[bse], matching the required infinitive. Thus the information in ARG-ST can straightforwardly supply the basis for an account of this aspect of the construction. (ii) Any account of Postauxiliary Ellipsis must allow for the fact that unbounded dependencies can enter ellipsis sites, as in (56), and that they are subject to syntactic constraints within the site, as in (57). This implies an account of retrieval which involves syntactic reconstruction in situ, like that of Lappin (1997), a position which is compatible with the presence of the relevant syntactic information on the ARG-ST list.

(56) a. Mystic Meg told me which dishes Harry had eaten, and which George would [sc. eat].
b. I wonder which of the twins George will propose to, and which he will not [sc. propose to].
c. How much will Harry earn next year I wonder? —More importantly: how much will George? [sc. earn next year]

(57) a. John read everything which Mary believes that he did.
b. *John read everything which Mary believes the claim that he did.
c. *John read everything which Mary wonders why he did. (examples from Lappin, 1997:112)

Since there is no reason why members of the ARG-ST list should not carry the SLASH values which encode information about the categories involved in unbounded dependencies, this falls out straightforwardly.
Notice, though, that this rules out an account of Postauxiliary Ellipsis as a dependency of this type which is locally bound by the auxiliary, so that will in (55a) would be ARG-ST , with the complement's LOCAL value token identical to the single member of its SLASH set, which is bound off by the auxiliary head (cf. the partial parallel with tough constructions as analyzed in Pollard and Sag, 1994:166ff.).40 Since the value of SLASH is of type local, it cannot be a category which is itself slashed; but this is what would be required in (56).41 (iii) The need to provide for a SLASH category in ellipsis also makes less plausible an account in terms of a separate preform type for categories on the ARG-ST list which are absent from the COMPS list (cf. the pro-ss suggested for Japanese "free pro-drop" in Manning and Sag, 1997:3). The null pronominal
would need to admit a gap, and this does not seem plausible. Nor is there any parallel in the use of English so, which would support such an account.42 It seems most satisfactory, then, to treat Postauxiliary Ellipsis as showing a mismatch of canonical-synsem between the COMPS list, from which the subcategorized phrase is absent, and the ARG-ST list, on which it remains (where canonical-synsem is the type to which overt categories belong; Sag, 1997:446). In the case of Paul will [sc. come on Tuesday] (55a) the VP is:
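The COMPS/ARG-ST mismatch just described can be sketched roughly as follows. This is a hypothetical Python encoding for illustration only; the attribute names mimic the text, and the category labels are informal.

```python
# Hedged sketch: elliptical will retains its VP complement on ARG-ST
# while that complement is absent from COMPS.
will_elliptical = {
    "PHON": "will",
    "SUBJ": ["NP (Paul)"],
    "ARG-ST": ["NP (Paul)", "VP[bse] (come on Tuesday)"],
    "COMPS": [],  # the subcategorized VP is missing: Postauxiliary Ellipsis
}

# The elided complement's syntactic category remains recoverable from ARG-ST.
realized = will_elliptical["SUBJ"] + will_elliptical["COMPS"]
missing = [a for a in will_elliptical["ARG-ST"] if a not in realized]
assert missing == ["VP[bse] (come on Tuesday)"]
```

The point is that the ARG-ST list still records the category of the elided phrase (here VP[bse]), which is what constrains retrieval, as in consideration (i) above.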
6.2. There Clauses

So far I have spoken as if Postauxiliary Ellipsis involved the absence of a single category. But there is the added complication of clauses with subject expletive there. In these, BE is not followed by a "small clause" but has two phrases in its COMPS list in its basic subcategorization, if we follow Pollard and Sag (1994:147ff.).43 May either or both of these be absent by virtue of ellipsis? I want briefly to consider this question, in the hope of achieving a more adequate view of Postauxiliary Ellipsis. First, then, consider sentences in which the second complement is locative. Here it seems that either of the complement phrases may be absent, as may both simultaneously; see (59) and (60). Ellipsis of only the first phrase can be difficult, though clearly possible, as in (59c) and (60c).44 One requirement is apparently that the second phrase not merely repeat old information; thus the first response to (59c) cannot naturally follow the question of (59a). A nonelliptical variant with some pronominal NP inserted is also often preferable, but a range of examples without it seem nonetheless to be grammatical, and I give some further examples of ellipsis of only the first phrase in (61). Ellipsis of the second phrase seems in contrast to be rather free, in that it typically depends on the prior establishment of a location which need not be overt linguistically; see (62). This is unlike most other cases of ellipsis after auxiliaries, which do typically require an overt, linguistic antecedent. This reflects the distinction between deep anaphora, which is under pragmatic control, and surface anaphora with its "requirement for antecedent-anaphor parallelism" drawn by Hankamer and Sag (1976; Sag, 1979:156). The deep anaphora of an absent locative complement might be dealt with by generating it optionally and permitting contextual retrieval of the relevant information. This might reflect an optionality of content, or a mismatch between CONTENT (with locative information) and ARG-ST (without a corresponding locative phrase), which is a property of the construction with there. Either way, Postauxiliary Ellipsis could be formulated to remove only a single complement of BE, the first; where the locative is absent but understood, this can be dealt with separately.

(59) a. Is there a seat in this row? —Yes, there is [sc. a seat in this row]. —No there isn't.
b. Is there a seat in this row? —Yes, there is a seat [sc. in this row]. —No there isn't a seat.
c. Is there a seat somewhere? —Yes, there is in this row [sc. a seat]. —No there isn't anywhere. —There's a seat in this row, and there is in the next row too.

(60) a. He says there is a decent restaurant in the marketplace, but there isn't [sc. a decent restaurant in the marketplace].
b. He says there is a decent restaurant in the marketplace, but he's wrong. There's a scruffy cafe, but no restaurant [sc. in the marketplace].
c. Is there a decent restaurant in the marketplace? —No, there isn't anywhere in the village.

(61) a. There's not much soup. —There is on my dress!
b. Was there much of a mess? —There was on the floor.

(62) a. We are going to the theatre in Scarborough. —Will there be a decent restaurant? [sc. in the theatre; in Scarborough; on the trip]
b. When it rains, the flat roof retains water. Sometimes there is quite a lake. [sc. on the flat roof]
c. I've used a feather duster but there are still lots of cobwebs. [sc. up by the cornice, etc.]

This does not, however, seem to be a sufficient account for instances in which an adjectival or participial phrase follows the NP. In these, it seems that either or both of the phrases may be absent, as with locatives [see (63), (64)], though examples typically need more careful contextualization than locatives if they are to seem natural.
But the process of retrieval is unlike that with locatives in that it is apparently tightly controlled by the existence of an appropriate linguistic antecedent in a parallel construction [see (65) and contrast (63)], so that we seem to have
instances of surface anaphora, not deep anaphora, in such cases. This implies that Postauxiliary Ellipsis must provide for the ellipsis of either or both of the complement phrases in this construction. I suggest then that:

1. Postauxiliary Ellipsis may affect any phrasal member belonging to the strict subcategorization of an auxiliary (including locatives).45
2. Locative complements are optional in there constructions.

This gives us a nicely general picture of Postauxiliary Ellipsis. This ellipsis is, however, subject to some quite sharp restrictions of naturalness in particular circumstances, which I do not understand but which I suppose are semantic/pragmatic in nature.46

(63) a. Are there any first-year students angry about their grades? —Yes, there are [sc. some first-year students angry about their grades]. —No there aren't.
b. Are there any first-year students angry about their grades? —Yes, there are some [sc. angry about their grades]. —No there aren't any.
c. Are there any first-year students angry about their grades? —No, but there are upset with their teachers.

(64) a. He didn't tell me there was any treasure buried in the garden, but I wonder if there is hidden in the orchard.
b. There is a complex variable hidden in the first formula, so I wonder if there is lurking in the second.

(65) a. *First-year students are writing a tutorial essay this week, but there aren't any second years [sc. writing a tutorial essay this week].
b. There are some first-year students writing a tutorial essay this week, but there aren't any second years [sc. writing a tutorial essay this week].
c. The postgraduates are upset about their grades. —*Are there any first-year students? [sc. upset about their grades]
d. Many of those attending said they were interested in Amnesty International. —*Were there any capable postgraduates? [sc. interested in Amnesty International]

6.3. Formulating an Ellipsis Constraint

So we need a statement of ellipsis which will allow for the absence not just of a single complement, but of either (or both) of two complements: of any phrasal member of the auxiliary's strict subcategorization. In formulating this it is simplest to start by considering a version of the Argument Realization constraint which deals with extraction. In (66) I give a statement which is straightforwardly
derived from that of Bouma, Malouf, and Sag (1997) by substituting ARG-ST for their list feature DEPS, and by using the operation of "list subtraction" in place of "shuffle" or "sequence union." SPR has been omitted for simplicity. Their constraint is intended to introduce a mismatch between the COMPS list and (the noninitial members of) the ARG-ST list, as part of their account of the structures introducing the gaps which terminate the percolation of SLASH categories through a tree. Such a gap is characterized in terms of a type gap-synsem, which has its LOCAL value token identical to the single member of its SLASH set, [LOCAL [1], SLASH {[1]}] (Sag, 1997:446). Then at the extraction site, the head's ARG-ST list contains a gap-ss which is not also in the COMPS list, although in the alternative construction without extraction the corresponding canon-ss would have been in the COMPS list. The grammar will include a constraint that members of COMPS are all canon-ss. Then (66) requires the ARG-ST list to be the adjunction of the SUBJ list and the COMPS list, omitting any gap-ss which would have appeared within the COMPS list. Remember that there will be further constraints on the value of SUBJ, such as those in INVERSION above, or Bouma, Malouf, and Sag's (1997) constraint that verb lexemes all have a singleton list as the value of SUBJ. (66)
Argument Realization allowing gaps
⊖ is "list subtraction." It is defined to remove the members of the second list from among the members of the first, while preserving their order with respect to each other. So (66) requires that any canon-ss members of ARG-ST not in the SUBJ list are present on the COMPS list, and that any gap-ss members of ARG-ST not in the SUBJ list are absent from the COMPS list. Now Postauxiliary Ellipsis can be stated as a generalization of this constraint which lacks the requirement that the list El should consist of gap-ss. The relevant information will be as in (67).

(67) Argument Realization for auxiliaries including Postauxiliary Ellipsis
Here El may be null, in which case we get the full construction; it may contain one or more canon-ss, in which case we get an elliptical construction; and it may contain a gap-ss, in which case we get a construction with a gap (e.g., an
unbounded leftward dependency).47 The value of El may never include the subject, since INVERSION places this synsem either within a singleton SUBJ list, or (in an inverted structure) within COMPS. As a member of the COMPS list it cannot, of course, be missing from the COMPS list, so the subject of an inverted structure can neither be a gap-ss nor removed in ellipsis. I shall also suppose that not will be placed on COMPS in not arg in order to prevent its being interpreted as a member of the SPR list by the full version of Argument Realization. Then, and for the same reason as in the case of subjects, not also cannot be removed in ellipsis, as is correct: see the discussion of section 3.1 and examples (19a, d), which I repeat here.48 As before, this assumes a constraint that all members of the COMPS list are required to be canon-ss.

(19) a. They [must] [not] [arrive late]! —Neutral intonation, narrow scope of negation
—Indeed they must not. —*Indeed they must.
d. You [may] [not] [join us for lunch]. —Neutral intonation, wide scope of negation: 'you are not permitted to join us for lunch'
—I repeat: you may not. —*I repeat: you may.

The relevant constraints can be stated in the hierarchy given in (68), which revises (66). Here the partitions ELLIPSIS, LEX require any member of word to be of a type or types found within LEX, the lexical hierarchy, and to be either of type nonelliptical or of type potentially elliptical. The type nonelliptical defines the additional constraint that the difference between the relevant sublist of ARG-ST and the value of COMPS is a list whose members are gap-ss, adding that the overall synsem may not be [+AUX]. The only information that potentially elliptical need supply is that the synsem is [+AUX].49
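The operation of list subtraction and the ellipsis-permitting Argument Realization of (67) can be sketched together as follows. This is a hypothetical Python encoding for illustration; the type names (gap-ss, canon-ss, El) follow the text, but the string representation and the helper functions are invented.

```python
# Hedged sketch of "list subtraction" and of Argument Realization
# allowing gaps and (for auxiliaries) ellipsis.

def subtract(first, second):
    """Remove the members of `second` from `first`, preserving order."""
    rest = list(second)
    out = []
    for x in first:
        if x in rest:
            rest.remove(x)
        else:
            out.append(x)
    return out

def realization_ok(arg_st, subj, comps, aux):
    """ARG-ST minus SUBJ minus COMPS leaves the list El.

    For potentially elliptical items ([+AUX]), El is unconstrained;
    for nonelliptical items, El may contain only gap-ss members
    (here marked with a "gap:" prefix).
    """
    el = subtract(subtract(arg_st, subj), comps)
    if aux:
        return True
    return all(x.startswith("gap:") for x in el)

# "Can you not?": the VP complement is absent from COMPS but kept on ARG-ST.
assert realization_ok(["NP", "not", "VP"], [], ["NP", "not"], aux=True)
# Extraction: a gap-ss may be absent from COMPS for any verb.
assert realization_ok(["NP", "gap:VP"], ["NP"], [], aux=False)
# But a non-auxiliary cannot simply omit an overt (canon-ss) complement.
assert not realization_ok(["NP", "VP"], ["NP"], [], aux=False)
```

The sketch mirrors the generalization in the text: ellipsis and extraction are the same COMPS/ARG-ST mismatch, and only the [+AUX] case licenses a nonempty El containing canon-ss members.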
This is a simple and general statement, which maximizes the parallel between ellipsis and extraction by treating the list-mismatch involved as the same fact in each case, and which incorporates ellipsis into Argument Realization. It is difficult to imagine a more economical statement. The constraint of (68) [or (67)] has a straightforward unification with the constraints proposed earlier for negation and inversion to yield the full list of the basic distinctive constructions of auxiliaries. In the case of negated inverted can, see (52), we previously assumed unification with the subcase of Argument Realization which involved neither gapped nor elliptical complements. But unification with potentially elliptical will permit a mismatch in the final member of the ARG-ST list (and in only that member). Thus the unification of the feature structure for can in negative inversions given in (52) with potentially elliptical will define a feature structure for the can of (say) Can you not?, which differs from (52) only in that the final member of the COMPS list is absent. So this analysis can cope simply with the interaction of the negated, inverted, and elliptical constructions distinctive of the English auxiliary within an analysis which has two distinguishing characteristics.50 First, it is fundamentally lexical. The discussion has centrally involved the properties of lexical items. Second, it involves the mere addition of information, its unification without alteration, mapping, or manipulation, except for that involved in a small number of defaults. Moreover, none of the types posited is recursive; what is involved is the simple, unifold addition of information. Thus even the relatively minor freedom conferred by lexical rules within HPSG is not needed to deal with English auxiliaries. This is quite an impressive result. When I started to investigate this possibility there seemed to be many difficulties, especially in dealing with negation. 
But they have given way, and the resulting account is simple and elegant. If such an overtly unpromising area can be described without lexical rules, then it must seem likely that lexical rules can be avoided more generally.

6.4. Extraction and Ellipsis

A final point: the constraints just formulated predict that both extraction and ellipsis may simultaneously apply after an auxiliary. Is this correct? The relevant data are essentially restricted to clauses introduced by there, and they are far from transparent, perhaps because of the interaction of the discourse and pragmatically based requirements of this construction: so there are difficulties with many instances, and some seem impossible. As with the data cited above, there can be a contrast with a preferable version which lacks ellipsis of the first complement. But examples like the following imply that ellipsis and extraction may apply simultaneously to the complements of existential BE; certainly there seems to be no reason to deny this possibility. In (69) I cite instances of ellipsis of the first complement with movement of the second complement.
Anthony Warner
(69) a. The author says there is a variable in each of these expressions. In the first I think we can agree that there is [sc. a variable]. But in which of the others do you think that there actually is? [sc. a variable]
b. He told me there might be buried treasure under some of these tumuli, but he didn't say under which of them he thought there really would be [sc. buried treasure].
c. There isn't any dust under the carpet now, but I'll show you where there was. [sc. some dust]
d. There isn't anyone eager to move office, but willing to be persuaded there just might be [sc. someone].
e. Is there a problem over finances? Not exactly, but over the timing of payments, there certainly is [sc. a problem].

In seeking instances of the ellipsis of the second complement with movement of the first complement, we need to take note of the distinction drawn in section 6.2 between the absence of a locative complement and the Postauxiliary Ellipsis of the adjectival or participial complement. The second required an appropriately parallel linguistic antecedent and was taken to show Postauxiliary Ellipsis; the first did not, and was dealt with as a possible mismatch between ARG-ST and CONTENT. In (70a, b) I give instances of the second type.

(70) a. There were several of our vehicles involved in accidents last year. —How many cars were there? [sc. involved in accidents last year]
b. You say there are several postgraduates upset at his cavalier attitude to teaching. We must also ask how many undergraduates there are [sc. upset at his cavalier attitude to teaching].

There are also restricted instances that may show the Postauxiliary Ellipsis of a locative, as in (71a, b) (contrast (72a, b), which imply that the locative is indeed subject to ellipsis rather than being optional).

(71) a. There is a good deal on my conscience. I hate to remember how many lies and deceptions there are [sc. on my conscience].
b. There are some funny anecdotes in his repertoire.
I was amazed at how many spicy tales there are. [sc. in his repertoire]

(72) a. I have a good deal on my conscience. ?*I hate to think how many lies and deceptions there are [sc. on my conscience].
b. Are you familiar with his repertoire? ?You'll be amazed at how many spicy tales there are. [sc. in his repertoire]

So I shall conclude that it is reasonable to accept that ellipsis and extraction may indeed apply simultaneously in principle, and that the difficulties which infect many instances are to be separately accounted for. Within this account the lack of a restriction on the length of the list which represents the mismatch between
ARG-ST and COMPS seems natural: it is a less informative, less constrained statement than one which prescribes a singleton list. The naturalness of this generalization deals neatly both with the fact that there sentences may show ellipsis of both complements, and that they also allow both ellipsis and a gap. If we suppose that such structures as Who did you show? Who did you tell? Who do you teach? What did you ask? How much do you owe? involve both ellipsis and a gap, then children will presumably have ample evidence for the acquisition of this form of the condition, even if such evidence is uncommon in sentences with there.
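The ARG-ST/COMPS relation just described can be pictured with a deliberately simplified sketch. The following Python fragment is purely illustrative: the list encoding, the function name, and the labels are invented for exposition and are not part of the HPSG formalism. It renders the core idea that COMPS is the ARG-ST list minus the subject and minus any members realized as gaps (extraction) or simply absent under Postauxiliary Ellipsis, with no restriction to a singleton mismatch.

```python
# Toy illustration only (not an implementation of HPSG): COMPS as the
# ARG-ST list minus the subject and minus gapped or elided members.

def realize_comps(arg_st, gapped=(), elided=()):
    """Map an ARG-ST list to a COMPS list.

    arg_st : list of argument labels; arg_st[0] is the subject.
    gapped : labels realized as gaps (bound by SLASH, absent from COMPS).
    elided : labels omitted under Postauxiliary Ellipsis.
    """
    complements = arg_st[1:]  # the subject is never on COMPS
    return [a for a in complements if a not in gapped and a not in elided]

# Existential BE with two complements, as in the 'there' examples:
arg_st = ["there", "NP", "PP[loc]"]

# No gap, no ellipsis: both complements realized.
print(realize_comps(arg_st))  # -> ['NP', 'PP[loc]']

# 'There isn't any dust under the carpet now, but I'll show you where
#  there was.' -- locative extracted, NP elided: an empty COMPS list.
print(realize_comps(arg_st, gapped={"PP[loc]"}, elided={"NP"}))  # -> []
```

Since the mismatch is stated over a list rather than a single member, ellipsis of both complements, and ellipsis alongside a gap, fall out without further stipulation, which is the point made in the text.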
7. CONCLUSION

In this chapter I hope to have accomplished two things. The first is to provide a rather simple and convincing formal analysis of auxiliaries, which appropriately generates their characteristic structures, and integrates the account of negation with an account of scope. The account of ellipsis is itself integrated into the account of Argument Realization which incorporates the distribution of gaps. The second is to establish the theoretical point that it is straightforwardly possible to deal with the distribution of auxiliaries across different constructions correctly and insightfully simply by making use of the framework for the integration of information provided by the lexical hierarchy of HPSG. It seems safe to conclude that the constructions which are distinctive for English auxiliaries can be interrelated by unification within a lexical inheritance hierarchy, without recourse to lexical rules, despite the complexity of this area of grammar. Kim and Sag (1996a, 1996b) have argued forcefully against analyses which characterize negation within the auxiliary system in terms of the movement of categories to functional heads. This chapter supports this line of argumentation, but goes further. It forms part of a case against the use of lexical rules for valence alternations which lack a morphological reflex in syntax. Other work that also points in this direction is the grammar of extraction and adjunction in Bouma, Malouf, and Sag (1997), which eschews lexical rules in favor of a constraint-based account, and the related discussion of Bouma (1997), which suggests the elimination of lexical rules for valence alternations. There are clear advantages to eliminating lexical rules: we gain what is in practice a more constrained descriptive apparatus, and one which is more conceptually unified; we also avoid the problem of interpreting or constraining the order of rule application.
This suggestion should also be seen in the context of other proposals which dispense with lexical rules in favor of inheritance hierarchies in interpreting the internal structuring of morphologically complex words, like Riehemann (1994) or Manning, Sag, and Iida's (1997) account of Japanese causatives, which imply that a more general avoidance of lexical rules may be possible.
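The "mere addition of information" that this conclusion appeals to can be given a minimal sketch. The Python fragment below is illustrative only (attribute names are invented, and defaults are ignored): constraints contributed by different types in a hierarchy combine by unification, which adds information monotonically, and conflicting specifications simply fail to unify rather than being rewritten or overridden, in contrast with a lexical rule.

```python
# Toy sketch of monotonic feature-structure unification (invented
# attribute names; defaults and structure sharing omitted).

def unify(fs1, fs2):
    """Unify two feature structures (nested dicts); raise on conflict."""
    result = dict(fs1)  # non-destructive: inputs are never altered
    for attr, val in fs2.items():
        if attr not in result:
            result[attr] = val  # new information is simply added
        elif isinstance(result[attr], dict) and isinstance(val, dict):
            result[attr] = unify(result[attr], val)  # unify recursively
        elif result[attr] != val:
            raise ValueError(f"unification failure at {attr}")
    return result

# Constraints from two partitions of a hierarchy combine by addition:
aux_lexeme = {"HEAD": {"AUX": "+"}}
inverted = {"HEAD": {"INV": "+"}, "SUBJ_IN_COMPS": True}
print(unify(aux_lexeme, inverted))
# -> {'HEAD': {'AUX': '+', 'INV': '+'}, 'SUBJ_IN_COMPS': True}

# A conflicting value cannot be overridden (monotonicity):
# unify({"HEAD": {"INV": "+"}}, {"HEAD": {"INV": "-"}})  raises ValueError
```

The contrast with lexical rules lies in the last line: unification can only add compatible information, never map one specification into a different one.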
The simple, clear conclusion, however, is that HPSG should not use lexical rules for valence alternations in syntax.
NOTES

1 I am particularly grateful to Ivan Sag for detailed comments and for constructive and valuable suggestions. I am also grateful to Bob Borsley, Ann Copestake, Steve Harlow, Dick Hudson, and Robert Levine for their helpful comments, and to David Adger for some discussion. But they are not to be blamed for the mistakes, which are mine. I completed this paper during my tenure of a British Academy Readership, and I am grateful to the British Academy for their generous support.

2 For this theory see principally Pollard and Sag (1994). For more recent developments see particularly Sag (1997), Miller and Sag (1997); for an introduction to the theory Borsley (1996).

3 Here and in what follows I normally omit SPR, whose value is briefly discussed below in section 4.

4 Here I assume that within the type verb, +PRD is restricted to -ing forms and "passive" participle forms, as argued in Warner (1993a).

5 I suppose (following Pollard and Sag, 1994:337-8) that semantic relations are to be defined directly as types, dispensing with the attribute RELATION of Pollard and Sag (1994), and that available types include psbl-rel, which is the general type underlying the semantics of modals of possibility and permission; oblig-rel, the general type underlying the semantics of modals of necessity and obligation; and not-rel, which defines the semantics of negation by not. I omit discussion of quantification and the richer feature structure required to distinguish it.

6 The hierarchy as presented is in effect a lexeme hierarchy. I suppose that syntactic information about inflected categories appears in separate partitions at appropriate points within the hierarchy, and is unified in (in a version of the account in Miller and Sag, 1997), rather than being specified by lexical rules which map underspecified lexemes into fully specified words (as in Sag and Wasow, 1997).
7 Here for the sake of illustration I assume a contrast between the types elliptical and not-elliptical which is not explicitly defined below.

8 As proposed in Warner (1993a). See Kim (1995) and Kim and Sag (1996a) for a similar account of the structures of English with a more detailed presentation of particular arguments.

9 There are difficulties with may in tags, but note that the naturalness of deontic may in tags is restricted by the fact that it is subjective; and that epistemic may is "not often used in questions" (Quirk et al., 1985: §11.13).

10 For a short discussion of the occurrence of not "within" ellipsis after a nonfinite in pseudogapping see Warner (1993a: 249 note 18, to p. 83).

11 The kind with special intonation is what Quirk et al. call "predication negation" (1985: §10.69). They note the possibility of a response lacking not as in (19b, c).

12 Stranding of not is sometimes possible (e.g., we have to go—we can't not); compare the limited possibility of elliptical not after nonfinites (cf. note 10).
13 I argue that this is the correct analysis of deontics as well as epistemics in Warner (1993a: §1.4), but it is not important here: if you think this use of should may have two arguments, then the oblig-rel may be equipped with the relevant additional attribute.

14 Its position after the auxiliary and the fact that it is not a constituent with it imply that it is not a specifier of the auxiliary; its survival in ellipsis and the fact that it may scope over the auxiliary imply that it is not a specifier of the complement.

15 An MRS representation is subject to the general constraint that it must be possible to map it into a connected and rooted tree structure which is scope resolved; see Copestake et al. (1997).

16 There will also be conditions constraining the upward availability of relations in terms of the outermost handle of CONTENT, but I will omit both the relevant attribute and the statement of these further conditions.

17 See section 5 for an argument that not is specified as [word] in the ARG-ST list. It may ultimately be possible to replace '([word])' by '. . .' in (28), provided the rest of the hierarchy can be constrained to prevent generation of inappropriate results. But for the moment it seems best to retain the more explicit formulation. A more detailed presentation of this formulation will involve a partition between [+AUX, ARG-ST<[1], [word], XP[SUBJ<[1]>]>] and [ARG-ST<[1], XP[SUBJ<[1]>],...>]. The nonsaturated complement is not final in the ARG-ST list in the second of these because linear precedence in English is determined in part by the "obliqueness" ordering of the ARG-ST list, and the PP complement of seem is more oblique than its nonsaturated complement, as is attested by the permitted ordering: She seems happy to me, *She seems to me happy.

18 See Warner (1993a: 14ff.) for a summary account of these different types of modality.
Epistemic modality typically involves a statement of the speaker's attitude towards the status of the truth of a proposition: that the proposition is necessarily true, probably true, predicted to be true, etc. Deontic modality involves permission and obligation, or what is possible and what is necessary with respect to some authority, or to a set of moral values. Dynamic modality evaluates the occurrence of events or the existence of states of affairs as necessary, important, advisable, possible, desirable, and so on within a circumstantial frame of reference (commonly not stated). It may (but need not) include reference to the abilities or volition of the subject. Examples (i) and (ii) illustrate dynamic modality. Note the contrast between (ii) and (iii); (iii) has epistemic modality, and involves a contradiction, as (ii) does not (cf. Palmer, 1979:156).
(i) I must have an immigrant's visa; otherwise they're likely to kick me out, you see. 'It is necessary for me to have,...'
(ii) This picture could be a Chagall, but is in fact a Braque. 'It is possible for this picture to be a Chagall,...'
(iii) This picture might be a Chagall, but is in fact a Braque. 'It is possible that this picture is a Chagall,...'
19 Note the conclusion of Huddleston (1969:173-4): "If we leave aside contrastive intonation, the clear cases where the negation applies to the modality are can (all senses), permissive may and need. The clear cases where the modal is outside the domain of the negative are possibility may and must,..." This assigns indeterminate or ambiguous scope to will/would, shall/should and ought.
20 This distinction has the advantage of predicting the apparent contrast of scope between obligation and other uses of be + to:
(i) I told Paul he was not to be late. Narrow scope.
(ii) The Carthaginians were not to prevail. Wide scope.
Have got to looks like an exception, since negation has wide scope over obligation in You haven't got to go. But if the semantics of obligation is associated with got to, this is irrelevant to the distinction drawn here.

21 The default will be

22 See section 6.3 for the suggestion that not will also need to be placed in COMPS in not arg to avoid being interpreted as a member of the SPR list.

23 If the scope of negation with epistemic will and would is taken to be uncertain, so that it may be either wide or narrow, this could be accommodated straightforwardly by allowing epistemics in general to occur with narrow scope negation. The final statement of CONTENT in (37b) would simply specify the more general [epistemic-rel] replacing [may-epistemic-rel], to become: |CONTENT|KEY [oblig-rel] or [epistemic-rel]. Then epistemic will and would could occur with either scope. If "future" will is to be distinguished, this must be nonepistemic to occur only with wide scope of negation. Claimed instances of epistemic could will be interpreted as dynamic, as suggested in the next but one footnote.

24 Includes might when it is the preterite of may.

25 I would prefer to interpret apparent epistemic uses of can/cannot/can't as dynamic. The case of epistemic could is difficult. Palmer (1979, 1988) claims that instances occur, but only in nonassertive contexts. In such contexts it seems to me that they are neutralized with the dynamic interpretation, so that systematically these instances of could not/couldn't should perhaps be accounted for as instances of dynamic modality, in an extension of the type of argumentation given in Palmer (1979:155-7).

26 The equivalent monotonic formulation will have to add need to the wide scope type, and remove it from the narrow scope type. This results in a relatively complex statement: but that is to be expected when we are so close to the properties of the individual item. A monotonic statement of conditions for wide scope would be 'where |CONTENT|KEY is need-rel or (oblig-rel or may-epistemic-rel)'; for narrow scope: 'where |CONTENT|KEY is ((oblig-rel and ¬need-rel) or may-epistemic-rel)'. This leaves the scope of (38) as a problem.

27 Narrow neg scope is a bit of a misnomer then.
The notion of negating the narrow scope condition here is a shorthand for a more complex formulation, since the logic for defaults in Lascarides et al. (1996) is not set up to allow negation. Handle relationships might be represented in a type structure, with CONDS{h3>h5} given default encoding,
alongside another (overriding) type encoding the absence of any relationship between these handles. I am grateful to Ann Copestake for pointing out both problem and solution.

28 Distinguish constituent negation with not, which of course has narrow scope. Palmer (1979:96; 1988:127) suggests that sentential negation may remain narrow in questions with mustn't, but qualifies the observation: "Although native speaker intuition is uncertain here." I have not included this possibility in my analysis.

29 This holds good whether would is the relevant auxiliary and rather is phrasal complement, or rather is itself an infinitival auxiliary; see note 46.

30 [+INV] must occur only in main clauses. This restriction can be dealt with by a constraint (which has some lexically controlled exceptions) that no member of an ARG-ST list may be [+INV]. There is no need to extend INV to all verbs.

31 The existence of twin possibilities within a framework containing the attribute SUBJ seems to have been first noted by Borsley (1986:83).

32 Pollard and Sag (1994: §9.6) pointed to two particular problems with this approach. One concerned difficulties avoiding unwanted extractions of the subject from the COMPS list. This will be discussed below. The second problem is the suggestion that if a subject is present in COMPS, SLASH categories will percolate into it without also percolating into the complement. This would falsely predict the grammaticality of (i) beside (ii).
(i) *Which rebel leader did rivals of e assassinate the British Consul?
(ii) Which rebel leader did rivals of e assassinate e?
But in such instances this will be ruled out by the "Subject Condition," as noted in Warner (1993a:249, note 20 to p. 84). Since in subject-raising structures information about the subject is shared by the valence statements for the lower and higher predicates, the lower verb (here assassinate) will have a value for SUBJ which contains SLASH. So the Subject Condition (which prevents subjects from having nonparasitic gaps) will hold for it. Hence assassinate must also have a SLASH within its COMPS list (cf. Pollard and Sag, 1994: §4.5, §9.2). There remains however the problem of identificational BE, as in (iii), which is surely not a raising predicate, but which apparently shows a (weak) restriction on extraction from its inverted subject; contrast (iv) and (v).
(iii) Is the first man the thief?
(iv) Of which book was the reviewer also the author?
(v) ?Of which book was the reviewer also the author of Gamelan Studies?
It is harder to tell whether there is a similar set of facts for the inverted "possessive" HAVE of British English, since this is formal and increasingly restricted in usage. There may be a difficulty here for the use of Schema 2 in English inversions, unless the "Subject Condition" facts can be made to fall out in a way which covers the data given above.

33 The type hierarchy under head will state that the type finite auxiliary has the Boolean attribute INV, so finite aux lex will automatically be marked for INV. There will, however, be the unwelcome complexity that Schema 3 will need to be specified +INV for English, while Schemas 1 and 2 will need to be specified ¬[+INV].

34 There is, however, some further possible evidence in favor of the use of Schema 3 in the operation of Linear Precedence statements, since subjects which are Ss or VPs are
not distributed just like complements. Pollard and Sag (1987:181) formulate their Linear Precedence statement 2 so that the ordering of phrasal complements in English reflects the ordering of the ARG-ST list, except for synsem of type verb. This allows the freedom of order found in (i) beside (ii), but bars (iv) since AP must precede PP as in (iii).
(i) Kim appeared to Sandy to be unhappy
(ii) Kim appeared to be unhappy to Sandy
(iii) Kim appeared unhappy to Sandy
(iv) *Kim appeared to Sandy unhappy
We might expect to find this freedom in the case of subjects if they are indeed on the COMPS list, so that (vi) would be available beside (v). But this is not so.
(v) Is (for a journalist) to reveal sources legitimate? ARG-ST<S, AP>
(vi) *Is legitimate (for a journalist) to reveal sources? ARG-ST<S, AP>
It is, however, not clear to me what the most satisfactory account of this restriction is, nor whether it will indeed constitute an argument for the use of Schema 3 within a more detailed account of Linear Precedence principles of English grammar.

35 Pollard and Sag (1994: §9.6) discuss the problem of assigning a double structure to John must leave. More recently, Bouma, Malouf, and Sag (1997) have proposed a filler-gap account of Who visits Alcatraz? But this does not imply that a subject may be extracted from [+INV] structures since the phrase visits Alcatraz is not [+INV], and Who will visit Alcatraz? can be dealt with as the extraction of the subject from [-INV]. Indeed, if subject extraction from [+INV] is not permitted, this will avoid the assignment of a further [+INV] structure to Who will visit Alcatraz?

36 Then the interrogative Would the old men all have liked to fly? corresponds most directly to the declarative The old men would all have liked to fly in that in both all is the specifier of have liked to fly, and there is no interrogative corresponding directly to The old men all would have liked to fly. Note that adopting (46) has the same result, since the unification of (46) and (47) has SPR elist.

37 Here and in (54) below, HEAD designates the head daughter of an ID schema; it is not the attribute HEAD.

38 Quirk et al. (1985:809) observe that "some speakers accept" the "rather formal" construction of Is not history a social science? and that this order is especially likely in formal contexts where the subject is lengthy. Some further principle (perhaps weight ordering) will be needed to account for the greater difficulty of the order with pronoun subjects, which are more marginal (e.g., Should not you talk to him about it?) and for the absolute impossibility of this order in tags: aren't they?, are they not?, *are not they?
39 This covers the central area of what has been called VP Deletion, but both the term and the implied analysis are inappropriate (Warner, 1993a: 5f.).

40 If I follow the system of Bouma, Malouf, and Sag (1997), but substitute the ARG-ST list for their DEPS list, the complement would appear on the ARG-ST list as gap-ss, and its SLASH value would be re-entrant with that of the auxiliary's (TO-)BIND feature: will in (55b) would be [ARG-ST<. . . , [gap-ss LOCAL VP, SLASH {VP}]>,
SLASH{VP}, BIND{VP}], where all VP are token identical feature structures of type local.

41 Note that Pseudogapping, illustrated in (i), should be generalized with Postauxiliary Ellipsis, as argued (among others) by Miller (1990) and Warner (1993a). A syntactic account of this might treat it as the ellipsis of a synsem with non-null SLASH within a gap-filler structure [i.e., as parallel in major respects to the examples of (56)]. The missing complement of will in (i) would be VP/NP.
(i) We're agreed then: You will try to persuade your father this weekend and I will _ your mother.

42 The proform so with auxiliaries is regrettably narrowly distributed, and it is necessary carefully to distinguish the connective adverb. But note that the construction of (i), in which so is a proform for AP/PP, is ungrammatical, unlike (ii) which has both ellipsis and extraction and is grammatical. The contrast between (iii) and (iv) also implies that so is not a proform which may contain a non-null SLASH, if we suppose that (iv) is a gap-filler structure: so [VP/NP] ...the ham[NP].
(i) *She was fond of Harry, and of George she will be so too.
(ii) She was fond of Harry, and of George she will be too.
(iii) If you order Harry to eat up all his food, then so he will!
(iv) *If you order Harry to eat up all his food, then so he will the ham!
43 The arguments in Milsark (1976) and those noted in Lumsden (1988:51) seem to me to show that we should reject the analysis of there sentences as containing a single NP complement with an internal adjunct, as proposed by Williams (1984), even for instances with AP or participial phrase, in favour of that adopted by Pollard and Sag (1994).

44 One problem of analysis (which I shall not pursue) is that of establishing the relationship of instances of ellipsis of the first complement with retention of the second to Pseudogapping. But see note 47.

45 There may be some evidence for the ellipsis of locatives as distinct from their optional status in restricted constructions such as:
(i) I have some minor matters on my conscience. *?But there are no real misdeeds [sc. on my conscience].

46 Another construction which might show two complements with an auxiliary is found with would rather, had better. The restricted optionality of rather and better shown in (i) and (ii) is interesting.
(i) I don't know whether he would rather leave early, or whether he would *(rather) leave late.
(ii) You would rather leave early? —Yes, I would (rather). —No, I would *(rather) leave late.
The simplest solution is that rather and better are themselves [+AUX, bse], that they take a plain infinitive complement (or a clause in the case of rather), and that they provide a context for deletion, but can only themselves be deleted along with their complements.
47 I have not here discussed the relationship between Postauxiliary Ellipsis and Pseudogapping, although examples like (61) might readily be referred to Pseudogapping. Instances like (63b), (65b), however, seem most naturally interpreted as the Postauxiliary Ellipsis of only one complement phrase (given that there constructions have two phrases within their complement, see note 43). This may open an alternative analysis in which Postauxiliary Ellipsis affects only the final element (or elements) of the complement, and clauses with ellipsis of nonfinal phrases are analyzed as showing Pseudogapping. But if John reads to his children, Mary cooks for the family show ellipsis of a nonfinal complement, there is no simple generalization with such ellipsis in transitives.

48 Alternatively, if lexical items may not be extracted, then it may be appropriate to restrict the value of [3] on word so that it cannot include synsem of type word. Then not could not be extracted or removed in ellipsis.

49 A generalization of this will be required if a similar statement (with a mismatch between ARG-ST and COMPS lists) is to be made for transitive verbs with a "deleted" object (eat, read, etc.), as proposed in Davis (1996).

50 I have not here discussed the peculiarity of the distribution of do, that unstressed affirmative do is absent, or the facts of imperative do, which has a distinct distribution. For an account of these see Warner (1993a:86ff.), where I suggest an informal "blocking" (or default) account of the relationship between the realization of tense as the word do and as verbal affix, and analyze imperative do, don't as unique finites.
REFERENCES

Abbott, B. (1976). Right node raising as a test for constituenthood. Linguistic Inquiry, 7, 639-642.
Berwick, R. (1985). The acquisition of syntactic knowledge. Cambridge, MA: MIT Press.
Borsley, R. D. (1986). A note on HPSG. Bangor Research Papers in Linguistics, 1, 77-85.
Borsley, R. D. (1987). Subjects and complements in HPSG. Report No. CSLI-87-107. Stanford: Center for the Study of Language and Information.
Borsley, R. D. (1989). Phrase-structure grammar and the Barriers conception of clause structure. Linguistics, 27, 843-863.
Borsley, R. D. (1995). On some similarities and differences between Welsh and Syrian Arabic. Linguistics, 33, 99-122.
Borsley, R. D. (1996). Modern phrase structure grammar. Oxford: Blackwell.
Bouma, G. (1997). Valence alternation without lexical rules. Unpublished manuscript, Rijksuniversiteit.
Bouma, G., Malouf, R., and Sag, I. A. (1997). Satisfying constraints on extraction and adjunction. Unpublished manuscript, Stanford University.
Carpenter, B. (1992). The logic of typed feature structures with applications to unification grammars, logic programs and constraint resolution. Cambridge: Cambridge University Press.
Carpenter, B. (1993). Skeptical and credulous default unification with applications to templates and inheritance. In T. Briscoe, A. Copestake, and V. de Paiva (Eds.), Inheritance, defaults, and the lexicon (13-37). Cambridge: Cambridge University Press.
Chomsky, N. (1957). Syntactic structures. The Hague: Mouton.
Copestake, A. (1993). Defaults in lexical representation. In T. Briscoe, A. Copestake, and V. de Paiva (Eds.), Inheritance, defaults, and the lexicon (223-245). Cambridge: Cambridge University Press.
Copestake, A., Flickinger, D., and Sag, I. A. (1997). Minimal Recursion Semantics: an introduction. Unpublished manuscript, Stanford University.
Davis, T. (1996). Lexical semantics and linking in the hierarchical lexicon. Ph.D. dissertation, Stanford University.
Flickinger, D. (1987). Lexical rules in the hierarchical lexicon. Ph.D. dissertation, Stanford University.
Gazdar, G., Pullum, G. K., and Sag, I. A. (1982). Auxiliaries and related phenomena in a restrictive theory of grammar. Language, 58, 591-638.
Hankamer, J., and Sag, I. A. (1976). Deep and surface anaphora. Linguistic Inquiry, 7, 391-428.
Huddleston, R. (1969). Review of Madeline Ehrman (1966), The meanings of the modals in present-day American English. Lingua, 23, 165-176.
Kathol, A. (1994). Passives without lexical rules. In J. Nerbonne, K. Netter, and C. Pollard (Eds.), German in Head-driven Phrase Structure Grammar (237-272). Stanford: CSLI.
Kim, J.-B. (1995). English negation from a non-derivational perspective. Proceedings of the 21st Annual Meeting, Berkeley Linguistics Society, 186-197.
Kim, J.-B., and Sag, I. A. (1996a). The parametric variation of French and English negation. Proceedings of the 14th Annual Meeting of the West Coast Conference on Formal Linguistics, 303-317.
Kim, J.-B., and Sag, I. A. (1996b). French and English negation: a lexicalist alternative to head movement. Unpublished manuscript, Stanford University.
Klima, E. (1964). Negation in English. In J. A. Fodor and J. J. Katz (Eds.), The structure of language (246-323). Englewood Cliffs, NJ: Prentice-Hall.
Krieger, H.-U., and Nerbonne, J. (1993). Feature-based inheritance networks for computational lexicons. In T. Briscoe, A. Copestake, and V. de Paiva (Eds.), Inheritance, defaults, and the lexicon (90-136). Cambridge: Cambridge University Press.
Lappin, S. (1997). An HPSG account of antecedent contained ellipsis. SOAS Working Papers in Linguistics and Phonetics, 7, 103-122.
Lascarides, A., Briscoe, T., Asher, N., and Copestake, A. (1996). Order independent and persistent typed default unification. Linguistics and Philosophy, 19, 1-90.
Lumsden, M. (1988). Existential sentences: Their structure and meaning. London: Croom Helm.
Manning, C., Sag, I. A., and Iida, M. (1996). The lexical integrity of Japanese causatives. In T. Gunji (Ed.), Studies on the universality of constraint-based phrase structure grammars (9-37). Report of International Scientific Research Program Project 06044133. Osaka.
Manning, C., and Sag, I. A. (1997). Dissociations between argument structure and grammatical relations. Unpublished manuscript, Stanford University.
Miller, P. H. (1990). Pseudogapping and do so substitution. Proceedings of the 26th Meeting of the Chicago Linguistics Society (293-305). Chicago: Chicago Linguistics Society.
Miller, P. H., and Sag, I. A. (1997). French clitic movement without clitics or movement. NLLT, 15, 573-639.
Milsark, G. L. (1976). Existential sentences in English. Bloomington: IULC.
Palmer, F. R. (1979). Modality and the English modals. London: Longman.
Palmer, F. R. (1988). The English verb (2nd ed.). London: Longman.
Perkins, M. R. (1983). Modal expressions in English. London: Francis Pinter.
Pollard, C., and Sag, I. A. (1987). Information-based syntax and semantics, vol. I: Fundamentals. Stanford: Center for the Study of Language and Information.
Pollard, C., and Sag, I. A. (1994). Head-driven Phrase Structure Grammar. Stanford: CSLI and Chicago: University of Chicago Press.
Pollock, J.-Y. (1989). Verb movement, Universal Grammar, and the structure of IP. Linguistic Inquiry, 20, 365-424.
Quirk, R., Greenbaum, S., Leech, G., and Svartvik, J. (1985). A comprehensive grammar of the English language. London and New York: Longman.
Riehemann, S. (1994). Morphology and the hierarchical lexicon. Unpublished manuscript, Stanford University.
Sag, I. A. (1979). The nonunity of anaphora. Linguistic Inquiry, 10, 152-164.
Sag, I. A. (1997). English relative clause constructions. Journal of Linguistics, 33, 431-483.
Sag, I. A., and Wasow, T. (1997). Syntactic theory: A formal introduction. Unpublished manuscript, Stanford University. (Partial draft of Sept 1997)
Warner, A. R. (1993a). English auxiliaries: Structure and history. Cambridge: Cambridge University Press.
Warner, A. R. (1993b). The grammar of English auxiliaries: An account in HPSG. York Research Papers in Linguistics, YLLS/RP 1993-4. York: Department of Language and Linguistic Science, University of York.
Wechsler, S. (1995). The semantic basis of argument structure. Stanford: CSLI Publications.
Williams, E. S. (1984). There-insertion. Linguistic Inquiry, 15, 131-153.
Zwicky, A. M., and Pullum, G. K. (1983). Cliticization vs. inflection: English n't. Language, 59, 502-513.
THE DISCRETE NATURE OF SYNTACTIC CATEGORIES: AGAINST A PROTOTYPE-BASED ACCOUNT

FREDERICK J. NEWMEYER
Department of Linguistics
University of Washington
Seattle, Washington
1. PROTOTYPES, FUZZY CATEGORIES, AND GRAMMATICAL THEORY

1.1. Introduction

There are many diverse approaches to generative grammar, but what all current models share is an algebraic approach to the explanation of grammatical phenomena.1 That is, a derivation consists of the manipulation of discrete formal objects drawn from a universal vocabulary. Foremost among these objects are the syntactic categories: NP, V, S, and so on. The inventory of categories has changed over the years and differs from model to model. Likewise, their distribution has been constrained by proposals such as X-bar theory, feature subcategorization schemes, and the current (albeit controversial) distinction between lexical and functional categories. Nevertheless, what has remained constant, for the past two decades at least, is the idea that among the primitives of grammatical theory are discrete categories whose members have equal status as far as grammatical processes are concerned. That is, the theory does not regard one lexical item as being "more of a noun" than another, or restrict some process to apply only to the "best sorts" of NP.2 This classical notion of categories has been challenged in recent years by many
221
Copyright © 2000 by Academic Press All rights of reproduction in any form reserved. 0092-4563/99 $30
working within the frameworks of functional and cognitive linguistics (see especially Comrie, 1989; Croft, 1991; Cruse, 1992; Dixon, 1977; Heine, 1993; Hopper and Thompson, 1984, 1985; Langacker, 1987, 1991; Taylor, 1989; Thompson, 1988). In one alternative view, categories have a prototype structure, which entails the following two claims for linguistic theory: (1)
Categorial Prototypicality: a. Grammatical categories have "best case" members and members that systematically depart from the "best case." b. The optimal grammatical description of morphosyntactic processes involves reference to degree of categorial deviation from the "best case."
Representatives of both functional linguistics and cognitive linguistics have taken categorial prototypicality as fundamental to grammatical analysis, as the following quotes from Hopper and Thompson (leading advocates of the former) and Langacker (a developer of the latter) attest (I have added emphasis in both passages): It is clear that the concept of prototypicality (the centrality vs. peripherality of instances which are assigned to the same category) has an important role to play in the study of grammar. Theories of language which work with underlying, idealized structures necessarily ignore very real differences, both crosslinguistic and intra-linguistic, among the various degrees of centrality with which one and the same grammatical category may be instantiated. (Hopper and Thompson, 1985:155) How then will the theory achieve restrictiveness? Not by means of explicit prohibitions or categorical statements about what every language must have, but rather through a positive characterization of prototypicality and the factors that determine it.... The theory will thus incorporate substantive descriptions of the various kinds of linguistic structures with the status of prototypes. (Langacker, 1991:513-514)
These approaches attribute prototype structure to (virtually) all of the constructs of grammar, not just the syntactic categories (see, for example, the treatment of the notion "subject" along these lines in Bates and MacWhinney, 1982; Langendonck, 1986; Silverstein, 1976; and Van Oosten, 1986). However, this chapter will focus solely on the syntactic categories. Another position that challenges the classical approach to grammatical categories is that they have nondistinct boundaries: (2) Fuzzy Categories: The boundaries between categories are nondistinct. My impression is that the great majority of functionalists accept categorial prototypicality, and a sizable percentage accept fuzzy categories. Comrie (1989) and Taylor (1989), for example, are typical in that respect. However, Langacker
(1991), while accepting an internal prototype structure for categories, rejects the idea that the boundaries between them are nondistinct, arguing that syntactic categories can be defined by necessary and sufficient semantic conditions. Wierzbicka (1990) accepts this latter conception, but rejects prototypes. She writes: In too many cases, these new ideas [about semantic prototypes] have been treated as an excuse for intellectual laziness and sloppiness. In my view, the notion of prototype has to prove its usefulness through semantic description, not through semantic theorizing. (p. 365)
And Heine (1993), on the basis of studies of grammaticalization, was led to accept fuzzy categories, but to reject categorial prototypicality. In his view, the internal structure of categories is based on the concept of "degree of family resemblance" rather than "degree of prototypicality." The specific goal of this chapter is to defend the classical theory of categories. First, it will provide evidence against categorial prototypicality by rebutting (1b), namely the idea that descriptively adequate grammars need to make reference to the degree of prototypicality of the categories taking part in grammatical processes. To the extent that it is successful, it will thereby provide evidence against (1a) as well. Since grammatical behavior gives us the best clue as to the nature of grammatical structure, any refutation of (1b) ipso facto presents a strong challenge to (1a). To be sure, it is possible to hold (1a), but to reject (1b). Such a view would entail the existence of judgments that categories have "best-case" and "less-than-best-case" members, without the degree of "best-casedness" actually entering into grammatical description. Does anybody hold such a position? It is not clear. George Lakoff seems to leave such a possibility open. He writes that "prototype effects ... are superficial phenomena which may have many sources" (1987:56) and stresses at length that the existence of such effects for a particular phenomenon should not be taken as prima facie evidence that the mind represents that phenomenon in a prototype structure (see in particular his discussion of prototype effects for even and odd numbers in chapter 9). On the other hand, his discussion of strictly grammatical phenomena suggests that he does attribute to grammatical categories a graded structure with inherent degrees of membership, and the degree of membership is relevant to syntactic description (see his discussion of "nouniness" on pages 63-64, discussed in section 3.4.3 below).
In any event, in this chapter I will be concerned only with theories advocating the conjunction of claims (1a) and (1b). That is, I will attend only to approaches in which descriptively adequate grammars are said to make reference (in whatever way) to graded categorial structure. Limitations of space force me to ignore a number of topics that are relevant to a full evaluation of all facets of prototype theory. In particular, I will not address the question of whether nonlinguistic cognitive categories have a prototype
structure. Much has appeared in the psychological literature on this topic, and a wide variety of opinions exist (see, for example, Armstrong, Gleitman, and Gleitman, 1983; Dryer, 1997; Fodor and Lepore, 1996; Kamp and Partee, 1995; Keil, 1989; Lakoff, 1987; Mervis and Rosch, 1981; Rosch and Lloyd, 1978; and Smith and Osherson, 1988). However, the evidence for or against a prototype structure for grammatical categories can, I feel, be evaluated without having to take into account what has been written about the structure of semantic, perceptual, and other cognitive categories. The question of whether grammatical categories have a prototype structure is, to a degree, independent of whether they can be defined notionally, that is, whether they can be defined by necessary and sufficient semantic conditions. The arguments put forward to support notional definitions of categories will be addressed and challenged in Newmeyer (1998). Second, I will argue against fuzzy categories. Nothing is to be gained, either in terms of descriptive or explanatory success, in positing categorial continua. The remainder of section 1 provides historical background to a prototype-based approach to syntactic categories. Section 2 discusses how prototype theory has been applied in this regard and discusses the major consequences that have been claimed to follow from categories having a prototype structure. Section 3 takes on the evidence that has been adduced for (1b) on the basis of the claim that prototypical members of a category manifest more morphosyntactic complexity than nonprototypical members. I argue that the best account of the facts makes no reference, either overtly or covertly, to categorial prototypicality. Section 4 argues against fuzzy categories and is followed by a short conclusion (section 5). 1.2. 
Squishes and Their Legacy Prototype theory was first proposed in Rosch (1971/1973) to account for the cognitive representation of concepts and was immediately applied to that purpose in linguistic semantics (see Lakoff, 1972, 1973).3 This work was accompanied by proposals for treating syntactic categories in nondiscrete terms, particularly in the work of J. R. Ross (see especially Ross, 1973a, 1973b, 1975). Ross attempted to motivate a number of "squishes," that is, continua both within and between categories, among which were the "Fake NP Squish," illustrated in (3), and the "Nouniness Squish," illustrated in (5). Consider first the Fake NP Squish: (3)
The Fake NP Squish (Ross, 1973a): a. Animates b. Events c. Abstracts d. Expletive it e. Expletive there f. Opaque idiom chunks
Progressing downward from (3a) to (3f) in the squish, each type of noun phrase was claimed to manifest a lower degree of noun phrase status than the type above it. Ross's measure of categorial goodness was the number of processes generally characteristic of the category that the NP type was able to undergo. Consider the possibility of reapplication of the rule of Raising. The "best" sort of NPs, animates,4 easily allow it (4a), "lesser" NPs, events, allow it only with difficulty (4b), while "poor" NPs, idiom chunks, do not allow it at all (4c):

(4) a. John is likely to be shown to have cheated.
    b. ?The performance is likely to be shown to have begun late.
    c. *No headway is likely to have been shown to have been made.

Ross proposed the "Nouniness Squish" to illustrate a continuum between categories. Progressing from the left end to the right end, the degree of sententiality seems to decrease and that of noun phrase-like behavior to increase:

(5) The "Nouniness Squish" (Ross, 1973b:141): that clauses > for-to clauses > embedded questions > Acc-ing complements > Poss-ing complements > action nominals > derived nominals > underived nominals

Ross put forward the strong claim that syntactic processes apply to discrete segments of the squish. For example, preposition deletion must apply before that and for-to complements (6a), may optionally apply before embedded questions (6b), and may not apply before more "nouny" elements (6c):

(6) a. I was surprised (*at) that you had measles.
    b. I was surprised (at) how far you could throw the ball.
    c. I was surprised *(at) Jim's victory.

Given the apparent fact that different processes apply to different (albeit contiguous) segments of the squish, Ross was led to reject the idea of rigid boundaries separating syntactic categories. In other words, Ross's approach involved hypothesizing both categorial prototypicality and fuzzy categories.
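Ross's strong claim has a simple formal shape: each process is licensed on exactly one contiguous segment of the ordered scale. The sketch below is a toy formalization of that idea, not Ross's own notation; the segment boundary for preposition deletion restates examples (6a-c), and the function and variable names are my own.

```python
# Toy model (not Ross's formalism) of the Nouniness Squish in (5): each
# syntactic process is licensed on one contiguous segment of the scale.

NOUNINESS_SQUISH = [
    "that clause", "for-to clause", "embedded question", "Acc-ing",
    "Poss-ing", "action nominal", "derived nominal", "underived nominal",
]

# Hypothetical licensing segments: (first index, last index), inclusive.
SEGMENTS = {
    # Preposition deletion: possible before that/for-to clauses and
    # embedded questions, impossible before "nounier" complements,
    # as in (6a-c).
    "preposition deletion": (0, 2),
}

def may_apply(process, complement):
    """True if the process's contiguous segment covers this complement."""
    lo, hi = SEGMENTS[process]
    return lo <= NOUNINESS_SQUISH.index(complement) <= hi
```

On this toy model, may_apply("preposition deletion", "embedded question") holds while may_apply("preposition deletion", "derived nominal") does not, mirroring the contrast between (6b) and (6c).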
By the end of the 1970s, however, very few syntactic analyses were still being proposed that involved squishes. Ross's particular approach to categorial continua was problematic in a number of ways. For one thing, it did not seek to provide a more general explanation for why categories should have the structure that he attributed to them. Second, his formalization of the position occupied by an element in the squish, the assignment of a rating between 0 and 1, struck many linguists as arbitrary and unmotivated. No reasonable set of criteria, for example, was ever proposed to determine if an abstract NP merited a rating of, say, .5 or .6 on the noun phrase scale. Third, Ross dealt with sentences in isolation, abstracted away from their extralinguistic context. Since at
that time those linguists who were the most disillusioned with generative grammar were the most likely to take a more "sociolinguistic" approach to grammar, Ross's silence on the discourse properties of the sentences he dealt with seemed to them to be only a slight departure from business as usual. And finally, some doubts were raised about the robustness of the data upon which the squishes were based. Gazdar and Klein (1978) demonstrated that one of them (the "Clausematiness Squish" of Ross, 1975) did not exhibit statistically significant scalar properties that would not show up in an arbitrary matrix. But while Ross's particular approach was abandoned, the central core of his ideas about grammatical categories has lived on. In particular, many linguists continued to accept the idea that they have a prototype structure and/or have fuzzy boundaries. The 1980s saw the development of alternatives to generative grammar that have attempted to incorporate such ideas about categorial structure into grammatical theory. It is to these approaches that we now turn, beginning with an examination of prototypes within functional linguistics.
2. PROTOTYPE THEORY AND SYNTACTIC CATEGORIES Among linguists who take a prototype approach to syntactic categories there is considerable disagreement as to how to define the prototypical semantic and pragmatic correlates of each category. Just to take the category "adjective," for example, we find proposals to characterize its prototypical members in terms of a set of concepts such as "dimension," "physical property," "color," and so on (Dixon, 1977); their "time-stability" (Givon, 1984); their role in description, as opposed to classification (Wierzbicka, 1986); and their discourse functions (which overlap with those of verbs and nouns respectively) of predicating a property of an existing discourse referent and introducing a new discourse referent (Thompson, 1988). This lack of consensus presents a bit of a dilemma for anyone who, like this author, would wish to evaluate the success of prototype theory for syntax without undertaking an exhaustive critique of all positions that have been put forward as claiming success in this matter. My solution will be to adopt for purposes of discussion what I feel is the best motivated, most elaborate, and most clearly explicated proposal for categorial prototypicality, namely that presented in Croft (1991). His proposals for the prototypical semantic and pragmatic properties of noun, adjective, and verb are summarized in Table 1. In other words, the prototypical noun has the pragmatic function of reference, it refers to an object with a valency of 0 (i.e., it is nonrelational), and it is stative, persistent, and nongradable. The prototypical verb has the pragmatic function of
TABLE 1
PROTOTYPICAL CORRELATIONS OF SYNTACTIC CATEGORIES

                     Noun          Adjective      Verb
Semantic class       Object        Property       Action
Valency              0             1              ≥1
Stativity            state         state          process
Persistence          persistent    persistent     transitory
Gradability          nongradable   gradable       nongradable
Pragmatic function   Reference     Modification   Predication

From Croft, 1991:55, 65.
predication, it refers to an action, it has a valency of 1 or greater, and is a transitory, nongradable process. The links between semantic class and pragmatic function are, of course, nonaccidental (see Croft, 1991:123), though I will not explore that matter here. Table 1 characterizes the most prototypical members of each category, but not their internal degrees of prototypicality. Most prototype theorists agree that definite human nouns are the most prototypical, with nonhuman animates less prototypical, followed by inanimates, abstract nouns, and dummy nouns such as it and there. As far as adjectives are concerned, Dixon (1977) finds that words for age, dimension, value, and color are likely to belong to the adjective class, however small it is, suggesting that adjectives with these properties make up the prototypical core of that category. Words for human propensities and physical properties are often encoded as nouns and verbs respectively, suggesting that their status as prototypical adjectives is lower than members of the first group. Finally, Croft notes that it is difficult to set up an elaborated prototypicality scale for verbs. However, there seems to be no disagreement on the point that causative agentive active verbs carrying out the pragmatic function of predication are the most prototypical, while nonactive verbs, including "pure" statives and psychological predicates are less so. It is important to stress that the approach of Croft (and of most other contemporary functional and cognitive linguists) differs in fundamental ways from that developed by Ross in the 1970s.5 Most importantly, it adds the typological dimension that was missing in Ross's squishes. Prototypes are not determined, as for Ross, by the behavior of particular categories with respect to one or more grammatical rules in a particular language. Rather, the prototypes for the syntactic categories are privileged points in cognitive space, their privileged position being
determined by typological grammatical patterns. Hence, no direct conclusion can be drawn from the hypothesized universal (cognitive) properties of some prototypical syntactic category about how that category will behave with respect to some particular grammatical process in some particular language. Indeed, it is consistent with Croft's approach that there may be languages in which the category Noun, say, shows no prototype effects at all. Another difference has to do with the structure of categories themselves. Ross assumes that all nonprototypical members of a category can be arranged on a one-dimensional scale leading away from the prototype, that is, hierarchically. Croft, on the other hand, assumes a radial categorial structure (Lakoff, 1987). In such an approach, two nonprototypical members of a category need not be ranked with respect to each other in terms of degree of prototypicality. Croft's theory thus makes weaker claims than Ross's. One might even wonder how the notion "prototypicality" surfaces at all in grammatical description. Croft explains: These [markedness, hierarchy, and prototype] patterns are universal, and are therefore part of the grammatical description of any language. Language-specific facts involve the degree to which typological universals are conventionalized in a particular language; e.g., what cut-off point in the animacy hierarchy is used to structurally and behaviorally mark direct objects. (Croft, 1990:154)
In other words, grammatical processes in individual languages are sensitive to the degree of deviation of the elements participating in them from the typologically established prototype.
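Croft's picture of a universal hierarchy plus a language-specific cut-off can be sketched in a few lines. Everything below, including the four-step animacy hierarchy and the object-marking function, is a hypothetical illustration of the logic, not a claim about any actual language.

```python
# Sketch of Strong Cut-off Point Prototypicality: the hierarchy is taken
# to be universal; each language conventionalizes only a cut-off point
# (here, for structurally marking direct objects). All data are invented.

ANIMACY_HIERARCHY = ["human", "animate", "inanimate", "abstract"]

def marks_object(cutoff, noun_class):
    """A language with the given cut-off marks every object at or above
    (to the left of) that point on the hierarchy."""
    h = ANIMACY_HIERARCHY
    return h.index(noun_class) <= h.index(cutoff)

# A language with cutoff "human" marks only human objects; one with
# cutoff "inanimate" marks everything except abstracts.
```

Because marking is defined by a single index comparison, the model cannot describe a language that marks inanimate objects while leaving animate ones unmarked; that impossibility is exactly the constraint on possible grammars that Strong Cut-off Point Prototypicality is meant to impose.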
3. PROTOTYPICALITY AND PARADIGMATIC COMPLEXITY

The most frequently alluded to morphosyntactic manifestation of prototypicality is that it correlates with what might be called "paradigmatic complexity." That is, more prototypical elements are claimed to have a greater number of distinct forms in an inflectional paradigm than less prototypical elements or to occur in a larger range of construction types than less prototypical elements. In this section, I will challenge the idea that any correlation other than the roughest sort holds between paradigmatic complexity and prototypicality. My conclusion will serve to undermine the grammatical evidence for the idea that categories are nondiscrete. In section 3.1, I review the evidence that has been adduced for the correlation between categorial prototypicality and paradigmatic complexity. Section 3.2 outlines the various positions that could be—and have been—taken to instantiate this correlation in a grammatical description. Section 3.3 shows that for three well-studied phenomena, the postulated correlation is not robust, while section 3.4 presents alternative explanations for phenomena that have been claimed to support a strict correlation and hence nondiscrete categories.

3.1. Paradigmatic Complexity and Prototypes

Croft (1991:79-87) defends the idea that the prototypical member of a category manifests more distinct forms in an inflectional paradigm than the nonprototypical member and occurs in a broader range of construction types. As he notes (p. 79), each major syntactic category is associated with a range of inflectional categories, though of course languages differ as to which they instantiate:

Nouns: number (countability), case, gender, size (augmentative, diminutive), shape (classifiers), definiteness (determination), alienability;
Adjectives: comparative, superlative, equative, intensive ("very Adj"), approximative ("more or less Adj" or "Adj-ish"), agreement with head;
Verbs: tense, aspect, mood, and modality, agreement with subject and object(s), transitivity.

Croft argues that there is systematicity to the possibility of a particular category's bearing a particular inflection. Specifically, if a nonprototypical member of that category in a particular language allows that inflection, then a prototypical member will as well. Crosslinguistically, participles and infinitives, two nonpredicating types of verbal elements, are severely restricted in their tense, aspect, and modality possibilities. (Nonprototypical) stative verbs have fewer inflectional possibilities than (prototypical) active verbs (e.g., they often cannot occur in the progressive). Predicate Ns and (to a lesser extent) predicate As are often restricted morphosyntactically. Predicate Ns in many languages do not take determiners; predicate As do not take the full range of adjectival suffixes, and so on.
The same can be said for mass nouns, incorporated nouns, and so on—that is, nouns that do not attribute reference to an object. Furthermore, nonprototypical members of a syntactic category seem to have a more restricted syntactic distribution than prototypical members. As Ross (1987: 309) remarks: "One way of recognizing prototypical elements is by the fact that they combine more freely and productively than do elements which are far removed from the prototypes." This point is amply illustrated by the Fake NP Squish (3). Animate nouns are more prototypical than event nouns, which are more prototypical than abstract nouns, which are more prototypical than idiom chunks. As degree of prototypicality declines, so does freedom of syntactic distribution. The same appears to hold true of verbs. In some languages, for example, only action verbs may occur in the passive construction.
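Croft's systematicity claim is an implicational one: the inflections available to a less prototypical member of a category must be a subset of those available to any more prototypical member. The small sketch below makes the prediction checkable; the verb scale and the paradigm sets are invented toy data, not attested paradigms.

```python
# Checks the implicational claim: along a prototypicality scale (most
# prototypical first), each member's inflections must include all of
# those allowed by every less prototypical member. Toy data only.

def respects_implication(scale, allows):
    """scale: members ordered most -> least prototypical.
    allows: member -> set of permitted inflections."""
    for i, more_proto in enumerate(scale):
        for less_proto in scale[i + 1:]:
            if not allows[less_proto] <= allows[more_proto]:
                return False
    return True

VERB_SCALE = ["active", "stative", "participle", "infinitive"]

# Nested paradigms obey the claim ...
NESTED = {
    "active": {"tense", "aspect", "mood"},
    "stative": {"tense", "mood"},
    "participle": {"tense"},
    "infinitive": set(),
}
# ... but an inflection available only to statives would violate it.
GAPPED = dict(NESTED, stative={"tense", "mood", "progressive"})
```

The English progressive data discussed in section 3.3.1 are, in effect, a GAPPED-style paradigm, which is why they tell against the strict version of the claim.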
3.2. Paradigmatic Complexity and Claims about Grammar-Prototype Interactions

The idea that inflectional and structural elaboration declines with decreasing categorial prototypicality has been interpreted in several different ways. Four positions can be identified that express this idea. In order of decreasing strength, they are "Direct Mapping Prototypicality," "Strong Cut-off Point Prototypicality," "Weak Cut-off Point Prototypicality," and "Correlation-only Prototypicality." I will now discuss them in turn. According to Direct Mapping Prototypicality, morphosyntactic processes make direct reference to the degree of prototypicality of the elements partaking of those processes. In other words, part of our knowledge of our language is a Prototypicality Hierarchy and a grammar-internal mapping from that hierarchy to morphosyntax. Ross's squishes are examples of Direct Mapping Prototypicality. As I interpret his approach, the correlation in English between the position of a noun on the Prototypicality Hierarchy and its ability to allow the reapplication of Raising (see 4a-c) is to be expressed directly in the grammar of English. In Strong Cut-off Point Prototypicality, the effects of prototype structure are encoded in the grammar of each language, but there is no language-particular linking between gradations in prototypicality and gradations in morphosyntactic behavior. To repeat Croft's characterization of this position: These [markedness, hierarchy, and prototype] patterns are universal, and are therefore part of the grammatical description of any language. Language-specific facts involve the degree to which typological universals are conventionalized in a particular language; e.g., what cut-off point in the animacy hierarchy is used to structurally and behaviorally mark direct objects. (Croft, 1990:154)

One can think of Strong Cut-off Point Prototypicality as a constraint on possible grammars. For example, it would prohibit (i.e., predict impossible) a language in other respects like English, but in which the reapplication of Raising would be more possible with nonprototypical NPs than with prototypical ones. Weak Cut-off Point Prototypicality allows a certain number of arbitrary exceptions to prototype-governed grammatical behavior. Thus it would admit the possibility that the reapplication of Raising could apply more freely to a less prototypical NP than to a more prototypical one, though such cases would be rather exceptional. I interpret the analyses of Hungarian definite objects in Moravcsik (1983) and English there-constructions in Lakoff (1987) as manifesting Weak Cut-off Point Prototypicality. The central, prototypical, cases manifest the phenomenon in question, and there is a nonrandom, yet at the same time unpredictable, linking between the central cases and the noncentral ones.
One can think of Strong Cut-off Point Prototypicality as a constraint on possible grammars. For example, it would prohibit (i.e., predict impossible) a language in other respects like English, but in which the reapplication of Raising would be more possible with nonprototypical NPs than with prototypical ones. Weak Cut-off Point Prototypicality allows a certain number of arbitrary exceptions to prototype-governed grammatical behavior. Thus it would admit the possibility that the reapplication of Raising could apply to a less prototypical NP than to a more prototypical one, though such cases would be rather exceptional. I interpret the analyses of Hungarian definite objects in Moravcsik (1983) and English there-constructions in Lakoff (1987) as manifesting Weak Cut-off Point Prototypicality. The central, prototypical, cases manifest the phenomenon in question, and there is a nonrandom, yet at the same time unpredictable, linking between the central cases and the noncentral ones. Correlation-only Prototypicality is the weakest position of all. It simply states
that there is some nonrandom relationship between morphosyntactic behavior and degree of prototypicality. 3.3. On the Robustness of the Data Supporting Cut-off Point Prototypicality In this section, I will demonstrate that for three well-studied phenomena, Cutoff Point Prototypicality, in both its strong and weak versions, is disconfirmed. At best, the data support Correlation-only Prototypicality. 3.3.1. THE ENGLISH PROGRESSIVE English is quite poor in "choosy" inflections, but it does permit one test of the correlation between prototypicality and paradigmatic complexity. This is the marker of progressive aspect, -ing. Certainly it is true, as (7a-b) illustrates, that there is a general correlation of categorial prototypicality and the ability to allow progressive aspect (note that both verbs are predicating): (7) a. Mary was throwing the ball. b. *Mary was containing 10 billion DNA molecules. However, we find the progressive with surely nonprototypical temporary state and psychological predicate verbs (8a-b), but disallowed with presumably more prototypical achievement verbs (9): (8) a. The portrait is hanging on the wall of the bedroom, b. I'm enjoying my sabbatical year. (9)
*I'm noticing a diesel fuel truck passing by my window.
Furthermore, we have "planned event progressives," where the possibility of progressive morphology is clearly unrelated to the prototypicality of the verb (cf. grammatical (10a) and ungrammatical (10b)):

(10) a. Tomorrow, the Mariners are playing the Yankees.
     b. *Tomorrow, the Mariners are playing well.

In short, the English progressive directly falsifies the idea that there is a cut-off point on the scale of prototypicality for verbs, such that verbs on one side of the cut-off point allow the inflection while those on the other side forbid it. Furthermore, the exceptions (i.e., the verbs of lesser prototypicality that allow the progressive) do not appear to be simple arbitrary exceptions. Therefore, the facts do not support Weak Cut-off Point Prototypicality either.
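The shape of this argument can be made explicit with a short sketch: if one cut-off on the verb scale governed the progressive, acceptability would be monotone along the scale. The ranking below is an assumption for illustration; the judgments restate examples (7)-(9).

```python
# If a single cut-off governed the progressive, judgments would be True
# on an initial segment of the scale and False thereafter. The data from
# (7)-(9) are not monotone under the (assumed) ranking below.

VERB_SCALE = [   # most to least prototypical (assumed ranking)
    "throw",     # action:       (7a) Mary was throwing the ball.
    "notice",    # achievement:  (9)  *I'm noticing a truck ...
    "enjoy",     # psych state:  (8b) I'm enjoying my sabbatical year.
    "contain",   # pure state:   (7b) *Mary was containing ...
]

PROGRESSIVE_OK = {"throw": True, "notice": False,
                  "enjoy": True, "contain": False}

def single_cutoff_explains(scale, ok):
    """True iff acceptability never flips from False back to True as
    prototypicality decreases, i.e. one cut-off point suffices."""
    vals = [ok[v] for v in scale]
    return all(a or not b for a, b in zip(vals, vals[1:]))
```

Under this sketch the progressive data admit no single cut-off, however the four verbs are ranked among their classes, which is why only the Correlation-only position survives the data in this section.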
One could, of course, attempt to by-pass this conclusion simply by exempting English progressive inflection from exhibiting prototype effects in any profound way. One might, for example, appeal to some semantic or pragmatic principles that account for when one finds or does not find progressive morphology. Indeed, I have no doubt that such is the correct way to proceed (for discussion, see Goldsmith and Woisetschlaeger, 1982; Kearns, 1991; Smith, 1991; Zegarac, 1993; and Swart, 1998). But the point is that degree of verbal prototypicality fails utterly to account for when one finds progressive morphology in English. Therefore the facts lend no support to Croft's claim that the prototypical member of a category manifests more distinct forms in an inflectional paradigm than the nonprototypical member.6

3.3.2. ADJECTIVES

Dixon (1977:22-23) cites two languages that distinguish a certain subset of adjectives morphologically. Rotuman (Churchward, 1940) has an open-ended adjective class, but only (the translations of) the following 12 have distinct singular and plural forms: big; long; broad; whole, complete; black; small; short; narrow, thin; old; white; red; female. Acooli (Crazzolara, 1955) has a closed class of about 40 adjectives, 7 of which have distinct singular and plural forms: great, big, old (of persons); big, large (of volume); long, high, distant (of place and time); good, kind, nice, beautiful; small, little; short; bad, bad tasting, ugly. The remaining adjectives translate as new; old; black; white; red; deep; shallow; broad; narrow; hard; soft; heavy; light; wet; unripe; coarse; warm; cold; sour; wise. These two languages, then, refute Strong Cut-off Point Prototypicality. While 11 of the 12 Rotuman adjectives fall into the prototypical adjective classes of age, dimension, value, and color (female would appear to be the exception), any number of adjectives in these classes do not have distinct singular and plural forms.
Since there is an open-ended number of adjectives in the language, and there is no reason to think that old is more prototypical than new or young, or female more prototypical than male, there is no cut-off point separating the prototypical from the nonprototypical. And in Acooli there are even more putatively prototypical adjectives in the class with only one form for number than in the class with two forms. Weak Cut-off Point Prototypicality does not fare much better. It is true that no nonprototypical adjectives (except for the word for female in Rotuman) have two number possibilities. But in this language the exceptions turn out to be the norm: 12 forms out of an open-ended number and 7 out of 40 do not provide very convincing support for what is put forward as a universal statement about prototypicality. Weak Cut-off Point Prototypicality is in even more serious trouble for Turkish. Croft (1991:128), citing work by Lewis (1967), mentions that only a subset of
Turkish adjectives allow reduplication for intensification. These include "basic color terms, 'quick,' 'new,' and 'long,' as well as less prototypical adjectives."7

3.3.3. ENGLISH VERBAL ALTERNATIONS

As we have noted, Strong Cut-off Point Prototypicality predicts that there should be no grammatical processes applying only to nonprototypical forms. Levin (1993) has provided us with the means to test this hypothesis with respect to English verbal alternations. Depending on how one counts, she describes from several dozen to almost one hundred such alternations. Significantly, many of these are restricted to nonprototypical stative and psychological predicates. The following are some of the alternations that illustrate this point:

Various subject alternations:
(11) a. The world saw the beginning of a new era in 1492.
     b. 1492 saw the beginning of a new era.
(12) a. We sleep five people in each room.
     b. Each room sleeps five people.
(13) a. The middle class will benefit from the new tax laws.
     b. The new tax laws will benefit the middle class.

There-insertion:
(14) a. A ship appeared on the horizon.
     b. There appeared a ship on the horizon.

Locative inversion:
(15) a. A flowering plant is on the window sill.
     b. On the window sill is a flowering plant.

3.4. Some Explanations for Prototypicality Effects

We have seen several examples that seem to falsify cut-off point prototypicality. And yet, there are undeniable correlations between prototypicality and the possibility of morphosyntactic elaboration. In general, actives do progressivize more easily than statives; it would seem that in general certain adjective classes allow more structural elaboration than others; and undeniably there are more verbal alternations in English involving active verbs (or an active and its corresponding stative) than statives alone. Any theory of language should be able to explain why this correlation exists.
This section will examine a number of English syntactic processes that manifest such correlations and have thereby been invoked to suggest that categories are
234
Frederick J. Newmeyer
nondiscrete. For each case, I will argue that the facts fall out from a theory with discrete categories and independently needed principles.
3.4.1. MEASURE VERBS AND PASSIVE An old problem is the fact that English measure verbs (cost, weigh, measure, etc.) do not passivize: (16) a. The book cost a lot of money. b. John weighed 180 pounds. (17) a. * A lot of money was cost by the book. b. * 180 pounds was weighed by John. The earliest work in generative grammar attempted to handle this fact by means of arbitrarily marking the verbs of this class not to undergo the rule (Lakoff, 1970). But there has always been the feeling that it is not an accident that such verbs are exceptional—there is something seemingly less "verb-like" about them than, say, an active transitive verb like hit or squeeze. Translated into prototype theory, one might suggest that measure verbs are "on the other side of the cut-off point" for passivization in English. And, in fact, such an analysis has been proposed recently in Ross (1995) (though Ross refers to "defectiveness" rather than to "lack of prototypicality"). I will now argue that these facts can be explained without recourse to prototypes.8 I believe that it was in Bresnan (1978) that attention was first called to the distinction between the following two sentences (see also Bach, 1980): (18) a. The boys make good cakes. b. The boys make good cooks. Notice that the NP following the verb in (18a) passivizes, while that following the verb in (18b) does not: (19) a. Good cakes are made by the boys. b. *Good cooks are made by the boys. Bresnan noted that the argument structures of the two sentences differ. Good cakes in (18a) is a direct object patient, while good cooks in (18b) is a predicate nominative. Given her theory that only direct objects can be "promoted" to subject position in passivization, the ungrammaticality of (19b) follows automatically. Turning to (16a-b), we find that, in crucial respects, the semantic relationship between subject, verb, and post-verbal NP parallels that of (18b).
"A lot of money" and "180 pounds" are predicate attributes of "the book" and "John" respectively. In a relationally based framework such as Bresnan's, the deviance of
(17a-b) has the same explanation as that of (19b). In a principles-and-parameters approach, a parallel treatment is available. Since "a lot of money" and "180 pounds" are predicates rather than arguments, there is no motivation for the NP to move, thereby accounting for the deviance of the passives.9 Crucially, there is no need for the grammar of English to refer to the degree of prototypicality either of the verb or of the NP that follows it. 3.4.2. THERE AS A NONPROTOTYPICAL NP Recall Ross's Fake NP Squish, repeated below: (3) The Fake NP Squish a. Animates b. Events c. Abstracts d. Expletive it e. Expletive there f. Opaque idiom chunks Expletive there occupies a position very low on the squish. In other words, it seems to manifest low NP-like behavior. First, let us review why one would want to call it an NP at all. The reason is that it meets several central tests for NP status. It raises over the verb seem and others of its class (20a); it occurs as a passive subject (20b); it inverts over auxiliaries (20c); and it can be coindexed with tags (20d): (20) a. There seems to be a problem. b. There was believed to be a problem. c. Is there a problem? d. There is a problem, isn't there?
The null hypothesis, then, is that there is an NP, with nothing more needing to be said. Now let us review the data that led Ross to conclude that rules applying to NPs have to be sensitive to their categorial prototypicality. He gives the following ways that there behaves like less than a full NP (the asterisks and question marks preceding each sentence are Ross's assignments): It doesn't undergo the rule of "promotion" (21a-b); it doesn't allow raising to reapply (22a-b); it doesn't occur in the "think of . . . as X" construction (23a-b) or the "what's . . . doing X" construction (24a-b); it doesn't allow "being-deletion" (25a-b); it doesn't occur in dislocation constructions (26a-b); it doesn't undergo tough-movement (27a-b), topicalization (28a-b), "swooping" (29a-b), "equi" (30a-b), or conjunction reduction (31a-b). In each case an example of a more prototypical NP is illustrated that manifests the process:
Promotion: (21) a. Harpo's being willing to return surprised me. / Harpo surprised me by being willing to return. b. There being heat in the furnace surprised me. / *There surprised me by being heat in the furnace. Double raising: (22) a. John is likely _ to be shown _ to have cheated. b. ?*There is likely _ to be shown _ to be no way out of this shoe. Think of . . . as NP: (23) a. I thought of Freud as being wiggy. b. *I thought of there as being too much homework. What's . . . doing X?: (24) a. What's he doing in jail? b. *What's there doing being no mistrial? Being deletion: (25) a. Hinswood (being) in the tub is a funny thought. b. There *(being) no more Schlitz is a funny thought. Left dislocation: (26) a. Those guys, they're smuggling my armadillo to Helen. b. *There, there are three armadillos in the road. Tough movement: (27) a. John will be difficult to prove to be likely to win. b. *There will be difficult to prove likely to be enough to eat. Topicalization: (28) a. John, I don't consider very intelligent. b. *There, I don't consider to be enough booze in the eggnog. Swooping: (29) a. I gave Sandra my zwieback, and she didn't want any. / I gave Sandra, and she didn't want any, my zwieback. b. I find there to be no grounds for contempt proceedings, and there may have been previously. / *I find there, which may have been previously, to be no grounds for contempt proceedings.
Equi: (30) a. After he laughed politely, Oliver wiped his mustache. / After laughing politely, Oliver wiped his mustache. b. After there is a confrontation, there's always some good old-time head-busting. / *After being a confrontation, there's always some good old-time head-busting. Conjunction reduction: (31) a. Manny wept and Sheila wept. / Manny and Sheila wept. b. There were diplodocuses, there are platypuses, and there may well also be diplatocodypuses. / *There were diplodocuses, are platypuses, and may well also be diplatocodypuses. I wish to argue that all of these distinctions follow from the lexical semantics of there and the pragmatics of its use. What does expletive there mean? The tradition in generative grammar has been to call there a meaningless element, or else to identify it as an existential quantifier with no intrinsic sense. Cognitive linguists, on the other hand, following earlier work by Dwight Bolinger (1977), have posited lexical meaning for it. To Lakoff (1987), for example, expletive there designates conceptual space itself, rather than a location in it. To Langacker (1991:352), there designates an abstract setting construed as hosting some relationship. In fact, we achieve the same results no matter which option we choose. Meaningless elements / abstract settings / conceptual spaces are not able to intrude into one's consciousness, thus explaining (21b) and (23b). (24b) is bad because abstract settings, and so on, cannot themselves "act"; rather they are the setting for action. Furthermore, such elements are not modifiable (29-30) nor able to occur as discourse topics (26-28). (25b) is ungenerable, given the uncontroversial requirement that there occur with a verb of existence. In my opinion and that of my consultants, (22b) and (31b) are fully acceptable. In short, the apparent lack of prototypical NP behavior of expletive there is a direct consequence of its meaning and the pragmatics of its use.
Nothing is gained by requiring that the rules that affect it pay attention to its degree of prototypicality. Syntax need no more be sensitive to prototypicality to explain the examples of (21-31) than we need a special syntactic principle to rule out (32): (32) The square circle elapsed the dragon. As a general point, the possibility of syntactic elaboration correlates with the diversity of pragmatic possibilities. Concrete nouns make, in general, better topics, better focuses, better new referents, better established referents, and so on than do abstract nouns. We can talk about actions in a wider variety of discourse contexts and for a greater variety of reasons than states. The syntactic
accommodation to this fact is a greater variety of sentence types in which objects and actions occur than abstract nouns and states. There is no reason to appeal to the prototypicality of the noun or the verb. 3.4.3. ENGLISH IDIOM CHUNKS Notice that in the Fake NP Squish, idiom chunks occupy an even lower position than expletive there. Lakoff (1987:63-64), in his presentation of cognitive linguistics, endorsed the idea that their behavior is a direct consequence of their low prototypicality and even went so far as to claim that idiom chunk NPs can be ranked in prototypicality with respect to each other. Drawing on unpublished work by Ross (1981), he ranked four of them as follows, with one's toe the highest in prototypicality, and one's time the lowest: (33) a. to stub one's toe b. to hold one's breath c. to lose one's way d. to take one's time
Lakoff (for the most part citing Ross's examples) argued that each idiom was more restricted in its syntactic behavior than the next higher in the hierarchy. For example, only to stub one's toe can be converted into a past participle-noun sequence: (34) a. A stubbed toe can be very painful. b. *Held breath is usually fetid when released. c. *A lost way has been the cause of many a missed appointment. d. *Taken time might tend to irritate your boss.
To stub one's toe and to hold one's breath allow gapping in their conjuncts: (35) a. I stubbed my toe, and she hers. b. I held my breath, and she hers. c. *I lost my way, and she hers. d. *I took my time, and she hers.
Pluralization possibilities distinguish to stub one's toe from to hold one's breath, and both of these from to lose one's way and to take one's time. When to stub one's toe has a conjoined subject, pluralization is obligatory; for to hold one's breath it is optional; and for the latter two it is impossible: (36) a. Betty and Sue stubbed their toes. b. *Betty and Sue stubbed their toe. c. Betty and Sue held their breaths. d. Betty and Sue held their breath. e. *Betty and Sue lost their ways. f. Betty and Sue lost their way. g. *Betty and Sue took their times. h. Betty and Sue took their time.
Finally, Lakoff judges all but to take one's time to allow pronominalization: (37) a. I stubbed my toe, but didn't hurt it. b. Sam held his breath for a few seconds, and then released it. c. Harry lost his way, but found it again. d. *Harry took his time, but wasted it.
Lakoff concludes: In each of these cases, the nounier nouns follow the general rule . . . while the less nouny nouns do not follow the rule. As the sentences indicate, there is a hierarchy of nouniness among the examples given. Rules differ as to how nouny a noun they require. (Lakoff, 1987:64).
In all cases but one, however, I have found an independent explanation for the range of judgments on these sentences. Beginning with the participle test, we find that (for whatever reason) held and taken never occur as participial modifiers, even in their literal senses: (38) a. *Held cats often try to jump out of your arms. b. *The taken jewels were never returned. I confess to not being able to explain (34c), since lost does occur in this position in a literal sense: (39) A lost child is a pathetic sight.
Turning to Gapping, sentences (40a-d) show that the facts cited by Lakoff have nothing to do with the idioms themselves:10 (40) a. I lost my way, and she her way. b. I took my time, and she her time. c. ?I ate my ice cream and she hers. d. In the race to get to the airport, Mary and John lost their way, but we didn't lose ours (and so we won).
(40a-b) illustrate that the idioms lose one's way and take one's time do indeed allow gapping in their conjuncts. The correct generalization appears to lie in discourse factors. Gapping apparently requires a contrastive focus reading of the gapped constituent. Hence (40c) seems as bad as (35c-d), while (40d) is fine. In the examples involving plurals, what seems to be involved is the ability to individuate. We can do that easily with toes and less easily, but still possibly, with
breaths. But we cannot individuate ways and times. So, nonplural (41a) is impossible—legs demand individuation—while plural (41b) is impossible as well, for the opposite reason. Rice in its collective sense is not individuated: (41) a. *Betty and Sue broke their leg. b. *My bowl is full of rices. Finally, (42a) and (42b) illustrate that time in take one's time is not resistant to pronominalization. (42a) is much improved over (37d) and (42b) is impeccable: (42) a. Harry took his time, and wasted it. b. Harry took his time, which doesn't mean that he didn't find a way to waste it. Again, there is no reason whatever to stipulate that grammatical processes have to be sensitive to the degree of prototypicality of the NP. Independently needed principles—none of which themselves crucially incorporate prototypicality—explain the range of acceptability. 3.4.4. EVENT STRUCTURE AND INFLECTIONAL POSSIBILITIES Let us further explore why there is in general a correlation between degree of prototypicality and inflectional possibilities. Since Vendler (1967) it has been customary to divide the aspectual properties of verbs (and the propositions of which they are predicates) into four event types, generally referred to as "states," "processes," "achievements," and "accomplishments." States (know, resemble) are not inherently bounded, have no natural goal or outcome, are not evaluated with respect to any other event, and are homogeneous. Processes (walk, run) are durative events with no inherent bound. Achievements (die, find, arrive) are momentary events of transition, while Accomplishments (build, destroy) are durative events with a natural goal or outcome. There have been a number of proposals for the representation of event structure (Dowty, 1979; Grimshaw, 1990; Pustejovsky, 1995). The following are the proposals of Pustejovsky (1991): (43) a. States:
b. Processes: c. Achievements and Accomplishments have the same schematic structure (both are called "transitions"), though the former are nonagentive and the latter agentive. [Pustejovsky's event-structure diagrams for (43a-c) are not reproduced here.]
Two observations are in order. The first is that there is a general increase in the complexity of event structure from states to accomplishments. The second is that this increase in complexity corresponds roughly to the degree of prototypicality for verbs. From these observations, we may derive the reason for the correlation between prototypicality and inflectional possibilities to hold for verbs as a general tendency. There is clearly a mapping between the event structure of a proposition and those aspects of morphosyntactic structure in which tense, aspect, modality, and so on are encoded. In standard varieties of principles-and-parameters syntax, again, this is the "functional structure" representation of the sentence. Now, the more complex the event structure of a proposition, the more aspectual possibilities that it allows. Hence, the more complex (in terms of number of projections) the functional structure can be. And, of course, it follows that the possibilities for inflection will be greater. In other words, we have derived a general correlation between degree of prototypicality and potential richness of verbal inflection without any reference to prototypicality per se. It should be pointed out that this approach demands that functional projections exist only where they are motivated (i.e., that there can be no empty projections). Otherwise, there would be no difference in functional structure between states and accomplishments. In other words, it presupposes the principle of Minimal Projection, proposed and defended in Grimshaw (1993). According to this principle, a projection must be functionally interpreted, that is, it must make a contribution to the functional representation of the extended projection of which it is part.11 The correlation between semantic complexity and inflectional possibilities holds for nouns as well. Objects (e.g., people, books, automobiles, etc.)
can be individuated and specified in a way that abstract nouns such as liberty and dummy nouns like there cannot. So it follows that the semantic structure of concrete nouns will have the general potential to map onto more layers of nominal functional structure than that of abstract nouns. There are two problems, however, that are faced by an exclusively semantic account of restrictions on inflection. The first is that some languages have inflections that are restricted to some forms but not others, even though there appears to be no semantic explanation for the restriction. For example, Croft (1990:82) notes that process verbs in Quiche take tense-aspect inflectional prefixes, while stative verbs do not, and writes: There is no apparent reason for this, since there is no semantic incompatibility between the inflectional prefixes ... and in fact in a language like English stative predicates do inflect for tense. It is simply a grammatical fact regarding the
expression of stative predicates in Quiche. As such, it provides very strong evidence for the markedness of stative predicates compared to process predicates.
Although I do not pretend to have a full explanation for cases such as these, I would venture to guess that pragmatic factors are overriding semantic ones. Both process verbs and stative verbs can logically manifest tense, aspect, and modality, though in many discourse contexts such distinctions are irrelevant for the latter. Thus pragmatic factors have kept the grammars of Quiche and languages manifesting similar phenomena from grammaticalizing tense, aspect, and modality for stative verbs. No appeal to prototypicality is necessary. A second problem with a semantic approach to inflection is that inflections that appear to be semantically empty also manifest prototype effects. Take agreement inflections, for example. As is noted in Croft (1988), where languages have a "choice" as to object agreement, it is always the more definite and/or animate (i.e., more prototypical) direct object that has the agreement marker. Does this fact support prototype theory? Not necessarily: In the same paper Croft argues that agreement has a pragmatic function, namely to index important or salient arguments. So, if agreement markers are not "pragmatically empty," then their presence can be related to their discourse function and need not be attributed to the inherently prototypical properties of the arguments to which they are affixed.
4. THE NONEXISTENCE OF FUZZY CATEGORIES We turn now to the question of whether categories have distinct boundaries, or, alternatively, whether they grade one into the other in fuzzy squish-like fashion. I examine two phenomena that have been appealed to in support of fuzzy categories—English near (section 4.1) and the Nouniness Squish (section 4.2)—and conclude that no argument for fuzzy categories can be derived from them. 4.1. English Near Ross (1972) analyzes the English word near as something between an adjective and a preposition. Like an adjective, it takes a preposition before its object (44a) and like a preposition, it takes a bare object (44b): (44) a. The shed is near to the barn. b. The shed is near the barn. So it would appear, as Ross concluded, that there is a continuum between the categories Adjective and Preposition, and near is to be analyzed as occupying a position at the center of the continuum. A prototype theorist would undoubtedly
conclude that the intermediate position of near is a consequence of its having neither prototypical adjectival nor prototypical prepositional properties. In fact, near can be used either as an adjective or a preposition. Maling (1983) provides evidence for the former categorization. Like any transitive adjective, it takes a following preposition (45a); when that preposition is present (i.e., when near is adjectival), the degree modifier must follow it (45b); it takes a comparative suffix (45c); and we find it (somewhat archaically) in prenominal position (45d): (45) a. The gas station is near to the supermarket. b. Near enough to the supermarket c. Nearer to the supermarket d. The near shore
Near passes tests for Preposition as well. It takes a bare object (46a); when it has a bare object it may not be followed by enough (46b), but may take the prepositional modifier right (46c): (46) a. The gas station is near the supermarket. b. *The gas station is near enough the supermarket.12 c. The gas station is right near (*to) the supermarket. It is true that, as a Preposition, it uncharacteristically takes an inflected comparative: (47) The gas station is nearer the supermarket than the bank. But, as (48) shows, other prepositions occur in the comparative construction; it is only in its being inflected, then, that near distinguishes itself:13 (48) The seaplane right now is more over the lake than over the mountain. Thus I conclude that near provides no evidence for categorial continua. 4.2. The Nouniness Squish Recall Ross's Nouniness Squish (5), repeated below: (5) The "Nouniness Squish": that clauses > for to clauses > embedded questions > Acc-ing complements > Poss-ing complements > action nominals > derived nominals > underived nominals This squish grades nouns (or, more properly, the phrases that contain them) along a single dimension—their degree of nominality. Subordinate and relative clauses introduced by the complementizer that are held to be the least nominal; underived nominals (i.e., simple nouns) are held to be the most nominal. But, according to Ross, there is no fixed point at which the dominating phrase node
ceases to be S and starts to be NP; each successive element on the squish is held to be somewhat more nominal than the element to its left. As Ross is aware, demonstrating a fuzzy boundary between S and NP entails (minimally) showing that syntactic behavior gradually changes as one progresses along the squish; that is, that there is no place where S "stops" and NP "starts." We will now examine two purported instances of this graduality. As we will see, the facts are perfectly handlable in an approach that assumes membership in either S or NP. First, Ross claims that "the nounier a complement is, the less accessible are the nodes it dominates to the nodes which command the complement" (Ross, 1973b:174). That is, it should be harder to extract from a phrase headed by an underived N than from a that clause. He illustrates the workings of this principle with the data in (49) and concludes that "the dwindling Englishness of [these] sentences supports [this principle]" (p. 175): (49) a. I wonder who he resented (it) that I went steady with. b. I wonder who he would resent (it) for me to go steady with. c. *I wonder who he resented how long I went steady with. d. ?I wonder who he resented me going out with. e. ??I wonder who he resented my going out with. f. ?*I wonder who he resented my careless examining of. g. ?*I wonder who he resented my careless examination of. h. ?*I wonder who he resented the daughter of.
But one notes immediately that, even on Ross's terms, we do not find consistently "dwindling Englishness": (49c) is crashingly bad. Furthermore, he admits (in a footnote) that many speakers find (49h) fine. In fact, the data seem quite clear to me: (49a-b, d-e, h) are acceptable, and (49c, f-g) are not. The latter three sentences are straightforward barriers violations, given the approach of Chomsky (1986), while the others violate no principles of Universal Grammar. The degree of "nouniness" plays no role in the explanation of these sentences. Second, Ross suggests that the phenomenon of pied piping—that is, wh-movement carrying along material dominating the fronted wh-phrase—is sensitive to degree of nouniness. He cites (50a-f) in support of this idea. It would appear, he claims, that the more nouny the dominating phrase, the more pied piping is possible: (50) a. *Eloise, [for us to love [whom]] they liked, is an accomplished washboardiste. b. *Eloise, [us loving [whom]] they liked, is an accomplished washboardiste. c. *Eloise, [our loving [whom]] they liked, is an accomplished washboardiste. d. ?*Eloise, [our loving of [whom]] they liked, is an accomplished washboardiste. e. ?Eloise, [our love for [whom]] they liked, is an accomplished washboardiste. f. Eloise, [a part of [whom]] they liked, is an accomplished washboardiste. Again, there is no support for a categorial continuum in these data. For to clauses, Acc-ing complements, and Poss-ing complements are all dominated by the node S, which can never pied pipe. Hence (50a-c) are ungrammatical. (50d-e), on the other hand, are both fully grammatical, though this is masked by the stylistic awkwardness of the loving . . . liked sequence. By substituting was a joy to our parents for they liked, some of the awkwardness is eliminated and both sentences increase in acceptability.
5. CONCLUSION The classical view of syntactic categories assumed in most models of generative grammar has seen two major challenges. In one, categories have a prototype structure, in which they have "best-case" members and members that systematically depart from the "best case." In this approach, the optimal grammatical description of morphosyntactic processes is held to involve reference to degree of categorial deviation from the "best case." The second challenge hypothesizes that the boundaries between categories are nondistinct, in the sense that one grades gradually into another. This chapter has defended the classical view, arguing that categories have discrete boundaries and are not organized around central "best cases." It has argued that many of the phenomena that seem to suggest the inadequacy of the classical view are best analyzed in terms of the interaction of independently needed principles from syntax, semantics, and pragmatics.
NOTES 1 An earlier version of this chapter was presented at Universidade Federal de Minas Gerais, Universidade de Campinas, Universidade Federal do Rio de Janeiro, the University of California at San Diego, and the University of Washington, as well as at two conferences: The International Conference on Syntactic Categories (University of Wales) and the Fifth International Pragmatics Conference (National Autonomous University of Mexico). It has benefited, I feel, from discussion with Paul K. Andersen, Leonard Babby, Robert
Borsley, Ronnie Cann, William Croft, John Goldsmith, Jeanette Gundel, Ray Jackendoff, Jurgen Klausenburger, Rob Malouf, Pascual Masullo, Edith Moravcsik, Elizabeth Riddle, Margaret Winters, and Arnold Zwicky. I have no illusions, however, that any of these individuals would be entirely happy about the final product. For deeper discussion of many of the issues treated here, see Newmeyer (1998). 2 In most generative approaches, categories have an internal feature structure, which allows some pairs of categories to share more features than others and individual categories to be unspecified for particular features. Head-driven Phrase Structure Grammar (HPSG) goes further, employing default-inheritance mechanisms in the lexicon. These lead, in a sense, to some members of a category being "better" members of that category than others (see, for example, the HPSG treatment of auxiliaries in Warner (1993a, b)). A similar point can be made for the "preference rules" of Jackendoff and Lerdahl (1981) and Jackendoff (1983). Nevertheless, in these (still algebraic) accounts, the distance of an element from the default setting is not itself directly encoded in the statement of grammatical processes. 3 For more recent work on prototype theory and meaning representation, see Coleman and Kay (1981); Lakoff (1987); Geeraerts (1993); and many of the papers in Rudzka-Ostyn (1988), and Tsohatzidis (1990). For a general discussion of prototypes, specifically within the framework of cognitive linguistics, see Winters (1990). 4 Ross's work did not employ the vocabulary of the then nascent prototype theory. However, as observed in Taylor (1989:189), his reference to "copperclad, brass-bottomed NP's" (p. 98) to refer to those at the top of the squish leaves no doubt that he regarded them as the most "prototypical" in some fundamental sense. 5 I am indebted to William Croft (personal communication) for clarifying the differences between his approach and Ross's.
6 Along the same lines, Robert Borsley informs me (personal communication) that Welsh and Polish copulas provide evidence against the idea that prototypical members of a category necessarily have more inflected forms than nonprototypical members. One assumes that the copula is a nonprototypical verb, but in Welsh it has five (or six) tenses compared with three (or four) for a standard verb, and in Polish it has three tenses, compared with two for a standard verb. On the other hand, one might take the position expressed in Croft (1991) that copulas are categorially auxiliaries, rather than verbs. 7 And it should be pointed out that Dixon says that words in the semantic field of "speed" (e.g., quick) tend to lag behind the four most prototypical classes in their lexicalization as adjectives. 8 I would like to thank Pascual Masullo and Ray Jackendoff (personal communication) for discussing with me the alternatives to the prototype-based analysis. 9 Adger (1992, 1994) offers a treatment of measure verbs roughly along these lines. In his analysis, measure phrases, being "quasi-arguments" (i.e., not full arguments), do not raise to the specifier of Agreement, thereby explaining the impossibility of passivization. Indeed, thematic role-based analyses, going back at least to Jackendoff (1972), are quite parallel. For Jackendoff, measure phrases are "locations." They are unable to displace the underlying "theme" subjects of verbs such as cost or weigh, since "location" is higher on the thematic hierarchy than "theme." Calling them "locations," it seems to me, is simply another way of saying that they are not true arguments. 10 I am indebted to Ronnie Cann for pointing this out to me.
11 Grimshaw (1997) derives Minimal Projection from the principles of Economy of Movement and Oblig Heads. 12 Maling (1983) judges sentences of this type acceptable, and on that basis rejects the idea that near is a P. I must say that I find (46b) impossible. 13 Presumably the inflectional possibilities of near are properties of its neutralized lexical entry, not of the ADJ or P branch of the entry.
REFERENCES Adger, D. (1992). The licensing of quasi-arguments. In P. Ackema and M. Schoorlemmer (Eds.), Proceedings of ConSole I (pp. 1-18). Utrecht: Utrecht University. Adger, D. (1994). Functional heads and interpretation. Unpublished Ph.D. thesis, University of Edinburgh. Armstrong, S. L., Gleitman, L., and Gleitman, H. (1983). What some concepts might not be. Cognition, 13, 263-308. Bach, E. (1980). In defense of passive. Linguistische Berichte, 70, 38-46. Bates, E., and MacWhinney, B. (1982). Functionalist approaches to grammar. In E. Wanner and L. Gleitman (Eds.), Language acquisition: The state of the art (pp. 173-218). Cambridge: Cambridge University Press. Bolinger, D. (1977). Meaning and form. English Language Series 11. London: Longman. Bresnan, J. W. (1978). A realistic transformational grammar. In M. Halle, J. Bresnan, and G. Miller (Eds.), Linguistic theory and psychological reality (pp. 1-59). Cambridge, MA: MIT Press. Chomsky, N. (1986). Barriers. Cambridge, MA: MIT Press. Churchward, C. M. (1940). Rotuman grammar and dictionary. Sydney: Australasian Medical Publishing Co. Coleman, L., and Kay, P. (1981). Prototype semantics. Language, 57, 26-44. Comrie, B. (1989). Language universals and linguistic typology (2nd ed.). Chicago: University of Chicago Press. Crazzolara, J. P. (1955). A study of the Acooli language. London: Oxford University Press. Croft, W. (1988). Agreement vs. case marking and direct objects. In M. Barlow and C. A. Ferguson (Eds.), Agreement in natural language: Approaches, theories, descriptions (pp. 159-179). Stanford, CA: Center for the Study of Language and Information. Croft, W. (1990). Typology and universals. Cambridge: Cambridge University Press. Croft, W. (1991). Syntactic categories and grammatical relations. Chicago: University of Chicago Press. Cruse, D. A. (1992). Cognitive linguistics and word meaning: Taylor on linguistic categorization. Journal of Linguistics, 28, 165-184. Dixon, R. M. W. (1977). 
Where have all the adjectives gone? Studies in Language, 1, 1-80. Dowty, D. R. (1979). Word meaning and Montague grammar. Dordrecht: Reidel. Dryer, M. S. (1997). Are grammatical relations universal? In J. Bybee, J. Haiman, and
248
Frederick J. Newmeyer
S. A. Thompson (Eds.), Essays on language function and language type (pp. 115-143). Amsterdam: John Benjamins. Fodor, J. A., and Lepore, E. (1996). The red herring and the pet fish: Why concepts still can't be prototypes. Cognition, 58, 253-270. Gazdar, G., and Klein, E. (1978). Review of Formal semantics of natural language by E. L. Keenan (ed.). Language, 54, 661-667. Geeraerts, D. (1993). Vagueness's puzzles, polysemy's vagaries. Cognitive Linguistics, 4, 223-272. Givon, T. (1984). Syntax: A functional-typological introduction (vol. 1). Amsterdam: John Benjamins. Goldsmith, J., and Woisetschlaeger, E. (1982). The logic of the English progressive. Linguistic Inquiry, 13, 79-89. Grimshaw, J. (1990). Argument structure. Cambridge, MA: MIT Press. Grimshaw, J. (1993). Minimal projection, heads, and optimality. Technical Report 4. Piscataway, NJ: Rutgers Center for Cognitive Science. Grimshaw, J. (1997). Projection, heads, and optimality. Linguistic Inquiry, 28, 373-422. Heine, B. (1993). Auxiliaries: Cognitive forces and grammaticalization. New York: Oxford University Press. Hopper, P. J., and Thompson, S. A. (1984). The discourse basis for lexical categories in universal grammar. Language, 60, 703-752. Hopper, P. J., and Thompson, S. A. (1985). The iconicity of the universal categories 'noun' and 'verb.' In J. Haiman (Ed.), Iconicity in syntax (pp. 151-186). Amsterdam: John Benjamins. Jackendoff, R. (1972). Semantic interpretation in generative grammar. Cambridge, MA: MIT Press. Jackendoff, R. (1983). Semantics and cognition. Cambridge, MA: MIT Press. Jackendoff, R., and Lerdahl, F. (1981). Generative music theory and its relation to psychology. Journal of Music Theory, 25, 45-90. Kamp, H., and Partee, B. H. (1995). Prototype theory and compositionality. Cognition, 57, 129-191. Kearns, K. S. (1991). The semantics of the English progressive. Unpublished Ph.D. dissertation, MIT. Keil, F. C. (1989). Concepts, kinds, and cognitive development.
Cambridge, MA: Bradford Books. Lakoff, G. (1970). Irregularity in syntax. New York: Holt, Rinehart, and Winston. Lakoff, G. (1972). Hedges: A study in meaning criteria and the logic of fuzzy concepts. Chicago Linguistic Society, 8, 183-228. Lakoff, G. (1973). Fuzzy grammar and the performance/competence terminology game. Chicago Linguistic Society, 9, 271-291. Lakoff, G. (1987). Women, fire, and dangerous things: What categories reveal about the mind. Chicago: University of Chicago Press. Langacker, R. W. (1987). Nouns and verbs. Language, 63, 53-94. Langacker, R. W. (1991). Foundations of cognitive grammar: Vol. 2. Descriptive application. Stanford, CA: Stanford University Press. Langendonck, W. van (1986). Markedness, prototypes, and language acquisition. Cahiers de l'Institut de linguistique de Louvain, 12, 39-76.
Syntactic Categories: Against Prototype
249
Levin, B. (1993). English verb classes and alternations: A preliminary investigation. Chicago: University of Chicago Press. Lewis, G. L. (1967). Turkish grammar. Oxford: Oxford University Press. Maling, J. (1983). Transitive adjectives: A case of categorial reanalysis. In F. Heny and B. Richards (Eds.), Linguistic categories: Auxiliaries and related puzzles 1: Categories (pp. 253-289). Dordrecht: Reidel. Mervis, C. B., and Rosch, E. (1981). Categorization of natural objects. Annual Review of Psychology, 32, 89-115. Moravcsik, E. A. (1983). On grammatical classes—the case of "definite" objects in Hungarian. Working Papers in Linguistics, 15, 75-107. Newmeyer, F. J. (1998). Language form and language function. Cambridge, MA: MIT Press. Pustejovsky, J. (1991). The syntax of event structure. Cognition, 41, 47-81. Pustejovsky, J. (1995). The generative lexicon. Cambridge, MA: MIT Press. Rosch, E. (1971/1973). On the internal structure of perceptual and semantic categories. In T. E. Moore (Ed.), Cognitive development and the acquisition of language (pp. 111-144). New York: Academic Press. Rosch, E., and Lloyd, B. B. (Eds.) (1978). Cognition and categorization. Hillsdale, NJ: Erlbaum. Ross, J. R. (1972). The category squish: Endstation Hauptwort. Chicago Linguistic Society, 8, 316-328. Ross, J. R. (1973a). A fake NP squish. In C.-J. N. Bailey and R. Shuy (Eds.), New ways of analyzing variation in English (pp. 96-140). Washington: Georgetown. Ross, J. R. (1973b). Nouniness. In O. Fujimura (Ed.), Three dimensions of linguistic theory (pp. 137-258). Tokyo: TEC Company, Ltd. Ross, J. R. (1975). Clausematiness. In E. L. Keenan (Ed.), Formal semantics of natural language (pp. 422-475). London: Cambridge University Press. Ross, J. R. (1981). Nominal decay. Unpublished ms., MIT. Ross, J. R. (1987). Islands and syntactic prototypes. Chicago Linguistic Society, 23, 309-320. Ross, J. R. (1995). Defective noun phrases. Chicago Linguistic Society, 31, 398-440. Rudzka-Ostyn, B. (Ed.)
(1988). Topics in cognitive linguistics. Amsterdam: John Benjamins. Silverstein, M. (1976). Hierarchy of features and ergativity. In R. M. W. Dixon (Ed.), Grammatical categories in Australian languages (pp. 112-171). Canberra: Australian Institute of Aboriginal Studies. Smith, C. (1991). The parameter of aspect. Dordrecht: Kluwer. Smith, E. E., and Osherson, D. N. (1988). Conceptual combination with prototype concepts. In A. Collins and E. E. Smith (Eds.), Readings in cognitive science: A perspective from psychology and artificial intelligence (pp. 323-335). San Mateo, CA: Morgan Kaufmann. Swart, H. de (1998). Aspect shift and coercion. Natural Language and Linguistic Theory. Taylor, J. R. (1989). Linguistic categorization: Prototypes in linguistic theory. Oxford: Clarendon. Thompson, S. A. (1988). A discourse approach to the cross-linguistic category 'adjective.' In J. Hawkins (Ed.), Explaining language universals (pp. 167-185). Oxford: Basil Blackwell.
Tsohatzidis, S. L. (Ed.) (1990). Meanings and prototypes: Studies in linguistic categorization. London: Routledge. Van Oosten, J. (1986). The nature of subjects, topics, and agents: A cognitive explanation. Bloomington, IN: Indiana University Linguistics Club. Vendler, Z. (1967). Linguistics in philosophy. Ithaca, NY: Cornell University Press. Warner, A. R. (1993a). English auxiliaries: Structure and history. Cambridge: Cambridge University Press. Warner, A. R. (1993b). The grammar of English auxiliaries: An account in HPSG. York Research Papers in Linguistics Research Paper (YLLS/RP 1993-4), 1-42. Wierzbicka, A. (1986). What's in a noun? (Or: How do nouns differ in meaning from adjectives?). Studies in Language, 10, 353-389. Wierzbicka, A. (1990). 'Prototypes save': On the uses and abuses of the notion of 'prototype' in linguistics and related fields. In S. L. Tsohatzidis (Ed.), Meanings and prototypes: Studies in linguistic categorization (pp. 347-367). London: Routledge. Winters, M. E. (1990). Toward a theory of syntactic prototypes. In S. L. Tsohatzidis (Ed.), Meanings and prototypes: Studies in linguistic categorization (pp. 285-306). London: Routledge. Zegarac, V. (1993). Some observations on the pragmatics of the progressive. Lingua, 90, 201-220.
SYNTACTIC COMPUTATION AS LABELED DEDUCTION: WH A CASE STUDY

RUTH KEMPSON* WILFRIED MEYER-VIOL+ DOV GABBAY+

*Department of Philosophy
King's College London
University of London
London, United Kingdom

+Department of Computing
King's College London
University of London
London, United Kingdom
1. THE QUESTION

Over the past 30 years, the phenomenon of long-distance dependence has become one of the most intensively studied phenomena in syntax. Requiring as it does a correlation between some position in a string and the c-commanding operator which determines its interpretation, it is uncontroversially assumed across different theoretical frameworks to involve an operator-variable binding phenomenon as in standard predicate logics (cf. Chomsky, 1981; Morrill, 1994; Pollard and Sag, 1991; Johnson and Lappin, 1997). However, it is known to display a number of properties which distinguish it from the logical operation of quantifier-variable binding, and these discrepancies are taken to be indicative of the syntactic idiosyncrasy of
natural language formalisms. Investigation of these properties has led to the postulation of increasing numbers of discrete phenomena. Until recently there has been little attempt to ask why the overall cluster of wh-processes exists (for recent partial attempts, cf. Cheng, 1991; Müller and Sternefeld, 1996).1 The primary purpose of this chapter is to propose an answer to this question. Having set out an array of largely familiar data in section 1, in section 2 we develop the LDSNL framework, within which the analysis is set. This is a formal deductive framework being established as a model of the process of utterance interpretation. Then in section 3 we present a unified account of the crossover phenomenon, and in sections 4-5 we briefly indicate analyses of wh-in situ, multiple wh-questions, and partial wh-movement phenomena, showing how a typology of wh-variation emerges. In all cases, the solution will make explicit reference to the discrete stages whereby interpretation is incrementally built up in moving on a left-right basis from the initial empty state to the completed specification of a logical form corresponding to the interpretation of the string in context. The account is thus essentially procedural, in the sense of focusing not just on properties of some resulting interpretation, but on how it is established stepwise. In closing we reflect on the direction which this conclusion suggests—that the boundaries between syntax, semantics, and pragmatics need to be redrawn, with syntax redefined as the dynamic projection of structure within an abstract parsing schema.

1.1. Failure to Display Scopal Properties in Parallel with Quantifying Expressions

As is well known, wh-expressions fail to display scopal properties in parallel with quantifying expressions. An initial wh-expression may take narrow scope with respect to any operator following it, as long as that operator precedes the position of the gap.
Hence (1) allows answers to the question in which the wh-expression has been construed as taking scope relative to the expression every British farmer in the subordinate clause.

(1) What is the Union insisting that every British farmer should get rid of?
Answer: At least 1000 cattle.
Answer: His cattle.

On the assumption that scope is displayed in the syntactic structure assigned to the string, questions such as these appear to require an LF specification which displays the relative scope of the two expressions in contravention of the structure associated with the surface string. This behavior is quite unlike that of quantifiers in logical systems. A given quantifier may bind free variables if and only if these variables are within its scope, where this is defined by the rule of syntax that introduces that quantifier, hence by definition guaranteeing a configuration equivalent to c-command. Furthermore, other natural language quantifiers behave
much more like logical quantifiers, and in the main must be interpreted internally to the clause in which they are contained.2 Thus (2)-(3) are unambiguous. Neither can be interpreted with the quantified expression in the subordinate clause taking scope over the matrix subject:

(2) Every British farmer is complaining that most countries of the EU fail to appreciate the problem.
≠ 'For most countries of the EU x, every British farmer is complaining that x fails to appreciate the problem.'

(3) Most countries of the EU are responding that every British farmer fails to appreciate the seriousness of the problem.
≠ 'For every British farmer y, most countries of the EU are responding that y fails to appreciate the seriousness of the problem.'

This phenomenon can be analyzed by defining wh-expressions to be complex higher-type quantifiers simultaneously binding two positions, one of which is an invisible pronominal element (Chierchia, 1992), but this technical solution fails to provide any basis for explaining other phenomena associated with wh-expressions. Crossover phenomena in particular, though an essential piece of supporting evidence for this analysis, become a mere syntactic stipulation.

1.2. Crossover

Pretheoretically, the crossover phenomenon is simply the interaction between wh-construal and anaphora construal. Within the Government-Binding (GB) paradigm, it has been seen as dividing into at least three discrete phenomena (Chomsky, 1981; Lasnik and Stowell, 1991; Postal, 1993). The data are as follows:

(4)
*Who_i does Joan think that he_i worries e_i is sick?
(5) *Who_i does Joan think that his_i mother worries e_i is sick?
(6) *Whose_i exam results_j was he_i certain e_j would be better than anyone else's?
(7) Who_i does Joan think e_i worries his_i mother is sick?
(8) Who_i does Joan think e_i worries that he_i is sick?
(9) Whose_i exam results_j e_j were so striking that he_i was suspected of cheating?
(10) *John_i, who_i Sue thinks that he_i worries e_i is sick unnecessarily, was at the lecture.
(11) John_i, who_i his_i mother had ignored e_i, fell ill during the exam period.
(12) John_i, whose_i exam results_j he_i had been certain e_j would be better than anyone else's, failed dismally.
The need to distinguish discrete subclasses of phenomena arises from the analysis of the gap as a name, subject to Principle C of the A-binding principles (Chomsky, 1981). A strong crossover principle is said to preclude a gap (as a name) being coindexed with any c-commanding argument expression, hence precluding (4), (10), and possibly (6), while licensing (7)-(9) on the grounds that the relation between gap and wh-operator is a relation of A'-binding and not of A-binding. ((6) has been dubbed "extended strong crossover" because the wh-expression, being a possessive determiner, doesn't, strictly speaking, bind the gap, but a subexpression within it.) Such a restriction however fails to preclude (5) and (6), for which a separate restriction of weak crossover is set up. There are several versions of this principle (Higginbotham, 1981; Koopman and Sportiche, 1982; Chomsky, 1981; Lasnik and Stowell, 1991)—the simplest is that a pronoun which does not c-command a given trace may nevertheless not be coindexed with it if it is to the left of the trace (and to the right of the binding operator). This restriction in turn, however, fails to predict that in some circumstances it may be suspended, as in (11)-(12), and an alternative analysis is advocated in which the traces in this position are not names but pronominal-like "epithet" expressions. The phenomenon is thereby seen as a cluster of heterogeneous data, not amenable to a unified analysis. No explanation is proffered for why the data should be as they are, and Postal (1993) describes the phenomenon as a mystery.

1.3. Subjacency Effects and the Wh-Initial versus Wh-in Situ Asymmetry

There are also the familiar island restrictions associated with wh-initial expressions, which are alien to quantification in formal systems, and so, like the other data, differentiate long-distance dependency effects from regular operator-variable binding.
However, more striking is that wh-in situ expressions, despite commonly being said to be subject to the same movement as wh-initial expressions but at the level of LF (Reinhart, 1991; Aoun and Li, 1991; Huang, 1982), are characteristically not subject to these same restrictions. So, unlike (13), (14) allows an interpretation in which the wh-expression is construed, so to speak, externally to the domain within which it is situated:

(13)
*Which document did the journalist that leaked to the press apologize?
(14)
The journalist that leaked which document to the press became famous overnight?
The phenomenon of wh-in situ is arguably peripheral in English, but in languages where this is the standard form of wh-question, the distribution of wh-in situ, unless independently restricted (cf. the data of Iraqi Arabic below), is characteristically not subject to the same constraints as wh-movement (Chinese, Japanese, Malay) (data from Simpson, 1995):
(15) Ni bijiao xihuan [[ta zenmeyang zhu] de cai]? (Chinese)
you more like he how cook REL food
'What is the means x such that you prefer the dishes which he cooks by x?'

1.4. Multiple Wh-Structures

Paired with this phenomenon are multiple wh-questions, in which the initial wh-expression is subject to island restrictions, but the wh-expression in situ is not:

(16) Who do you think should review which book?
(17) *Who_i did the journalist leak the document in which Sue had criticized e_i to which press?
(18)
Who reported the journalist that leaked which document to the press?
1.5. Partial Wh-Movement

Of the set of data we shall consider, there is finally the phenomenon in German dubbed "partial wh-movement," in which apparently expletive wh-elements anticipate full wh-expressions later in the string but are not themselves binders of any gapped position.

(19)
Was glaubst du was Hans meint mit wem Jakob gesprochen hat?
'With whom do you think Hans thought/said Jakob had spoken?'
Such expletive elements must invariably take the form was in all complementizer positions between the initial position and the wh-expression they anticipate, but subsequent to that full wh-expression, the complementizer selected must be dass. This gives rise to a number of discrete forms, with identical interpretation: (20)
Was glaubst du mit wem Hans meint dass Jakob gesprochen hat?
(21) Mit wem glaubst du dass Hans meint dass Jakob gesprochen hat?
'With whom do you think Hans thought/said Jakob had spoken?'

This phenomenon, with minor variations, is widespread in languages in which the primary structure is the wh-in situ form. Iraqi Arabic, for example, has a reduced wh-expression which is attached to the verb, indicating the presence of a wh-expression in a subordinate clause. However, unlike German, the subordinate clause contains the full wh-expression in situ. Also unlike German, this element sh-, a reduced form of sheno (= 'what'), must precede the verb in each clause between the initial clause carrying the first instance of the expletive and the clause within which the full wh-expression itself occurs. Without sh-, the presence of the wh-in situ in a tensed clause is ungrammatical (data from Simpson, 1995):
(22) Mona raadat [tijbir Su'ad tisa'ad meno]?
Mona wanted to-force Suad to-help who
'Who did Mona want to force Suad to help?'

(23)
*Mona tsawwarat [Ali ishtara sheno]?
Mona thought Ali bought what
(Intended: 'What did Mona think that Ali bought?')
(24)
Sheno_i tsawwarit Mona [Ali ishtara e_i]?
what thought Mona Ali bought
'What did Mona think Ali bought?'
(25)
sh-tsawwarit Mona [Ali raah weyn]?
Q-thought Mona Ali went where
'Where did Mona think that Ali went?'
(26) sh-tsawwarit Mona [Ali ishtara sheno]?
Q-thought Mona Ali bought what
'What did Mona think Ali bought?'

These phenomena have only recently been subject to serious study, but their analysis in all frameworks remains controversial (McDaniel, 1989; Dayal, 1994; Simpson, 1995; Johnson and Lappin, 1997; Müller and Sternefeld, 1996). Faced with this apparent heterogeneity, it is perhaps not surprising that these phenomena are generally taken in isolation from each other, each requiring additional principles. Of those who provide a general account, Johnson and Lappin (1997) articulate an account within the Head-driven Phrase Structure Grammar (HPSG) framework which involves three distinct operators: a binding operator for wh-expressions, a discrete operator for wh-in situ, and yet a further operator to express the expletive phenomena. The primary task in the various theoretical paradigms seems to have been that of advocating sufficient richness within independently motivated frameworks to be able to describe the data. Little or no attention has been paid to why wh-expressions display this puzzling array of data.
2. THE PROPOSED ANSWER

The answer we propose demands a different, and more dynamic, perspective. Linguistic expressions will be seen to project not merely some logical form mirroring the semantic content assigned to a string, but also the set of steps involved in monotonically building up that structure. This dynamic projection of structure is set within a framework for modeling the process of utterance interpretation, which is defined as a left-to-right goal-directed task of constructing a propositional formula. Two concepts of content are assumed within the framework—the content associated with the logical form which results as the output of the structure-building process, and the content associated with the process itself—both reflected in lexical specifications. The process is modeled as the growth of a tree representing the logical form of some interpretation. The initial state of the growth process is merely the imposition of the goal to establish some propositional formula as interpretation. The output state is an annotated tree structure whose root node is annotated with a well-formed propositional formula compiled from annotations to constituent nodes in the tree. The emphasis at all stages is on the partial nature of the information made available at any one point: a primary commitment is to provide a representational account of the often observed asymmetry between the content encoded in some given linguistic input and its interpretation in context (cf. Sperber and Wilson, 1986, whose insights about context dependence this framework is designed to reflect). Wh-initial expressions will be seen as displaying such asymmetry. As clause-initial expressions they do not from that position project a uniquely determined position in the emergent tree structure, and this poses a problem to be resolved as the interpretation process proceeds from left to right through a string. Wh-in situ constructions are the mirror image of wh-initial constructions, their tree relation with their sister nodes being uniquely determined. Finally, partial movement constructions will emerge as a direct consequence of combining this underspecification analysis of wh with the dynamics of the goal-directed parsing task. The consequence of this shift to a more dynamic syntactic perspective is a much closer relation between formal properties of grammars and parsers, a consequence which we shall reflect on briefly in closing.
2.1. The Framework: A Labeled Deductive System for Natural Language—LDSNL

The general framework within which the analysis is set is a model of the process of natural language interpretation, where the goal is to model the pragmatic process of incrementally building up propositional structure from only partially specified input. The underlying aim is to model the process of understanding, reflecting at each step the partial nature of the information encoded in the string and the ways in which this information is enriched by choice mechanisms which fix some particular interpretation. The process is driven by a mixed deductive system—type deduction is used to project intraclausal structure (much as in Categorial Grammar—cf. Moortgat, 1988; Morrill, 1994; Oehrle, 1995), but there is in addition inference defined over databases as units for projecting interclausal (adjunct) structure (cf. Joshi and Kulick, 1997, for a simple composite type-deduction system). The background methodology assumed is that of Labeled Deductive Systems (LDS) (Gabbay, 1996). According to this methodology, mixed
logical systems can be defined, allowing systematically related phenomena to be defined together while keeping their discrete identity. A simple example is the correlation between the functional calculus and conditional logic (known as the Curry-Howard isomorphism), with functional application corresponding to Modus Ponens and lambda abstraction corresponding to Conditional Introduction. Thus we might define Modus Ponens for labeled formulae as:

(27) Modus Ponens for labeled formulae
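The rule display itself has not survived reproduction; on the standard Curry-Howard pattern just sketched, labeled Modus Ponens would take roughly the following form (a reconstruction, not necessarily the authors' exact notation):

```latex
\[
\frac{\alpha : A \rightarrow B \qquad \beta : A}{\alpha(\beta) : B}
\]
```

Here the labels record the functional-calculus side, so that each application of Modus Ponens on the formulae is twinned with function application on the labels.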
In the system we adopt here, intraclausal structure is built up by steps of type deduction, much as in Categorial Grammar, but, since the primary task is that of building up a propositional structure, we define the formula to be the expression being established, and the labels to be the set of specifications/constraints which drive that process. (28) provides the simplest type of example, displaying how type deduction (and its twinned operation of function application) drives the process of projecting a representation of propositional content which duly reflects the internal mode of combination:
The interpretation process is formalized in an LDS framework in which labels guide the parser in the goal-directed process of constructing labeled formulas, in a language in which these are defined together. Declarative units consist of pairs of sequences of labels followed by a content formula. The formula side of a declarative unit represents the content of the words supplied in the course of a parse. The labels annotate this content with linguistic features and control information guiding the direction of the parse process. Since the aim is to model the incremental way information is built up through a sequence of linguistic expressions, we shall need a vocabulary that enables us to describe how a label-formula constellation is progressively built up. With this in mind, declarative units are represented as finite sets of formulas (cf. section 2.2.1):
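As an illustration of the intended monotonic growth, a declarative unit can be modeled as a feature set that only ever gains annotations as the parse supplies labels and, eventually, a content formula. The encoding below is ours, not the authors'; only the predicate names Ty, Fo, and Tn follow the chapter:

```python
# Sketch: a declarative unit as a growing set of feature formulas.
# Predicate names (Ty, Fo, Tn) follow the chapter; the dict encoding is ours.

class DeclarativeUnit:
    """A set of label/formula annotations that only grows (monotonicity)."""

    def __init__(self):
        self.features = {}

    def annotate(self, predicate, value):
        # Monotonic growth: an established annotation may not be revised.
        if predicate in self.features and self.features[predicate] != value:
            raise ValueError(f"cannot revise {predicate}")
        self.features[predicate] = value
        return self

    def satisfies(self, predicate, value):
        return self.features.get(predicate) == value


# Incremental build-up of the unit for the subject 'John':
unit = DeclarativeUnit()
unit.annotate("Tn", "n'")          # tree-node label
unit.annotate("Ty", "e")           # type label
unit.annotate("Fo", "John")        # content formula
print(unit.satisfies("Ty", "e"))   # True
```

The `ValueError` branch is where this sketch enforces the chapter's requirement that information, once established, is never retracted.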
In the course of a parse these feature sets grow incrementally. The dynamics of the parse is given by a sequence of parse states, each of which is a partial description of a tree. The task is goal-driven, the goal being to establish a formula of type t using information as incrementally provided on a left-right basis by a given input string. At each state of the parse, there is one node under
development which constitutes the current task state. Each such task state has a database and a header. The database indicates the information established at that node so far. The header indicates the overall goal of that task—SHOW X for some type X; the tree node of the particular task being built up; the subgoal of the given task—what remains TO DO in the current task; and a specification of which task state it is. (29) displays the general format:
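Display (29) survives only in fragments. The ingredients it is described as containing (a header with the overall goal SHOW, the tree node, the outstanding subgoal TO DO, and a task number, sitting above a database) can be sketched as a hypothetical structure; the field names paraphrase the chapter's informal description and are our invention:

```python
# Sketch of a task state: header (goal, tree node, remaining subgoal,
# task number) plus a database of annotations established so far.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TaskState:
    show: str                  # overall goal of the task, e.g. "Ty(t)"
    tree_node: str             # node under development, e.g. "Tn(m)"
    todo: Optional[str]        # subgoal still outstanding (None = completed)
    task_number: int
    database: list = field(default_factory=list)

    def complete(self, annotation):
        """Record the established annotation and discharge the subgoal."""
        self.database.append(annotation)
        self.todo = None
        return self

# Initial state: goal and subgoal coincide (SHOW Ty(t), TO DO Ty(t)).
initial = TaskState(show="Ty(t)", tree_node="Tn(m)", todo="Ty(t)", task_number=0)
final = initial.complete("Fo(smile(John)), Ty(t)")
print(final.todo is None)  # True
```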
(⟨d⟩P = "P holds at a daughter of me")

In the initial state, the goal SHOW and the subgoal TO DO coincide: establish a formula of type t, for node m (the putative top node). In the final state, the goal of node m is fulfilled, and a propositional formula established. Each subtask set in fulfilling that task is assigned a task number, and described according to a
tree-node vocabulary which enables trees to be defined in terms of properties holding at their nodes, and relations between them. Successive steps of introduction rules introduce subtasks, which once completed combine in steps of elimination to get back to the initial task and its successful completion. Thus for example, the opening sequence of states given presentation of a subject NP introduces the subtasks, TO DO, of building first a formula of type e, and then a formula of type e→t (enabling a formula of type t to be derived). And correspondingly, the last action in the sequence of parse states is a step at which, both these subtasks having been completed, a step of Modus Ponens applying to the completed tasks establishes the goal of deducing a formula of type t at the root node m which constitutes the initial task state. The result is a sequence of task states each completed, so with no subtasks TO DO remaining outstanding. Notice that these progressively completed task states reflect the anticipation of structure corresponding to the semantic interpretation of the string; they are not a semantically blind assignment of syntactic structure. As the prototype sketch in (29) suggests, the system is an inferential building of a feature-annotated tree structure. One of its distinguishing properties is that this articulation of the process of building a tree structure is itself the syntactic engine, as driven by lexical specifications. There is no externally defined syntactic mechanism over and above this. Unlike other syntactic models, the system combines object-level information pertaining to the structure and interpretation of the resulting formula with metalevel information about the process of establishing it. So there is DECLARATIVE structure which indicates what the content is (type plus formula plus tree node). And there is IMPERATIVE structure which indicates what remains to be done.
2.2. The Logical Language

To express this degree of richness, we need a formal language with a number of basic predicates, a formula predicate, a label predicate, a tree-node predicate, each a monadic predicate of the form P(α) with values α. A composite language combines these various formulae in defining the structure within which the annotated tree node is built up.

2.2.1. THE FORMULA PREDICATE

The values of the formula predicate Fo are expressions of an extended quantifier-free lambda calculus LC. Terms are predicate constants sing, see, smile, and so on, and a range of lambda expressions; individual constants John, Mary, and so on.3 The quantifier-variable notation of predicate logic is replaced by epsilon (equivalent to ∃) and tau (equivalent to ∀) terms, each with a restrictive clause nested within the term itself. For example, a man is projected as εx(x, man(x)). In
addition, there are a range of specialized metavariables, "m-variables." These are annotated to indicate the expression from which they are projected: e.g., wh (to be read as 'gap'), upro. In all cases such expressions are taken as placeholders of the appropriate kind, and operations map these expressions onto some expression of the formula language which replaces them.4

2.2.2. THE LABEL PREDICATES

Labels present all information that drives the combinatorial process. These include:

(i) The Type predicate, with logical types as values, represented as type-logical formulae e, t, e→t, e→(e→t), . . . , corresponding to the syntactic categories DP, IP, intransitive verb, transitive verb, and so on. These are displayed as Ty(e), Ty(e→t), etc. We may also allow Ty(cn) as a type to distinguish nouns from intransitive verbs.

(ii) The Tree node predicate, with values identifying the tree position of the declarative unit under construction, from which its combinatorial role is determined (see below).

(iii) Additional features as needed. We shall, for example, distinguish discrete sentence types such as +Q associated with questions. We might also add a further range of features such as case or tense features, for example defining tense as a label to a formula of type t (following Gabbay, 1994). All issues of case and tense we leave to a later occasion, here allowing the set of label-types to be open-ended, assuming some additional syntactic feature +Tense (cf. 5.3.1).

Boolean combinations of such atomic formulae are then defined in the standard way.
2.2.3. MODAL OPERATORS FOR DESCRIBING TREE RELATIONS

Relations between nodes in a tree are described by a tree-node logic, LOFT (Logic of Finite Trees), a propositional modal language with four modalities (Blackburn and Meyer-Viol, 1994):

⟨u⟩P  'P holds at my mother'
⟨d⟩P  'P holds at a daughter of the current node'
⟨l⟩P  'P holds at a left-sister of the current node'
⟨r⟩P  'P holds at a right-sister of the current node'

In addition to the operator ⟨x⟩, x ranging over {u, d, l, r}, its dual [x] is defined:
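The definition itself is missing from the reproduction; in standard modal-logic terms it would read (a reconstruction):

```latex
\[
[x]P \;\equiv\; \neg\langle x\rangle\neg P, \qquad x \in \{u, d, l, r\}
\]
```

so that, for instance, [d]P holds at a node just in case P holds at every daughter of that node.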
262
Ruth Kempson et al.
We extend this language with an additional operator ⟨L⟩, which describes a link relation holding between an arbitrary node of a tree and the root node of an independent tree. This relation enables us to express a relation between pairs of trees, which we shall use to characterize adjunction. This modal logic allows nodes of a tree to be defined in terms of the relations that hold between them. For the purposes of this chapter, the system can be displayed by example. Consider the following tree:

n: Ty(t)
├── n′: Fo(John) & Ty(e)
└── n″: Ty(e → t)

From the standpoint of node n, where Ty(t) holds, ⟨d⟩(Fo(John) & Ty(e)) and ⟨d⟩Ty(e → t) hold; from the standpoint of n′, where Fo(John) & Ty(e) holds, ⟨u⟩Ty(t) and ⟨r⟩Ty(e → t) hold; and from the standpoint of n″, where Ty(e → t) holds, ⟨u⟩Ty(t) and ⟨l⟩(Fo(John) & Ty(e)) hold.
The language can have constants which may be defined as required (e.g., 0, the root node, defined by [u]⊥: "nothing is above me"). Note the use of the falsum ⊥. Also manipulated are Kleene star operators, defining the reflexive transitive closure of the basic relations:
For example:

⟨d⟩*X
'Some property X holds either here or at a node somewhere below here.'

⟨u⟩*Tn(m)
'Either here or somewhere above me is Tn(m)'. This property is true of all nodes dominated by the node m.
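Read procedurally, the basic modalities and their starred closures are just walks over a finite tree. The following sketch is our own illustration, not part of the formal system: the Node class, the label strings, and the restriction to the u/d operators are all expository assumptions.

```python
# Sketch of LOFT-style modal evaluation over a finite tree.
# Node names, the label-string encoding of DU-facts, and the choice
# to cover only <u>, <d> and their Kleene-star forms are assumptions.

class Node:
    def __init__(self, name, labels=None):
        self.name = name
        self.labels = set(labels or [])   # atomic facts, e.g. "Ty(t)"
        self.mother = None
        self.daughters = []

    def add_daughter(self, d):
        d.mother = self
        self.daughters.append(d)

def holds(node, p):
    """Atomic case: proposition p is among the node's labels."""
    return p in node.labels

def up(node, p):        # <u>P : P holds at my mother
    return node.mother is not None and holds(node.mother, p)

def down(node, p):      # <d>P : P holds at some daughter
    return any(holds(d, p) for d in node.daughters)

def down_star(node, p): # <d>*P : P holds here or somewhere below
    if holds(node, p):
        return True
    return any(down_star(d, p) for d in node.daughters)

def up_star(node, p):   # <u>*P : P holds here or somewhere above
    while node is not None:
        if holds(node, p):
            return True
        node = node.mother
    return False

# The tree discussed in the text, schematically:
root = Node("n", ["Ty(t)"])
subj = Node("n1", ["Fo(John)", "Ty(e)"])
pred = Node("n2", ["Ty(e->t)"])
root.add_daughter(subj)
root.add_daughter(pred)

assert down(root, "Ty(e->t)")       # from n: <d>Ty(e->t)
assert up(subj, "Ty(t)")            # from n1: <u>Ty(t)
assert up_star(pred, "Ty(t)")       # from n2: <u>*Ty(t)
assert down_star(root, "Fo(John)")  # from n: <d>*Fo(John)
```

The starred operators deliver exactly the disjunctive "here or somewhere below/above" readings glossed above.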
This use of the Kleene star operator provides a richness to syntactic description not hitherto exploited (though cf. Kaplan and Zaenen, 1988, for its use in defining a concept of functional uncertainty over LFG f-structures): it provides the capacity to specify a property as holding either at some given node or at some node elsewhere in the tree in which it is contained. It is this relatively weak disjunctive specification which we shall use to characterize a wh- or other expression initial in a string, whose properties and their projection within the string are not dictated by its immediate neighbors. The effect will be that not all expressions in a sequence fully determine, from their position in the string, their structural role in the interpretation of the string.
Computation as Labeled Deduction
263
2.2.4. THE LANGUAGE OF DECLARATIVE UNITS The language of declarative units is a first-order language with the following nonlogical vocabulary: 1. a denumerable number of sorted constants from Lab_i for i ≤ n, where L_i = ⟨Lab_i, R_1, . . . , R_i⟩ structures the set of feature values in Lab_i; 2. monadic predicates Fo ('Formula'), Ty ('Type'), Tn ('Tree node'), C_i for i ≤ n, and identity '='; 3. modalities ⟨u⟩ (up), ⟨d⟩ (down), ⟨l⟩ (left), ⟨r⟩ (right) and their starred versions ⟨u⟩*, ⟨d⟩*, ⟨l⟩*, ⟨r⟩*; ⟨L⟩, for pairing some completed node in a tree and the root node of a new tree (for adjuncts); and ⟨U⟩ and ⟨D⟩, the analogues of ⟨u⟩* and ⟨d⟩* defined over the union of the basic upward and downward relations with the LINK relation, respectively. Formulas: 1. If φ ∈ Lc then Fo(φ) is an (atomic) DU-formula. If k ∈ Lab_1 then Ty(k) is an (atomic) DU-formula. If k ∈ Lab_2 then Tn(k) is an (atomic) DU-formula. If k ∈ Lab_i, 2 < i ≤ n, then C_i(k) is an (atomic) DU-formula. If t, t′ are variables or individual constants, then t = t′ is an (atomic) DU-formula. 2. If φ and ψ are DU-formulas then φ # ψ is a DU-formula for # ∈ {∧, ∨, →, ↔}. If x is a variable and φ a DU-formula, then ∀xφ and ∃xφ are DU-formulas. If M is a modality and φ a DU-formula, then Mφ is a DU-formula. With this composite language system, inference rules which characterize the transition between input state and final outcome can now be set out. All inference operations are defined as metalevel statements in terms of DU-formulas and relations between them. For instance, applications of the rule of Modus Ponens
for declarative units become a metalevel statement licensing the accumulation of information at nodes in a tree structure represented as: "Modus Ponens" for DU-formulas:
An item has Type Feature t and Formula Feature ψ(φ) if it has daughters with Type Features e → t and e and Formula Features ψ and φ, respectively. Controlled
Modus Ponens is then a straightforward generalization.5 For instance,
where Modus Ponens is restricted to daughters with features X and Y. In general, in this modal logic, a rewrite rule Y1, . . . , Yn ⇒ X gets the form
3. THE DYNAMICS A parse is a sequence of parse states, each parse state an annotated partial tree. A parse state is a sequence of task states, one for every node in the partial tree so far constructed. A task state is a description of the state of a task at a node in a partial tree. A task is completely described by a location in the tree, a goal, what has been constructed, and what still has to be done (TODO). So the four Feature dimensions of a task state are 1. Goal (G). Values on this dimension are the semantic types in the label set Ty. This feature tells us which semantic object is under construction. 2. Tree Node (TN). Values are elements of the label set Tn. The 'top-node' in Tn will be denoted by 1. This feature fixes the location of the task in question within a tree structure. 3. Discrepancy (TODO). Values are (finite sequences of) DU-formulas. This dimension tells us what has to be found or constructed before the goal object can be constructed. 4. Result (DONE). Values are lists, sequences, of DU-formulas. These values will be the partial declarative units of the Incremental Model. We will represent the task state TS(i) by
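The four dimensions of a task state can be rendered as a simple record. The field names below mirror the text's G, TN, TODO, and DONE; the concrete Python types and the string encoding of DU-formulas are our own illustrative choices, not part of the formal definitions.

```python
# Illustrative encoding of a task state (Goal, Tree Node, TODO, DONE).
# String representations of types and DU-formulas are assumptions.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TaskState:
    goal: str                    # G: semantic type under construction, e.g. "Ty(t)"
    tn: Optional[str] = None     # TN: tree-node address; None = unfixed node
    todo: list = field(default_factory=list)  # demands still to be satisfied
    done: list = field(default_factory=list)  # DU-formulas established so far

    def is_declaration(self):
        # Task Declaration: nothing achieved yet, everything still to do.
        return not self.done and bool(self.todo)

    def is_satisfied(self):
        # Satisfied Task: nothing left to be done.
        return not self.todo

# A task declaration for the top node:
t = TaskState(goal="Ty(t)", tn="1", todo=["Ty(e)", "Ty(e->t)"])
assert t.is_declaration() and not t.is_satisfied()

# Once both demands have been discharged, the task is satisfied:
t.done += ["Fo(John) & Ty(e)", "Fo(saw(Mary)) & Ty(e->t)"]
t.todo.clear()
assert t.is_satisfied()
```

A task state with tn=None models the "decoration in search of a node to decorate" discussed below.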
We can distinguish three kinds of task states: 1. Task Declarations
The Task Declaration. Nothing has yet been achieved with respect to the goal G. Everything is still to be done. Analogously to the description of
declarative units, we can represent the above task state as a list of feature-value statements as follows. 2. Tasks in Progress
In the middle of a task. If things are set up right, then G is derivable from a together with b (a, b ⊢ G). The value of DONE gives an element of the domain C of the Incremental Model, a partial declarative unit. The value of TODO gives a demand associated with this element that still has to be satisfied. 3. Satisfied Tasks
A Satisfied Task. There is nothing left to be done. Soundness of the deductive system amounts to the fact that the goal G can be computed, derived, from a in case TODO is empty. From a different perspective we can consider the state
as constituting an association between a node in a tree and a Labeled object decorating that node. Notice that this can be seen as a tree node decorated by some feature structure plus an unsatisfied demand. In the course of a parse we may have a Task State with the Tree Node feature undefined
In this case we are dealing with a decoration in search of a node to decorate.
3.1. Dynamics: The Basic Transition Rules The dynamics of the parse process consists then of a sequence of partial descriptions of a tree. Concretely, the dynamics of the parsing process is the dynamics of demand satisfaction. This sequence of parse states can be seen as a sequence of tree descriptions in which nodes are introduced which must subsequently be completed to derive a formula corresponding to an interpretation of the string. The tree corresponds to a skeletal anticipation of the internal semantic structure of the resulting propositional formula, and not to a tree-structure for the input sequence of words. Indeed, there is no necessary one-to-one correspondence between the individual linguistic expressions in the string and the nodes of the tree. 3.1.2. BASIC TRANSITION RULES In the following, the symbols X, Y, Z, . . . will range over individual DU-formulas, the symbols a, b, . . . will range over (possibly empty) sequences of such formulas, D, D′, . . . will range over (possibly empty) sequences of tasks, and wi, wi+1, . . . will range over words. The start of a parsing sequence is a single task state, the Axiom state. The last element of such a sequence is the Goal state. The number of task states in a parse state grows by applications of the Subgoal Rule. Tasks become satisfied by applications of the Scanning and Completion Rules. 1. Axiom ax
Goal go
where all elements of D are satisfied task states. 2. Scanning
The expression LEX(w) = Y refers to the lexical entry for the word w. This is a set of DU-formulas possibly containing U, the required element in the TODO box.
3. Mode of Combination (a) Introduction
Notice that the premises Yi are indexed as daughters of the task p. The rule Y0, . . . , Yn ⇒ Z stands for an arbitrary rule of combination. We can see it as an application of Modus Ponens, a syntactic rewrite rule, or interpret ⇒ as logical consequence. (b) Elimination
Notice that this rule effects the converse of Introduction. This inverse relation guarantees that an empty TODO compartment corresponds to a DONE compartment which can derive the goal. 4. Subordination (a) Prediction
where Rd is the relation holding between a node and its daughter. (b) Completion
This is the basic set of general rules driving the parsing process. We exemplify them by returning to our earlier display (29), filling out the leaves and annotating the
tree structure to show how the rules have applied in projecting an interpretation for John saw Mary:
In this example and the following ones, we use the formula ⟨u⟩m, abbreviating ⟨u⟩Tn(m), to stand for an element k of the label set Tn such that Ru(m, k) (that is, m is the mother of k), and, analogously, we will use the other modalities, e.g. ⟨d⟩ and ⟨d⟩*, as relative addresses for elements of Tn. There are other rules to add to this set. In particular, there are the rules associated with subordination, and rules specifically associated with wh-expressions. We also rely on rules which relate sequences of task states as units. These give rise to linked task-sequences, to which we shall return in considering adjunction.
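The flavor of the transition rules just exemplified can be conveyed by a toy driver: Scanning consumes a word whose lexical type meets the current demand, and Completion passes a satisfied daughter's result up to its mother, discharging the corresponding demand there. Everything in the sketch below, including the two-word lexicon and the flat dict encoding of tasks, is our own simplification standing in for LEX(w) and the formal rule statements.

```python
# Toy sketch of the Scanning and Completion transitions.
# The lexicon and the dict-based task encoding are illustrative
# assumptions, not the chapter's formal definitions.

LEX = {"John": "Ty(e)", "sleeps": "Ty(e->t)"}

def scan(task, word):
    """Scanning: if LEX(w) supplies the demanded type, move it to DONE."""
    demand = task["todo"][0]
    if LEX.get(word) == demand:
        task["todo"].pop(0)
        task["done"].append(f"Fo({word}) & {demand}")
        return True
    return False

def complete(mother, daughter):
    """Completion: a satisfied daughter discharges a demand at its mother."""
    if daughter["todo"]:
        return False                      # daughter not yet satisfied
    for fact in daughter["done"]:
        if daughter["goal"] in fact and daughter["goal"] in mother["todo"]:
            mother["todo"].remove(daughter["goal"])
            mother["done"].append(fact)
    return True

root = {"goal": "Ty(t)",    "todo": ["Ty(e)", "Ty(e->t)"], "done": []}
subj = {"goal": "Ty(e)",    "todo": ["Ty(e)"],             "done": []}
pred = {"goal": "Ty(e->t)", "todo": ["Ty(e->t)"],          "done": []}

assert scan(subj, "John") and scan(pred, "sleeps")
complete(root, subj)
complete(root, pred)
assert root["todo"] == []   # all demands discharged: the t-task can close
```

Elimination (Modus Ponens over DONE) would then combine the two accumulated formulas at the root into the propositional result; that step is omitted here.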
3.2. Modeling the Partial Nature of Natural Language Content Before increasing the complexity with additional rules, we indicate how we model the gap between lexically specified content and its assigned interpretation within the context of some given string, as this is the heart of any account of how utterances are interpreted in context.6 The most extensively studied phenomenon involving such asymmetry is anaphora. In this model, we take the underspecification of the input specification associated with pronominal anaphora a step further than in many other analyses. We assume that pronouns are invariably inserted from the lexicon with a single specification of content, and that any bifurcation into bound-variable pronoun, indexical pronoun, E-type, etc., is solely a matter of the nature of the contextually made choice, context here being taken to include logical expressions already established within the string under interpretation. Accordingly, pronouns are projected as m- variables with an associated procedure (not given here) which imposes limits on the pragmatic choice of establishing an antecedent expression from which to select the form which the pronoun is to be taken as projecting: Lex(he)
{Ty(e), Fo(upro), Gender(male), ⟨u⟩Ty(t), . . .}
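The idea that a pronoun projects a bare metavariable upro, later instantiated only from "identified" formulae, can be caricatured in a few lines. This is a deliberately crude rendering: the locality side condition is reduced to an explicit exclusion set, and the gender-feature matching and data encoding are our own expository assumptions.

```python
# Sketch: anaphora resolution as a selection process over identified
# formulae. `identified` collects formulae from satisfied tasks or prior
# discourse; `excluded` crudely stands in for the locality restriction
# barring antecedents in the pronoun's own local domain (an assumption).

def resolve_pronoun(identified, gender=None, excluded=()):
    """Return candidate substituends for the metavariable upro."""
    candidates = []
    for formula, props in identified:
        if formula in excluded:
            continue                      # locality side condition
        if gender and props.get("gender") != gender:
            continue                      # feature match, e.g. Gender(male)
        candidates.append(formula)
    return candidates

context = [
    ("Fo(John)", {"gender": "male"}),
    ("Fo(Sue)",  {"gender": "female"}),
]

# "he" may pick up Fo(John) but not Fo(Sue):
assert resolve_pronoun(context, gender="male") == ["Fo(John)"]
# If Fo(John) is in the pronoun's local domain, no antecedent remains:
assert resolve_pronoun(context, gender="male", excluded=("Fo(John)",)) == []
```

The point the code mirrors is that resolution is a choice among available representations, not reference assignment in semantic terms.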
(Notice how the condition on the mother of the formula projected from he is in effect a feature-checking device, licensing the occurrence of the pronoun within a particular frame.) Instantiation of the m-variable upro is generally on-line, as the m-variable is inserted from the lexicon into the tree. Its value must be selected only from identified formulae, where an identified formula is either a formula in some satisfied task (i.e., in the DONE box of a task with empty TODO and identified tree node) or a formula which has been derived elsewhere in the discourse. For a pronominal of type e, there is a further restriction that the formula selected as providing its value may not occur within the same r-domain within which upro is located. This is expressed as a side condition, given here only informally.7 Note the metalevel status of this characterization of pronouns. Anaphora resolution is defined not as a task of reference assignment defined in semantic terms, but as a selection process defined over available representations. The nature of this choice will determine whether the denotative content of the pronoun relative to the assigned interpretation is that of a variable, a referential term, and so on. 3.2.1. UNDERSPECIFICATION OF TREE CONFIGURATION More unorthodox than the recognition that a single pronoun has a single lexical specification which by enrichment of its input specification becomes a bound variable, a constant, and so on, is the claim that expressions in a string may also underspecify the role the expression is to play in the compilation of interpretation for the string. Taking up the potential of LOFT to express disjunctive characterizations of node properties, there are rules which allow individual lexical items to
project tree descriptions which do not fully determine all branches of the tree. This is the primary distinguishing feature of our analysis of initial wh-expressions. Wh-expressions, we claim, project the following state:
(31) displays the projection of a task with the goal of showing Ty(t) at some node m identified as a wh-question, but with everything still to do, except that a completed e task has been added, lacking merely the specification of where in the tree it holds. WH, which is the value of the Formula predicate, is an m-variable, either retained as a primitive term to be resolved by the hearer and so incomplete in content, or, in relative clauses, resolved by replacing the m-variable WH with the formula projected by the adjoined head. ⟨u⟩*m is an abbreviation for ⟨u⟩*(Tn(m))
'The root node is either here or somewhere above me'
In other words, the structural position of Fo(WH) and Ty(e), hence its function-argument role in the propositional formula under construction, is not fixed at this juncture in the parsing process. The structure is merely defined as having such a node. The +Q feature is an indication by feature specification of a propositional formula which is to be open with respect to at least one argument. Seen as an on-line, target-driven parsing task, by a single step of inference we can add a conclusion at the current node m about the presence of the WH:
The ⟨d⟩* form of specification holds because somewhere in the tree dominated by m is a node with the properties listed. (This inference is not a necessary part of the specification projected, but, as we shall see in section 5.3, is the form of characterization that brings out its parallel with wh-expletives.) We have not yet added any account of why the properties of wh might get carried down from one clausal domain to another. This transfer follows from the recursive definition of ⟨x⟩*X. Consider the evaluation of the DU-formula ⟨u⟩*m holding at some node. By definition, this property holds either at m or at a daughter of m or at a daughter of a daughter of m, and so on (cf. section 2.3.3). Given that information at all nodes must be locally consistent, the mismatch between TODO Ty(t) and Fo(WH) & Ty(e) will lead to the DU-formula annotating the node so
far unfixed being evaluated with respect to some daughter, and then successively through the tree until resolution is possible. This resolution is achieved at node i by some TODO specification associated with a task state being taken as satisfied by the presented floating constituent, whose node characterization is thereby identified (WH-RESOLUTION): (33)
WH-RESOLUTION
Provided Ru*(m, i), and if Ty(x) is in the lexical specification of the current word, then x ≠ X → e and x ≠ e.
The side condition is a restriction that the type of the current word in the string must neither meet the TODO specification directly, nor set up a type specification which a sequence of Introduction and Elimination steps would satisfy. This guarantees that such resolution only takes place when there is no suitable input. With the information from the unfixed node copied into the tree, the underspecification intrinsic to ⟨u⟩*m as a tree-node identifier is resolved, with i = ⟨u⟩*m, and the unfixed node is deleted. We now set out two examples. First is the specification of input state and output state for the string Who does John like?: (34)
Who does John like? INPUT TASK STATE
Notice how the lexical specification of who simultaneously projects information both about its mother node (that it is a question) and about some unplaced
constituent. The finally derived state, with the t target duly completed, no longer has this unfixed node as part of the tree description. (35), our second sample derivation, specifies the parse state following the projection of information from think. It displays the disjunctive specification associated with who being carried down to the information projected by the string making up the subordinate clause, through inconsistency between the type of the wh-element and that assigned to each intermediate right-branching daughter, with the point at which the information projected by who is to hold still not fixed: (35)
Who do you think Bill likes?: PARSE STATE following entry of think:
Notice that we are in effect abandoning the assumption that a wh-initial expression takes scope over the remaining nodes defined over the subsequent string, for the formula projected by the wh-expression has a fixed position only at the point at which its tree relation to the remaining structure is fixed, viz., the "gap." Hence we shall have a basis from which to characterize the scope idiosyncrasy of initial wh-expressions, that they freely allow narrow scope effects with respect to expressions which follow them (listed as problem (1)).
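The percolation and eventual merging of the unfixed wh-node described in this section can also be sketched procedurally: an unfixed annotation is carried down the growing tree until it reaches a node whose outstanding demand it satisfies and which the words themselves have left unfilled. The tree walk below is our own minimal rendering; the dict encoding and node addresses are assumptions, not the chapter's formalism.

```python
# Sketch of WH-RESOLUTION: an unfixed node annotated Fo(WH) & Ty(e)
# merges with the first node whose TODO demands Ty(e) and which lexical
# input has left unfilled. The tree encoding is illustrative.

def wh_resolve(tree, unfixed_labels):
    """Walk the tree top-down, left-to-right; merge the unfixed node at
    the first open Ty(e) position. Return the address where it merged."""
    stack = [tree]
    while stack:
        node = stack.pop()
        if "Ty(e)" in node.get("todo", []) and not node.get("done"):
            node["done"] = list(unfixed_labels)   # copy annotations in
            node["todo"].remove("Ty(e)")
            return node["tn"]                     # position now fixed
        stack.extend(reversed(node.get("daughters", [])))
    return None

# "Who does John like?" -- the object position is the open Ty(e) node:
tree = {"tn": "0", "todo": [], "done": ["Ty(t)"], "daughters": [
    {"tn": "00", "todo": [], "done": ["Fo(John) & Ty(e)"]},
    {"tn": "01", "todo": [], "done": ["Ty(e->t)"], "daughters": [
        {"tn": "010", "todo": ["Ty(e)"], "done": []},   # the "gap"
        {"tn": "011", "todo": [], "done": ["Fo(like) & Ty(e->(e->t))"]},
    ]},
]}

site = wh_resolve(tree, ["Fo(WH)", "Ty(e)"])
assert site == "010"   # the unfixed node's position resolves at the gap
```

Until such a merge succeeds, the node carrying Fo(WH) has no fixed address, which is exactly why it cannot yet serve as a pronominal antecedent in the crossover account that follows.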
4. CROSSOVER: THE BASIC RESTRICTION We are now in a position to present the basic crossover restriction, to wit that in questions, pronouns can never be interpreted as dependent on the preceding wh-expression unless they also follow the gap. This restriction is uniform, and runs across strong and weak crossover configurations (cf. examples (4)-(9), repeated here):
(36)
*Whoi does Joan think that hei worries ei is sick?
(37)
*Whoi does Joan think that hisi mother worries ei is sick?
(38) *Whosei exam resultsj was hei certain ej would be better than anyone else's? (39)
Whoi does Joan think ei worries hisi mother is sick?
(40)
Whoi does Joan think ei worries that hei is sick?
(41)
Whosei exam resultsj ej were so striking that hei was suspected of cheating?
This is directly accounted for by the concept of identification associated with anaphora resolution. As long as the underspecified node is not fixed, by definition it cannot serve as an antecedent for the pronoun (cf. section 3.2). The effect of wh-resolution, when it later applies, is precisely to determine the position within the configuration at which the properties projected by the wh-expression should be taken to hold. Such features become available for pronominal resolution only after the gap has been projected. In this way, the system is able to characterize the way in which wh-expressions in questions do not provide an antecedent for a following pronominal until the gap (= Fo(WH)) is constructed in a fixed position. Hence the primary crossover restriction follows
(for weak, strong, and extended strong crossover data alike).8 4.1. Crossover and Relative Clauses With relative clauses, we face data that are apparently problematic for this restriction, as indeed for all accounts in terms of operator-gap binding, for they demonstrate that the crossover phenomenon is context-sensitive (Lasnik and Stowell, 1991; Postal, 1993). In some contexts the primary crossover restriction is suspended altogether; in others it remains in force. In relatives, the crossover restriction against a sequence wh . . . pronominal . . . gap, all interpreted as picking out the same entity, does not hold if either the pronoun "crossed over" is a determiner (the primary weak crossover cases), or the wh-expression, likewise, is contained within some larger noun phrase (the "extended strong crossover" cases; Postal, 1993): (42)
*Johni, who Sue thinks hei said ei was sick, has stopped working.
(43)
John, whoi Sue thinks ei said hei was sick, has stopped working.
(44)
John, whoi Sue said hisi mother is unnecessarily worried about ei, has stopped working.
(45)
John, whosei motherj Sue said hei was unnecessarily worried about ej, has stopped working.
This contrast between (42) and (44)-(45) is less marked in restrictive relatives, but many English speakers report a difference between (46) and (47)-(48), with (46) being unacceptable on an interpretation in which the pronoun is construed as identical to the head nominal, but (47)-(48), to the contrary, allowing an interpretation in which wh, pronoun, and gap position are all construed as picking out the same individual:9 (46)
Every actor who the press thinks he said e was sick, has stopped working.
(47)
Every actor who the press thinks his director is unnecessarily worried about e, has stopped working.
(48)
Every actor whose director the press said he was fighting with e, has stopped working.
This asymmetry between relatives and questions is inexplicable given an analysis of the primary crossover restriction solely in terms of the relative positions of the three elements wh, pronominal, and gap, as any binding precluded by one such configuration should continue to be excluded no matter what environment the configuration is incorporated into as a subpart. On the other hand, if the crossover restriction in questions is due to some intrinsic property of a wh-word in questions, say the weakness of description provided by the wh-expression, then we have some means of explaining the difference between questions and relatives, as long as we can provide a means of distinguishing the way the wh-expression is understood in the context of relative constructions. In particular, if there is some externally provided means of enriching the very weak description of content of wh-expressions in relatives, then this source of information may also provide a means of resolving the underspecification intrinsic to the following pronoun and, through it, indirectly identifying a fixed node for the newly enriched formula, so providing potential for interaction between the processes of node fixing and anaphora resolution. This is the approach we shall take, using the property of the head to which the relative is adjoined as the external source of information. This account turns on the account of relative clauses provided within this framework. 4.1.1. RELATIVE CLAUSES AS INDEPENDENT, LINKED TREE STRUCTURES The point of departure for our account of relative clause construal is the observation that relative clause construal involves constructing two propositional formulae that have some element in common. Furthermore, given that English is a head-initial language, the information as to what that shared element is is given ab initio—it is the formula projected by the head to which the relative is adjoined. 
The starting point in constructing a representation for the relative clause is, then,
the requirement that this second structure must have a copy of the formula of the node from which this tree has been induced: (49)
John, who Sue's mother thinks is sick, is playing golf.
It is this set of observations which our account directly reflects. In (49) the occurrence of who is defined as signaling the initiation of a second structure which is required to contain a copy of the formula Fo(John) and Ty(e). As so far set out, the system only induces single trees, annotated by a system of type-deduction, plus a process of projecting and subsequently resolving initially unfixed nodes. To reflect the informal observation, this process of fixing an initially unfixed node is combined with a rule for transferring information from one tree to a second tree suitably linked to the first. The concept of linked tree is defined to be a pair of trees sharing some identical expression:10 (50)
For a pair of trees T1, T2, RLINK(T1, T2) iff T1 contains at least one node n with the DU-formula {Fo(A), Ty(X), Y}, and T2, with root node Tn(n), contains at least one node with {Fo(A), Ty(X), Z}.
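Definition (50) amounts to a shared-formula condition on a pair of trees, which is easy to state directly. Flattening each tree to a list of its node annotations is our own expository shortcut; only the shared Fo/Ty condition comes from the definition above.

```python
# Sketch of the RLINK condition of (50): two trees stand in the LINK
# relation iff some node of T1 and some node of T2 carry the same
# Fo/Ty pair. Representing a tree as a flat list of node-annotation
# dicts is an illustrative simplification.

def rlink(t1, t2):
    """True iff T1 and T2 share a node with identical Fo and Ty values."""
    pairs1 = {(n.get("Fo"), n.get("Ty")) for n in t1 if "Fo" in n}
    pairs2 = {(n.get("Fo"), n.get("Ty")) for n in t2 if "Fo" in n}
    return bool(pairs1 & pairs2)

# "John, who Sue's mother thinks is sick": the linked (relative) tree
# must carry a copy of Fo(John), Ty(e) from the host tree.
host       = [{"Fo": "John", "Ty": "e"}, {"Ty": "t"}]
linked_ok  = [{"Fo": "John", "Ty": "e"}, {"Fo": "sick", "Ty": "e->t"}]
linked_bad = [{"Fo": "Sue", "Ty": "e"}]

assert rlink(host, linked_ok)
assert not rlink(host, linked_bad)
```

The LINK Introduction rule defined next is what guarantees, incrementally, that this condition ends up satisfied.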
To project such a structure incrementally, we define a LINK Introduction rule. This rule applies to some completed node n in a tree T1, induces a new tree T2 with root node n, and imposes on the new tree T2 a requirement of having a node annotated with the relevant formula Fo(A) (described as ⟨u⟩*n). For the case of nonrestrictive relative construal, we assume this LINK Introduction process applies to the node of type e, carrying over the formula which annotates this node into the new linked tree T2.11 (51)
LINK Introduction:
Note that the unfixed node is described as an arbitrary node along a chain of daughter relations, with the requirement on that node that it be filled by the formula filling the node to which the tree is linked. This requirement will be satisfied only if there is some subsequent expression projecting the necessary formula onto the node. The consequence of meeting this requirement is that all pairs of linked trees will contain a common occurrence of the variable copied over, thus meeting the characterization of what it means for a pair of trees to satisfy the relation RLINK. In the case of a language such as English, the relativizing complementizer is defined as carrying a copy of the head formula by substitution of the metavariable
Fo(WH), and this requirement of a common formula in the two structures is met immediately. The new tree is thus "loaded" with an occurrence of the formula occurring within the node from which the LINKed structure was projected: (52)
INPUT TASK STATE
In the process defined as (52), the content a is carried from the host task state into the independent task state with its goal of SHOW t. The value of WH is therefore identified as identical with that of its head by definition. However, the specification of the presence of this formula is characterized as annotating a node whose tree position in the new tree is as yet unfixed, and this form will, as before, give rise to percolation of the specification Fo(a) and Ty(e) down through the tree, checking at each node whether the DU-formula is to hold at that node. It is this anaphoric property of wh, being replaced by a substituend in relatives, which provides the basis for explaining the asymmetry between crossover phenomena in questions and in relatives. The contrast between questions and relatives can be seen as arising from an interaction between two different forms of update: pronoun construal and the identification of a tree position for some unfixed node. In relatives in English, the wh-expression projects a formula Fo(a) of Ty(e) identified with the head, in virtue of the anaphoric properties of the relative "pronoun." All that needs to be established in the subsequent development of the tree is where the unfixed node annotated with Fo(a) and Ty(e) should fit into the tree. The interference caused by anaphora resolution is then as follows. If, by a free pragmatic process, a pronominal of the same type as the dislocated constituent happens to be assigned as value the same formula Fo(a) as the head noun before the gap is reached, then a fixed position within the tree for the DU-formula Fo(a) (and Ty(e)) will have been found. This will automatically lead to update of the tree (there is at this juncture an occurrence of Fo(a) at a fixed node in the tree), and there will be nothing to fill any subsequent gap where the words themselves provide inadequate or conflicting input.
If, that is, the words that follow fail to project a full set of annotations to fulfil whatever requirements are set up in the subsequent projection, then at that later stage there will no longer be any outstanding unfixed node whose position has to be resolved, and hence no successful completion of the tree. So, should the pronoun be identified as being a copy of the formula inhabiting the head, there will be no successful completion of the tree if
a "gap" follows. Indeed, the only way to ensure that the initially unfixed node is used to resolve some such outstanding requirement is to interpret the intervening pronoun as disjoint from the head nominal. This is the strong crossover phenomenon, exemplified by (42). Following on from this, should a pronominal have served the purpose of identifying a fixed tree position for a hitherto unfixed node, all subsequent references to the same entity will have to be made through anaphora: (53)
John, who Sue said he was worried was sick, has gone to the hospital.
(54)
John, who Sue said was worried he was sick, has gone to the hospital.
(55)
John, who Sue said he was worried he was sick, has gone to the hospital.
Thus (53) precludes any interpretation of he as dependent on John (the strong crossover data); (54) allows an interpretation for the occurrence of he as dependent on John, possibly via its identification with whatever occurs in the gap (the two will not be distinguishable, because the supposed gap contains an occurrence of the variable associated with the nominal); and (55) allows such an interpretation for both occurrences of he, and moreover also allows an interpretation in which the first but not the second pronoun is interpreted as dependent on John. The interaction of anaphora resolution and gap resolution said to underpin strong crossover has turned on the fact that the pronoun and the wh-expression are of the same type, Ty(e). Should, then, there be any reason why identifying the pronoun will not lead to fixing the tree position for the dislocated expression, the pronoun will not be able to be used to fix the position of the unfixed node; there will still be a role for a subsequent gap; and choice of pronoun as identified with the head nominal will not interfere with the Gap Resolution process. This happens in two types of case: (i) if the pronoun is a determiner and so not of type e (weak crossover effects), and (ii) if the wh-expression is contained within some larger expression and it is this larger expression which is unfixed within the emergent tree configuration (extended strong crossover effects). Both weak crossover effects (44) and extended strong crossover effects (45) are thus predicted to be well formed in relatives with the pronoun construed as dependent on the head, as they are simply the complement of the type of case that is precluded. None of these means of updating the tree through the occurrence of the pronoun will be available in questions, for there is no independent identification of the wh-expression in questions.
It remains an unidentified formula with but a placeholding wh metavariable, and, prior to Gap Resolution, without even a fixed position in a tree structure. Hence the asymmetry between questions and relatives.12 This account has the advantage over a number of accounts (including that of Kempson and Gabbay, 1998) that it provides a natural basis for distinguishing languages such as Arabic with a resumptive pronoun strategy from a language such as English, which uses pronouns resumptively only for marked effect. Arabic
displays no crossover effects, either strong or weak, in relative clauses:13 (56)
irra:gil illi nadja ftakkarit innu qe:l innu aiya:n the man who Nadya thought he said he was sick
(57)
irra:gil illi nadja ftakkarit inn umuhe qelqa:neh minn-u the man who Nadya thought that his mother was worried about
All that is required to explain this difference between English and Arabic relatives is to propose that in Arabic the relativizing complementizer has less rich anaphoric properties than in English. All that it encodes is the very requirement on the LINK structure definitional of LINK structures: it is an expletive which requires further lexical input. In consequence, the requirement of creating a copy of the nominal formula in the relative clause structure is not satisfied by the complementizer itself, and so the presence of the required copy will only be met through an ensuing pronoun identified anaphorically with the formula inhabiting the head. There is never any question of "gap" positions occurring subsequent to some pronominal, for there is no successfully annotated unfixed node for which only its position in the tree remains to be identified. Correctly, the analysis leads us to expect a total lack of observable crossover data in relative clauses in Arabic. The difference between the two languages thus reduces to a lexical difference between the relativizing complementizers of the two languages. Notice that in all cases the account, and its context sensitivity, makes critical use of the way information has been accumulated prior to the projection of the pronoun, and is not solely defined in terms of the configuration in which the pronoun occurs. It is thus sensitive to linear order, to the partiality of information at intermediate steps in the interpretation process, and to the way information is accumulated through the interpretation process. In particular, the dynamics involved in the interpretation of wh-expressions follows from the goal of seeking to resolve the weak tree description initially projected. The context sensitivity of "weak" and "extended strong" crossover, but not of "strong" crossover, is thus predicted from the proposed characterizations of wh-expressions, pronouns, and relative clauses, without additional construction-specific stipulation.
5. TOWARDS A TYPOLOGY FOR WH-CONSTRUAL

In the face of the presented evidence, one might grant the need for some form of incrementality in the projection of interpretation for wh-structures, but nevertheless argue that the SLASH mechanism of HPSG, with its percolation of wh-features progressively up a tree through feature unification, captures just the right dynamic element without abandoning the overall declarative formalism. What, one might ask, does this disjunctive-specification approach have to offer, over and
Computation as Labeled Deduction
above that more conservative form of specification? The approach is furthermore extremely close to the functional uncertainty analysis of LFG (Kaplan and Zaenen, 1988). Kaplan and Zaenen indeed analyze long-distance dependencies in terms of the Kleene * operator, and so constitute a genuine precursor of the present analysis; in their case, however, the disjunction is defined over string sets and f-structure specifications, not, as here, over structural specifications. The advantage specific to this account, in reply to such a charge, is the dynamic parsing perspective within which the account is set. It is this dynamic perspective that provided an account of the crossover phenomenon. And it is this same perspective that also provides the basis for a general typology of wh-constructions, explaining why they occur as they do, rather than simply defining distinct mechanisms for each new set of data (as do Johnson and Lappin, 1997). The unifying form of explanation that we set out is not available to the more orthodox frameworks, in which syntax is defined purely statically. We take in order wh-in situ constructions, multiple wh-constructions, and partial movement constructions. (In all cases we shall restrict attention to full NP wh-expressions such as who, what.)

5.1. Wh-in situ Constructions

In the framework adopted, there is a near symmetry between wh-initial and wh-in situ constructions. The in situ form is the fixed variant of the node which the initial wh-expression projects as unfixed. The only additional difference is that wh-in situ constructions lack the additional +Q feature indicating a formula with one open position. We specify together the result of processing a wh-initial expression, and the effect of processing a wh-in situ expression. (58)
Wh-initial:
(59)
Wh-in situ:
The wh-initial expression encodes an instruction that its formula and type are satisfied at some lower point in the tree, together with the specification that the
Ruth Kempson et al.
node currently under construction has the property of being a wh-question, hence a formula open with respect to at least one argument. The wh-in situ expression, conversely, encodes an instruction that it is the premise Fo(WH) and Ty(e) which is projected into the current task state. There is no tree with unfixed node position, as the wh-in situ expression projects information to a fixed node of the tree; and the feature +Q does not need independent specification, as the presence of the open formula in the tree is directly given by the wh-expression itself.14 In languages which freely allow wh either in situ or initially, with a free process of NP "scrambling," this characterization of wh-initial expressions needs to be generalized, for all NPs can occur in the initial position. Accordingly, we allow as an optional expansion for a node of type t:15 (60)
GAP ADJUNCTION:16
The Gap Adjunction rule will feed the gap-resolution process. We predict both wh-initial configurations and unrestricted occurrence of wh-in situ configurations (observed in Chinese, and also in Japanese and Malay): (61)
Ni bijiao xihuan [[ta zenmeyang zhu] de cai]? CHINESE You more like how cook REL food 'What is the means x such that you prefer the dishes which he cooks by x?'
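The near symmetry between the two mechanisms can be sketched in code. The following is a toy illustration only, not the authors' formal system: the class names, the dotted addressing scheme, and the helper functions are our own. A wh-initial expression annotates an unfixed node whose tree address remains to be found, a wh-in situ expression annotates the fixed node currently under construction, and gap resolution later supplies the unfixed node's address.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    address: str                      # hypothetical tree-node address, e.g. "0", "0.1"
    labels: set = field(default_factory=set)

@dataclass
class TaskState:
    nodes: dict = field(default_factory=dict)    # fixed nodes, keyed by address
    unfixed: list = field(default_factory=list)  # nodes awaiting a tree position

def project_wh_initial(state: TaskState) -> None:
    """Wh-initial: Fo(WH)/Ty(e) entered at an UNFIXED node, +Q at the
    root; only the node's position in the tree remains to be found."""
    state.nodes["0"].labels.add("+Q")
    state.unfixed.append(Node("?", {"Fo(WH)", "Ty(e)"}))

def project_wh_in_situ(state: TaskState, address: str) -> None:
    """Wh-in situ: the same Fo(WH)/Ty(e) premise, but entered directly
    at the fixed node currently under construction; nothing is unfixed."""
    state.nodes[address] = Node(address, {"Fo(WH)", "Ty(e)"})

def resolve_gap(state: TaskState, open_address: str) -> None:
    """Gap resolution: an open fixed position supplies the missing
    address, discharging the disjunctive (Kleene-star) description."""
    node = state.unfixed.pop()
    node.address = open_address
    state.nodes[open_address] = node

state = TaskState(nodes={"0": Node("0")})
project_wh_initial(state)      # "Who ..." projects an unfixed node
resolve_gap(state, "0.1.0")    # the gap after the verb fixes its position
assert not state.unfixed and "Fo(WH)" in state.nodes["0.1.0"].labels
```

The in situ variant never enters the `unfixed` list at all, which is the asymmetry the next paragraphs exploit.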
Given the near symmetry in the characterization of wh-initial constructions and wh-in situ constructions, it might seem that no asymmetry between these two types of wh-construction could be predicted. It is however a widely known observation that, while wh-initial constructions impose island conditions, such as the preclusion of dependency of a wh-expression into a relative clause, wh-in situ expressions can be freely construed with scope wider than that of the relative clause within which they occur (e.g., Aoun and Li, 1991). This observation is especially problematic for analyses which involve overt wh-movement at s-structure (pre-SPELLOUT) and covert wh-movement at LF (post-SPELLOUT), such as that advocated in Reinhart (in press) (cf. Simpson, 1995, for detailed discussion). Nevertheless, the near mirror-image characterization of wh-initial and wh-in situ, perhaps surprisingly, provides a natural basis for predicting asymmetry between these two forms of dependence, on the basis that only the former is part of a search through a domain as yet unbuilt. We take the preclusion of dependency into relative clauses. Wh-initial sets up a disjunctive specification to be resolved within a certain task domain (the tree projected from the clause it fronts) (cf. (58) with its explicit specification that the disjunctive specification has to be resolved within the given task domain). The problem with examples such as (62) is that this resolution task cannot be achieved within the domain defined. The presence of the head of the relative clause (the man in (62)), on the contrary, guarantees that the requirement associated with the object of see is filled, so that no task states remain open and hence able to resolve the configurational position of the unfixed tree node *m in the initial tree description. It thus remains unresolved in the task assigned to the node m, leading to lack of wellformedness: (62)
*Who did John see the man who likes e
The string is ill formed because an assigned completion task on structure within the domain of that task cannot be satisfied given information from other expressions in the string. In wh-in situ configurations, to the contrary, there will never be such an unresolvable disjunction. The projection of the gap formula (Fo(WH) and Ty(e)) immediately meets whatever restrictions are imposed by other expressions in the environment, so there is never a node lacking a fixed tree node identification. As so far characterized, the wh-in situ form is an unrestricted place-holding device, occurring anywhere in a tree. However, in some languages it is anaphor-like, licensed to occur only within a given locality marked by +Tense features (cf. Simpson, 1995; Ouhalla, 1996). Such a case is displayed by Iraqi Arabic, which is a wh-initial and wh-in situ language of particular current interest (Simpson discusses the problems it poses for minimalist accounts, and it is from his work that the present data are taken). In sentences with an initial tensed verb, but otherwise a sequence of nontensed verbs, the wh-expression may occur EITHER in situ or in the fronted position: (63)
Mona raadat [tijbir Su'ad tisa'ad meno] Mona wanted to force Suad to help who 'Who did Mona want to force Suad to help?'
(64)
Meno Mona raadat [tijbir Su'ad tisa'ad]
If, however, the subordinate clause is tensed, then (without some special ancillary device—see section 5.3.1), the wh-expression may not occur in situ and must be preposed: (65)
*Mona tsawwarat [Ali ishtara sheno] Mona thought Ali bought what (Intended: 'What did Mona think that Ali bought?')
(66)
Shenoi tsawwarit Mona [Ali ishtara ti] What thought Mona Ali bought 'What did Mona think Ali bought?'
The restriction (also displayed in other languages: Hindi, and some dialects of German) is straightforward to characterize in this system, since wh-expressions project a place-holding m-variable, formally similar to anaphoric expressions. Like anaphors, the restriction involves the concept of a tense domain. The tense-marked verb defines a domain within which wh-expressions are licensed, but any second intervening tense-marked verb breaks the domain, and the wh-expression is not licensed to occur. For this we require some restriction on the *(Labels(+Q)) specification to guarantee that the feature +Q be projected onto a node suitably local to the occurrence of the wh-expression, sharing the same value of the Locality predicate as the node annotated by the wh-formula itself. Pending a detailed account of tense and locality, we leave this as an informally described condition only (cf. section 5.3.1 and fn. 7).

5.2. Multiple Wh-Questions

The two characterizations of wh-initial and wh-in situ constructions now combine. Multiple wh-constructions are simply wh-initial constructions and wh-in situ processes of construction in combination. We predict the effect in English that the first wh-expression in a multiple question is subject to subjacency effects (its associated gap not being able to occur inside a relative clause), while the secondary wh, occurring in situ, is not subject to any such subjacency effects, and so may occur within a relative structure.
(67) *Whoi did the journalist leak the document in which Sue had criticized ei to which press?
(68) Who reported the journalist that leaked which document to the press?
(69)
Who reported the journalist that had leaked which document to which committee?
And we anticipate the Iraqi data that the second wh in a multiple wh-structure may only occur if all verbs below the clause containing the primary wh-expression are nonfinite:
(70) Shenoi ishtara Ali ti [minshaan yenti li-meno]? What bought Ali in order to give to-whom 'What did Ali buy to give to whom?'
(71)
*Meno tsawwar [Ali xaraj weyya meno]? Who thought Ali left with whom 'Who thought that Ali left with whom?'
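The asymmetry in a multiple wh-question reduces to the fact that only the initial wh-expression's unfixed node must be resolved within the task domain that introduced it; an in-situ wh occupies its tree address directly and never searches. A toy check (the prefix-based addressing and the `LINK.` label are our own illustrative conventions, not the formal system):

```python
def resolvable(task_domain: str, gap_address: str) -> bool:
    """A wh-initial expression's unfixed node must be fixed at an
    address INSIDE the task domain that introduced it (prefix check).
    An in-situ wh never invokes this condition at all."""
    return gap_address.startswith(task_domain)

# initial wh in (67): the only open position sits inside a
# relative-clause LINK structure, outside the main task domain
assert not resolvable("0", "LINK.0.1")
# initial wh with a gap in its own clause's domain: fine
assert resolvable("0", "0.1.0")
```

The secondary, in-situ wh of (68)-(69) simply never calls `resolvable`, which is why it tolerates positions inside a relative structure.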
Thus we can construct the beginnings of a typology of wh-questions. Wh-in situ forms are forms at a fixed node in a tree, projecting the place-holder for the answer; wh-initial expressions project the same place-holder but at an unfixed node; and multiple wh-questions are simply the combination of these two.

5.3. Partial Wh-Movement

A further, and apparently unrelated, part of the puzzle is partial wh-movement, as displayed in (72)-(73):17 (72)
Was glaubst du was Hans meint mit wem Jakob gesprochen hat? what think you what Hans said with whom Jakob spoken has 'Who do you think Hans said Jakob has spoken to?'
(73)
Was glaubst du mit wem Hans meint dass Jakob gesprochen hat? what think you with whom Hans said that Jakob spoken had 'To whom do you think Hans said Jakob was talking?'
Seen from an orthodox quantifier-variable binding perspective for analyzing wh-expressions, the puzzle about these anticipatory was particles in German is that they do not constitute any indication of a specific question posed, merely a cautionary advance notice that such a wh-question will be posed at some later point in the on-line interpretation process. How then can their so-called "expletive" properties be related to those of the wh-operator with which they are paired (cf. Cheng, 1991; Dayal, 1994; Simpson, 1995; the papers in Lutz and Müller, 1995; Horvath, 1997)? These expletive forms appear not to be operators, but not variables, either. A natural means of capturing this phenomenon is suggested by this framework, through the dynamics of node construction. All nodes in a tree are projected in two phases: (i) the introduction of a node with a requirement TODO, (ii) the satisfaction of that node as DONE. Given the availability of *X as an annotation on a node (cf. section 2.5.1), and given also the characterization of requirements on a node as DU-formulas which are TODO, there is no reason to preclude a complex DU-formula of the form *X from occurring as TODO, for any DU-formula may be imposed as a requirement. As long as this involves the addition of a DU-formula to a type independently introduced by an introduction rule, this feature addition will be harmless, and will not give rise to a proliferation of extra feature-specific introduction and elimination rules. What the specification of *Fo(WH) in TODO means is that ahead in the projection of structure lies the need to enter *Fo(WH) as DONE (a consequence of introducing an unfixed node), this in turn indicating that later in the interpretation process, gap resolution will identify the fixed node that meets this initially imposed requirement.
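The two-phase discipline can be sketched as follows. This is a simplified illustration under our own assumptions (in particular, we let a starred requirement at a node be discharged by the formula holding at that node or below it, collapsing the daughter-only restriction the text imposes on expletive was):

```python
class TreeNode:
    """A node carrying outstanding requirements (TODO) and
    established annotations (DONE)."""
    def __init__(self):
        self.todo: set = set()
        self.done: set = set()
        self.daughters: list = []

    def holds_somewhere(self, formula: str) -> bool:
        """*formula: formula holds here or at some dominated node."""
        return formula in self.done or any(
            d.holds_somewhere(formula) for d in self.daughters)

    def satisfied(self, req: str) -> bool:
        if req.startswith("*"):            # a Kleene-star requirement
            return self.holds_somewhere(req[1:])
        return req in self.done

    def complete(self) -> bool:
        """The tree is well formed only when every requirement,
        at every node, has been satisfied."""
        return (all(self.satisfied(r) for r in self.todo)
                and all(d.complete() for d in self.daughters))

root, lower = TreeNode(), TreeNode()
root.daughters.append(lower)
root.todo.add("*Fo(WH)")      # expletive `was`: a wh-formula lies ahead
assert not root.complete()    # requirement outstanding: not yet well formed
lower.done.add("Fo(WH)")      # a full wh-expression lower down supplies it
assert root.complete()        # the anticipatory requirement is discharged
```

The expletive thus contributes no formula of its own; it merely leaves a requirement that later lexical input must turn from TODO to DONE.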
We thus have a natural formal analog to the informal intuition that these expressions indicate what sort of interpretation task lies ahead, which is but a simple extension of the analysis provided for wh-expressions themselves. Accordingly, we define an extra lexical definition of was, via the projection of *Fo(WH) in TODO: (74)
was (nonexpletive)
(75)
was (expletive)
The target instruction to construct a DU-formula is essentially anticipatory, inducing a wh-expression lower down in the configuration. There is no unattached node in the tree description whose resolution could ensure the satisfaction of such a specification in the tree, so the only means of provision is from lexical input; hence the requirement encodes the necessity for a wh-expression at some daughter node lower in the tree. Since, furthermore, wh-initial expressions in German are themselves triggered by the presence of an empty node requiring type t onto which they project the +Q attribute, the required wh-expression is predicted to occur at the front of the subordinate clause in such structures. The only departure from the specification of a full wh-expression, other than its characterization as part of TODO, is the specification that this requirement must hold at a daughter node and may not hold at the node at which it is inserted.18 This ensures that such an expletive was never occurs in the same clause as the wh-expression it anticipates, given that full wh-expressions in German, as in English, are restricted to occurring at a position from which they can project the feature +Q, hence a node of type t. The sole functional purpose of this expletive is indeed to indicate that a full wh-form will occur in some clause LOWER than the one it itself initiates. We predict the data in (76)-(81):19 (76)
*Was glaubt er was? (to mean 'What does he believe?')
(77)
*Was mit wem hat Jakob gesprochen? what with whom has Jakob spoken 'Who has Jakob spoken to?'
(78)
Was glaubst du was Hans meint mit wem Jakob gesprochen hat?
(79)
Was glaubst du mit wem Hans meint dass Jakob gesprochen hat?
(80)
*Mit wem glaubst du was Hans meint dass Jakob gesprochen hat?
(81)
Was glaubst du was/dass Hans meint mit wem Jakob gesprochen hat? 'Who do you think Hans said that Jakob has spoken to?'
(76)-(77) both contain the expletive was with some full wh-expression in the same clause (precluded by the clash between the * form of specification and the lexical specification of full wh-expressions). The second was in (76) could not be analyzed as a wh-in situ, because the requirement of an item projecting a lower +Q feature would not be met. (78) correctly has a series of was particles followed by a mit wem initial to its own clause, allowing all lexical specifications to be satisfied. These expletive was expressions merely impose a target: they do not constitute a discrete DU-formula at an identifiable and hence discrete node. (79) also allows the lexical specifications both of the expletive was and of wem to be satisfied; note there is nothing in the full wh-form (as illustrated by the full form was) to guarantee that a clause-initial wh-form is satisfied by the presence of a gap in the very same clause. In (80), the requirement imposed by the expletive was is not satisfied, there being no lower clause-initial wh-expression. Though reiterated expletive was guarantees, at each clause boundary so marked, that a full-form wh will not occur there, even a single instance of was is sufficient to induce the presence of a full wh-initial form lower in the structure. Hence was and dass may alternate in structures such as (81) (cf. Müller and Sternefeld, 1996).20 Finally, we predict that the two construction types may be combined, as long as the gap triggered by some clause-initial full wh-expression precedes a was expletive to be followed by a second full clause-initial wh. We thus predict the wellformedness of (82), also correctly predicting that (82) may not be construed as a multiple wh-question:
(82) Weri ei glaubt was ich meinte mit wemj Jakob ej gesprochen hat? who thinks what I said with whom Jakob spoken has 'Who thinks I said who Jakob had spoken to?'

5.3.1. CROSS-CLAUSAL WH-LICENSING: IRAQI ARABIC

Iraqi Arabic displays a closely related phenomenon.
It has a tense-domain restriction on the occurrence of the wh-expression, and so wh-in situ expressions cannot be construed as questions without some further ancillary device. The extra device is a prefix sh- (a reduced form of sheno (= what)) prefixed to the higher verb. In its presence, a wh-expression is licensed to occur in situ in a tensed clause, apparently suspending its own locality restriction:
(83) sh-tsawwarit Mona [Ali raah weyn]? Q-thought Mona Ali went where 'Where did Mona think that Ali went?'
Since the wh-in situ in embedded clauses is otherwise licensed only if there is no tensed verb intervening between it and the highest matrix clause, this will require a specific lexically triggered definition projecting the feature +Q both onto the higher node of type t and onto some embedded node of type t, and, as in German, imposing the requirement that some subsequent wh word project the formula Fo(WH) and Ty(e): (84)
sh- (expletive)
Notice how in this case the *Fo(WH) requirement will be met by a full wh-expression in situ (recall the definition of *P as P V <x>*P, guaranteeing that *Fo(WH) will hold at a node if Fo(WH) holds at that node). In Iraqi Arabic, furthermore, the wh-expression does not itself project the feature +Q: rather, it requires its presence within a locally tense-specified domain (a characterization which is definitional of the wh-in situ expression). So the required tense annotation on the newly induced complement clause node (the tree node identified as i in (84)) will induce the presence of a finite verb form, which in its turn will license the presence of a full wh-expression within that contained tree structure. The expletive wh-phenomena, both for wh-initial languages and wh-in situ languages, thus emerge as a natural corollary of projecting natural language structure through the dynamics of opening nodes in anticipation of some specified action subsequently filling them. If we put this together with the account of wh-initial and wh-in situ constructions, differentiated primarily by the variation between fixed and unfixed nodes, we have an emergent typology for wh-questions based on the dynamics of how a tree description for natural language interpretation is progressively set up from an initial starting point.
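The informally described tense-domain condition can be rendered as a toy check (our own encoding, not the formal Locality predicate the authors defer): an in-situ wh is licensed if some +Q-marked clause is reachable from it with no tensed clause strictly in between, and sh- amounts to projecting +Q onto the embedded tensed clause as well.

```python
def licensed(tensed: list, q_marked: list, wh_clause: int) -> bool:
    """tensed[j]: clause j (0 = matrix, counting inward) is tensed.
    q_marked: indices of clauses bearing the +Q feature.
    The in-situ wh in clause `wh_clause` is licensed iff some +Q
    clause dominates it with no tensed clause strictly between
    that clause and the wh's own clause."""
    return any(
        q <= wh_clause and
        not any(tensed[j] for j in range(q + 1, wh_clause + 1))
        for q in q_marked)

# (63): tensed matrix, nonfinite complements -- in-situ wh licensed
assert licensed([True, False, False], q_marked=[0], wh_clause=2)
# (65): a second tensed clause breaks the domain -- not licensed
assert not licensed([True, True], q_marked=[0], wh_clause=1)
# (83): sh- also projects +Q onto the embedded tensed clause,
# re-licensing the in-situ wh there
assert licensed([True, True], q_marked=[0, 1], wh_clause=1)
```

On this encoding the apparent "suspension" of the locality restriction in (83) is nothing of the kind: the domain condition is unchanged, and sh- simply supplies a closer +Q clause for it to be satisfied against.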
6. CONCLUSION

The substance of this account of wh-expressions has been the claim that the disjunctive specification made available by statements of the form <x>*P provides the basis for expressing natural linguistic generalizations: in particular, not only characterizing the structural properties of wh-initial sentence forms, but also providing a principled basis from which to elaborate a whole family of generalizations about wh-structures. Wh-initial effects, wh-in situ effects, and the required array of partial movement effects are correctly predicted, as is the array of otherwise puzzling crossover phenomena. Each of the phenomena standardly taken to require independent characterization has been explained from the same set of assumptions about the process of tree growth and its role in the interpretation process. Essential to the account have been two properties: 1. The asymmetry between the input provided by any individual expression on the one hand and its structural role in interpretation on the other; 2. A specification of how such encoded input information is incrementally enriched as part of a left-right process of building up some propositional form as interpretation for a string. The novelty of this account lies in the claim that natural-language expressions may not only specify the content to be assigned to them in context relatively weakly, but may also fail to project a fully defined tree relation to the constituents projected by the items to which they are adjacent in the string. Expressions in a string must therefore in part be interpreted by a process of enrichment which involves not merely fixing the content of the expression relative to context, but also establishing tree relations which may hitherto not have been uniquely fixed. The significance of this account of long-distance dependency and related phenomena lies in two properties.
First, it is presented in terms of the "pragmatic" process of utterance interpretation: building up an interpretation in context. Second, there is no concept of syntactic structure over and above the structure in terms of which the incremental process of interpretation is modeled. The model of the parsing process itself provides the structural framework (indeed, this is the syntax), and it is in terms of this framework that all linguistic explanations are couched. In this articulation of a single level of representation, it is quite unlike Discourse Representation Theory, despite obvious parallels between the resulting tree structure configurations and discourse representation structures.21 Furthermore, the projection of structure from wh-expressions is taken to be part of the process of resolving such relatively weak input specifications, defined, like anaphora resolution, in terms of the (primarily) left-right projection of information from
preceding linguistic input. The explanation therefore falls within a family of explanations which might loosely be called parsing explanations (cf. earlier attempts by Erteschik-Shir, 1973; Marcus, 1980; Berwick and Weinberg, 1984). It should however be stressed that this explanation of the data departs from earlier conceptions of the relation between competence and performance, or semantics and pragmatics, in which the competence model is defined in terms of an independently defined body of syntactic/semantic/phonological axioms which performance/pragmatic explanations take as input. We are not proposing a pragmatic model that takes a fixed, semantically interpreted structural configuration as input, with pragmatic principles applying to this input to yield a set of contextually fixed values. And we are not proposing an explanation of wh-phenomena in terms of parsing strategies merely to come to the conclusion that wh-binding, crossover, wh-in situ, and partial wh-movement effects fall outside the remit of the natural-language computational system, leaving the assumption of a computational system specific to the language faculty reduced but intact. We are, to the contrary, elaborating a model of the parsing process which purports to provide the total vocabulary for explaining structural (syntactic) properties of natural language. Despite the procedural flavor of this framework, many commonalities with other frameworks remain. Together with all other linguistic frameworks, we assume that the lexicon provides the input on the basis of which interpretation is projected, and that such encoded information provides all the information needed to characterize idiosyncrasies of individual languages. Together with other frameworks (HPSG, Categorial Grammar), we assume that lexical specifications include type-logical information fixing the combinatorial properties of individual expressions.
Together with others, we assume that these lexical specifications also include representations of concepts, which in some cases fix the denotational content of individual expressions. However, unlike other frameworks, we assume that such lexical specifications may include procedural instructions on the process of parsing itself, and that a unitary characterization of lexical specifications requires the definition of all such specifications as procedures that provide input to the incremental projection from a string onto some logical form. Furthermore, we claim that this incremental projection of structure is the only level of representation required in characterizing natural-language interpretation. The overall framework in terms of which these lexical specifications are defined is, then, the metalevel theory which defines the inferential, goal-directed process that constitutes the activity of parsing. What we are proposing is that the human faculty for natural language is a capacity for parsing, a specialized inferential capacity for pairing linguistic expressions with the logical forms which they are taken to express, these logical forms themselves being vehicles for inference of an orthodox sort. The shift of perspective has consequences. First, it suggests that the study of syntax has, following the lead of semantics, to become dynamic, defined in terms of the ongoing projection of information on a left-right basis (cf. Johnson and Moss, 1994, for the proposal that current models of grammar might revealingly be recast as dynamic algebras). Second, our concept of competence, in opposition to some concept of performance, has to be revised. We no longer envisage the systems underpinning natural language according to the static pattern imposed by classical Fregean logics, with strings assigned denotational content directly, and some ancillary, entirely separate, and largely unknown theory of performance explaining how these systems are manipulated in communication. Rather, we envisage natural languages as systems specifically developed for the dynamic enterprise of projecting an infinite variety of interpretative content from a finite lexicon. Seen in this light, the underspecification of natural-language content is no longer an embarrassing divergence from formal language systems, to be patched up in the analysis so as to approximate as closely as possible to those systems. Such specifications are, to the contrary, indicative of the purpose for which natural languages are designed. Natural languages are metalevel devices for the projection of vehicles of thought/inference, encoding procedures whereby the intended content can most effectively be retrieved. There is no longer a dichotomy between the perspectives provided by theories of competence and theories of performance. Theories of linguistic competence are indeed theories about the language faculty, and these are theories about the abstract formal properties of the framework which we put to use in parsing. Such theories are complemented by theories of pragmatics. The burden of pragmatic theories, and, more generally, performance theories, is to articulate the general constraints imposed by the cognitive system which determine how the choices made available by the competence system are actually realized in context (Sperber and Wilson, 1986).
The two together combine to yield a theory of linguistic knowledge and use.
NOTES

1. This paper was stimulated by J. Aoun, who posed this question in a talk at the School of Oriental and African Studies, London, in February 1996. We are grateful to Andrew Simpson, Shalom Lappin, and Abbas Benmamoun for conversations over many months, and to the audience at the Bangor conference on Syntactic Categories for comments.
2. Indefinites are a systematic exception to this. Compare Reinhart (1997), Winter (1997), Farkas (1997), Abusch (1994), and Meyer-Viol et al. (in press) for recent attempts to account for this phenomenon.
3. We leave open the question of whether the arity of predicates should include an argument for an event variable, but do not include such an argument position in what follows.
4. Scope effects are also projected from such m-variables, with each determiner projecting an m-variable. Such determiner m-variables may be indexed as dependent on some other term, the choice of the term on which some variable is dependent being an anaphoric-like choice which has to be made during the setting out of the annotated tree structure (cf. Meyer-Viol et al., in press; for a detailed account of the epsilon calculus, cf. Meyer-Viol, 1995).
5. There are specialized function-application rules in case the formula of the argument contains indexed variables (cf. Meyer-Viol et al., in press).
6. This point has been emphasized in both the semantic and the pragmatic literature for over a decade now. Compare Kamp, 1981; Kamp and Reyle, 1994; Barwise and Perry, 1983; Sperber and Wilson, 1986; and the articles within these paradigms which have followed them.
7. The formal specification involves defining an additional Locality predicate, the value of which is shared for all nodes within a domain intervening between a node of Ty(t) annotated with a feature +Tense and some dominated node of Ty(t) also annotated with a feature +Tense.
8. Data such as His mother ignored every student, with his not able to be construed as bound by every student, are, on the account given here, an independent phenomenon to be explained in terms of linear order (cf. Williams, 1994, for a similar view).
9. The literature reports differences in judgments of acceptability with restrictive relative clause crossover data, but in recent months we have been unable to find a single speaker who consistently provides judgments in which (46)-(48) all preclude an interpretation in which the pronominal is bound by the quantifier. So, at least initially, we presume that restrictive and nonrestrictive relatives alike display only strong crossover effects (cf. note 10).
10. This is in effect equivalent to the in situ form having an associated restriction (Labels(+Q)). In many languages (e.g., Japanese), the indication of the associated +Q feature is independently projected by a sentence-final particle.
1 ' The co-sharing of a formula expression in the two trees is most transparently displayed in nonrestrictive relative clauses. However, on the assumption that nominals project a pair of a variable (of type e) and a common noun, the same account can be extended to restrictive relative clause construals. 12 For those speakers for whom weak crossover effects persist in restrictive relative clauses, it appears that the variable projected by the nominal (the interpretation of the nominal in which the variable is contained being as yet incomplete) lacks sufficient denotational value to serve as an antecedent for the pronoun. (Cf. Kempson and Gabbay, 1998, for a discussion of this property within a different account of crossover phenomena in terms of locality.) 13 Wh-questions in Arabic display crossover effects much as in English if the wh is interpreted strictly as an indefinite. Should the wh- expression be interpreted quasireferentially as picking out some specific but not fully identified individual, then some speakers license the use of resumptive pronouns as a means of resolving the unfixed tree position. This independent means of enriching the wh-formula inhabiting the unfixed node provides a basis for identifying the following pronoun, which then, as in English relative-clause crossover phenomena, provides a means of identifying the position in the tree for the initially unfixed formula, thus precluding any following gap. 14 This is in effect equivalent to the in situ form having an associated restriction (Labels(+Q)) (= "a node bearing the annotation Labels(+Q) is somewhere above me
Computation as Labeled Deduction
291
in the tree description so far compiled"). In many languages (e.g., Japanese) in which the in situ form is the regular position for the wh-expression, the indication of the associated +Q feature is independently projected by a sentence-final particle. 15 In addition to this rule is the projection of the Topic position, associated with a clauseinternal position through the use of a resumptive pronoun. We leave this process on one side here, as not pertinent to a w/z-typology. Two alternatives present themselves. Either topic structures are an additional form of LINK structure, or they project an additional option for expansion from a node m requiring Ty(t) allowing a new node being introduced to be characterized as having a tree-node identified as m. Cf. note!3. 16 A restriction licensing only one unfixed constituent per task state, which is standardly imposed across languages, needs to be independently imposed. In some languages there is no such restriction, at least for w/z-expressions (e.g., Bulgarian and Czech). In these languages, in which all wh-expressions may occur preverbally, the characterization of whexpressions as simultaneously projecting a +Q feature at a task projecting a node of Ty(f) and a w/z-formula at an unfixed node, with no preverbal restriction to a projection of but a single unfixed node, is sufficient to allow the preverbal position of all wh-expressions. Simpson (1995) reports this preverbal array of wft-expressions as obligatory, but he informs us that if construed as D-linked, the second wh-form may remain in situ in a postverbal position. 17 The problems posed by so-called partial w/z-movement are especially problematic for the minimalist program, within which no unitary account appears to be possible (cf. 
Beck and Berman, 1996; Horvath, 1997, for recent advocacy of the two opposing "direct dependency" and "indirect dependency" accounts, both granting the necessity of the other form of account for some languages, and Simpson, 1995, for detailed evaluation of these problems.) For a more detailed account of this phenomenon within this framework, compare Kempson et al. (in press). 18 Should it prove possible to argue that the expletive form is a VP clitic as in Iraqi Arabic (cf. section 5.3.1), this stipulation would not be needed. 19 We ignore here the extra complexity associated with the preposition mil and all additional complications needed to predict constructions in which the wh-expression is contained within a larger fronted constituent, as in the English pied-piping construction. 20 For dialects in which sequences of was expletives are obligatory, the phenomenon has to be defined as locally inducing a complement clause node with appropriate properties. Cf. Kempson et al. (in press) for discussion of this and the expletives using a non-was form of wh. 21 It is also quite unlike dynamic predicate logic, whose characterization of anaphoric dependency and wh-questions involves a characterization of content exclusively in model-theoretic terms projected from some discrete syntactic configuration defined over the syntactic string (about which the semantic formalism has nothing to say).
REFERENCES

Abusch, D. (1994). The scope of indefinites. Natural Language Semantics, 2, 83-135.
Aoun, J., and Li, A. (1991). Wh elements: syntax or LF? Linguistic Inquiry, 24, 199-238.
292
Ruth Kempson et al.
Barwise, J., and Perry, J. (1983). Situations and attitudes. Cambridge, MA: MIT Press.
Beck, S., and Berman, S. (1996). Wh-scope marking: direct vs indirect dependency. In U. Lutz and G. Müller (Eds.), Papers on Wh-scope marking: Proceedings of a workshop on the syntax and semantics of Wh-scope marking 1995 (pp. 59-83). University of Stuttgart.
Berwick, R., and Weinberg, A. (1984). The grammatical basis of linguistic performance. Cambridge, MA: MIT Press.
Blackburn, P., and Meyer-Viol, W. (1994). Linguistics, logic and finite trees. Bulletin of the Interest Group in Pure and Applied Logics, 1, 3-29.
Cheng, L. (1991). On the typology of wh questions. Doctoral dissertation, Massachusetts Institute of Technology, Cambridge, MA.
Chierchia, G. (1992). Questions with quantifiers. Natural Language Semantics, 1, 181-234.
Chomsky, N. (1981). Lectures on government and binding. Dordrecht: Foris.
Dayal, V. (1994). Scope marking as indirect wh-dependency. Natural Language Semantics, 2, 137-170.
Erteschik-Shir, N. (1973). On the nature of island constraints. Doctoral dissertation, Massachusetts Institute of Technology, Cambridge, MA.
Farkas, D. (1997). Indexical scope. In A. Szabolcsi (Ed.), Ways of scope taking. Dordrecht: Kluwer.
Gabbay, D. (1996). Labeled deductive systems. Oxford: Oxford University Press.
Gabbay, D. (1994). Classical vs. non-classical logics (the universality of classical logic). In D. Gabbay, C. Hogger, and J. Robinson (Eds.), Handbook of logic in artificial intelligence and logic programming: Vol. 2, Deductive methodologies (pp. 359-500). Oxford: Clarendon Press.
Higginbotham, J. (1981). Pronouns as variables. Linguistic Inquiry, 11, 679-708.
Horvath, J. (1997). The status of wh-expletives and the partial wh-movement construction of Hungarian. Natural Language and Linguistic Theory, 15, 507-571.
Huang, J. (1982). Logical relations in Chinese and the theory of grammar. Doctoral dissertation, Massachusetts Institute of Technology, Cambridge, MA.
Johnson, D., and Lappin, S. (1997). A critique of the minimalist programme. Linguistics and Philosophy, 20, 273-333.
Johnson, D., and Moss, L. (1994). Grammar formalisms viewed as evolving algebras. Linguistics and Philosophy, 17, 537-560.
Joshi, A., and Kulick, S. (1997). Partial proof trees as building blocks for a categorial grammar. Linguistics and Philosophy, 20, 637-667.
Kamp, H. (1981). A theory of truth and semantic representation. In J. Groenendijk, T. Janssen, and M. Stokhof (Eds.), Formal methods in the study of language, Mathematical Centre Tract 135 (pp. 277-322). University of Amsterdam.
Kamp, H., and Reyle, U. (1994). From discourse to logic. Dordrecht: Kluwer Academic Publishers.
Kaplan, R., and Zaenen, A. (1988). Long-distance dependencies, constituent structure, and functional uncertainty. In M. Baltin and A. Kroch (Eds.), Alternative conceptions of phrase structure (pp. 17-43). Chicago: University of Chicago Press.
Kempson, R., and Gabbay, D. (1998). Crossover: a dynamic perspective. Journal of Linguistics, 34, 73-124.
Kempson, R., Meyer-Viol, W., and Gabbay, D. (in press). Dynamic syntax. Oxford: Blackwell.
Koopman, H., and Sportiche, D. (1982). Variables and the bijection principle. The Linguistic Review, 2, 139-160.
Lasnik, H., and Stowell, T. (1991). Weakest crossover. Linguistic Inquiry, 22, 687-720.
Lutz, U., and Müller, G. (Eds.) (1995). Papers on Wh-scope marking: Proceedings of a workshop on the syntax and semantics of Wh-scope marking 1995. University of Stuttgart.
Marcus, M. (1980). A theory of syntactic recognition for natural language. Cambridge, MA: MIT Press.
McDaniel, D. (1989). Partial and multiple wh movement. Natural Language and Linguistic Theory, 7, 565-604.
Meyer-Viol, W. (1995). Instantial logic. Doctoral dissertation, University of Utrecht.
Meyer-Viol, W., Kibble, R., Kempson, R., and Gabbay, D. (in press). Indefinites as epsilon terms: a labelled deduction account. In H. Bunt and R. Muskens (Eds.), Computing meaning: Current issues in computational semantics. Dordrecht: Kluwer Academic Publishers.
Moortgat, M. (1988). Categorial investigations. Berlin: Mouton de Gruyter.
Morrill, G. (1994). Type-logical grammar. Dordrecht: Kluwer Academic Publishers.
Müller, G., and Sternefeld, W. (1996). A'-chain formation and economy of derivation. Linguistic Inquiry, 27, 480-511.
Oehrle, R. (1994). Term-labelled categorial type systems. Linguistics and Philosophy, 17, 633-678.
Ouhalla, J. (1996). Remarks on the binding properties of wh pronouns. Linguistic Inquiry, 27, 676-708.
Pollard, C., and Sag, I. (1994). Head-driven phrase structure grammar. Chicago: University of Chicago Press.
Postal, P. (1993). Remarks on weak crossover effects. Linguistic Inquiry, 24, 539-556.
Reinhart, T. (1997). Wh-in-situ in the framework of the minimalist program. Natural Language Semantics, 6, 29-56.
Reinhart, T. (in press). Interface strategies. Cambridge, MA: MIT Press.
Simpson, A. (1995). Wh-movement, licensing and the locality of feature checking.
Doctoral dissertation, School of Oriental and African Studies, University of London.
Sperber, D., and Wilson, D. (1986). Relevance: Communication and cognition. Oxford: Blackwell.
Williams, E. (1994). Thematic relations in syntax. Cambridge, MA: MIT Press.
Winter, Y. (1997). Choice functions and the scopal semantics of indefinites. Linguistics and Philosophy, 20, 399-467.
FINITENESS AND SECOND POSITION IN LONG VERB MOVEMENT LANGUAGES: BRETON AND SLAVIC

MARIA-LUISA RIVERO
Department of Linguistics
University of Ottawa
Ottawa, Ontario, Canada
In this chapter,1 I argue that Long Verb Movement (LVM) languages are characterized by (a) a PF interface condition on Tense (T) that mentions a Head-Complement configuration, and (b) a LVM process that fronts a nonfinite verb, and applies in PF to satisfy this condition. This has two typological consequences. The first is that unrelated languages such as Breton and Bulgarian share identical second-position effects for tensed Auxiliaries (Aux), and LVM constructions with an untensed V preceding a tensed Aux. The second is that LVM and Verb Second (V2) languages can both be said to exhibit second-position effects in main clauses, but nevertheless differ. The received view is that V2 involves two fronting operations that are syntactic and hence check features. I propose that LVM is a hierarchical fronting process of the PF branch that satisfies an interface condition on T (or a stylistic rule), and not a checking or syntactic operation.

The chapter is organized as follows. Section 1 outlines the system to satisfy the requirements of T in PF in LVM languages, which consists of two parts. Section 2 contrasts LVM and V2 languages on the basis of this system. Sections 3 and 4 discuss similarities and differences between Breton and Slavic languages with LVM, and between these LVM languages and Polish, a Slavic language without the LVM process.

Syntax and Semantics, Volume 32: The Nature and Function of Syntactic Categories
295
Copyright © 2000 by Academic Press All rights of reproduction in any form reserved. 0092-4563/99 $30
296
Maria-Luisa Rivero
1. PF CONDITIONS ON TENSE

The central idea of this chapter is that in LVM languages the functional category T is subject to a bare output condition or PF requirement. This condition is configurational or hierarchical and not linear, does not mention formal features, and can be satisfied via two core syntactic structures: the Head-Complement configuration in (1a), or the Checking configuration in (1b).
On this view, T may satisfy its output condition when it heads a TP that is the Complement of a C with certain PF characteristics. That is, T must be in the structure depicted in (1a), or in the Internal Domain in the sense of Chomsky (1995) of a C that is for now overt, but more precisely visible. This condition is formulated in (2), with H standing for head.

(2) H-Internal Domain Condition
Satisfy the PF condition of T in the internal domain of a C visible in PF.

Alternatively, T may satisfy its PF requirement by appearing in a Checking configuration in the sense of Chomsky (1995), as when an overt V adjoins to T in the structure depicted in (1b). This condition is formulated in (3).

(3) H-Checking Domain Condition
Satisfy the PF condition of T in its H-checking domain.

The intuitive idea is that T requires overt support at the PF interface, and that this support can be supplied under two structural conditions: (a) by a head that is the sister of TP, the maximal projection of T, or (b) by one that is the sister of T itself. The choice between these two structures gives rise to parametric variation, which distinguishes Breton from Slavic, and Slavic languages from one another.

First, differences between Breton and Slavic languages with LVM can be attributed to contrasts in the quantitative use of these two options, as follows. In Breton, T is usually licensed with condition (2) and the Head-Complement configuration in (1a), while condition (3) with the Checking configuration in (1b) is used with just a few verbs. These two licensing options for T are also found in Slavic languages with LVM, but under different circumstances. The Checking configuration in (1b) is used to license T with verbs and with lexical auxiliaries; condition (2) with the Head-Complement configuration in (1a) is used with just the subset of auxiliaries that are functional.
Breton and these Slavic languages, then, are similar in licensing T under condition (2) and the Head-Complement configuration, and
in sharing a LVM process with parallel properties that applies in PF triggered by this condition. In sum, LVM-languages use two structures to license T in PF under different circumstances. Breton uses condition (2) in general and (3) only exceptionally, while the Slavic languages with LVM use condition (2) just for functional auxiliaries. Second, the two options to license T in PF are also at the source of a parametric variation that distinguishes Polish from LVM languages in Slavic, and also Breton. Borsley and Rivero (1994) argue that Polish lacks LVM and displays raising of a nonfinite V to finite Aux (i.e., Incorporation forming a morphological complex). I suggest here that what makes Polish different is that it uses a variant of condition (3) to license T in PF with functional auxiliaries, which does not trigger LVM but an Incorporation or word formation rule that is stylistic (i.e., a hierarchical operation in the PF branch that does not check the formal features of the target against those of the raising V).
2. LONG VERB MOVEMENT VERSUS VERB SECOND

This section compares LVM and V2, deriving differences from the PF condition on T and the PF process to satisfy it in LVM languages.

2.1. Root Clauses

V2 languages exhibit a word-order asymmetry that distinguishes main from embedded clauses, as in (4).

(4) a. Dieses Buch hat Hans gelesen. German
'Hans has read this book.'
b. Ich bedauere [dass Hans dieses Buch gelesen hat].
'I regret that Hans has read this book.'
c. Hat Hans dieses Buch gelesen?
'Has Hans read this book?'

The classic analysis is that the (finite) head raises to C in (4a), while it does not in (4b) (den Besten, 1977, to Branigan, 1996), the V-in-C part of V2. Second, if V is in C, Spec-of-C must also be filled, and a common assumption is that a covert operator fills that position in (4c).

LVM languages contrast with V2 languages in two respects. First, a nonfinite V raises to C in LVM. Consider (5)-(6) from this perspective.

(5) a. Lavaret en deus [he deus desket he c'henteliou].
said 3S have [3S have learned her lessons]
'He has said that she has learned her lessons.'
Breton
b. *En deus lavaret [he deus desket he c'henteliou].
c. *Lavaret en deus [desket he deus he c'henteliou].

(6) a.
Spytal som sa [chi si napisal list]. Slovak
asked have+1S REFL [if have+2S written letter]
'I have asked if you have written a letter.'
b. *Som sa spytal [chi si napisal list].
c. *Spytal som sa [chi napisal si list].
In (5a-6a), V precedes Aux in the main clause, and follows it in the subordinate clause. The proposal is that V fronts to C in main clauses through a LVM process2 that does not apply in the majority of subordinate clauses, which is similar to V-to-C in V2. Second, Spec-of-C must be phonologically empty if the nonfinite V is in C (Rivero, 1993a), which can be illustrated with multiple Wh-movement and LVM, as in (7). (7) a.
Koga kakvo e kupil?
when what have+3S bought
'When has he bought what?'
b. *Koga kakvo kupil e?
c. Kupil li e knigata?
bought Q have+3S book+the
'Has he bought the book?'
Bulgarian (Rudin, 1986:(81b))
For Rudin (1988), Wh-phrases in Bulgarian move to Spec-of-C, which can hold several phrases as in (7a). However, (7b) shows that V cannot front with those phrases, and (7c) indicates that it fronts in interrogatives with no overt Spec-of-C (Rivero, 1993b). Thus, if Spec-of-C is filled with one or more phrases, C cannot hold the untensed V. The same is found in declaratives, as the Breton examples (8)-(9) illustrate. (8)
a. *Al levr lennet en deus Tom. Breton
the book read 3S have Tom
b. *[CP NPk [C' [C Vi] [TP Aux [VP ... ti ... tk ]]]]
(9) a. Al levr en deus lennet Tom.
the book 3S have read Tom
'Tom has read the book.'
b. [CP NPi [C' [C Ø] [TP Aux [VP ... ti ... ]]]]
c. Lennet en deus Tom al levr.
'Tom has read the book.'
d. [CP [C' [C Vi] [TP Aux [VP ... ti ... ]]]]

For Borsley, Rivero, and Stephens (1992), (8) combines Topicalization and LVM, and hence has material in both Spec-of-C and C, while (9a-c) contain only one
overt constituent in CP. They also argue that lennet in (9c) is in C, and distinguish between LVM and VP-topicalization as in (10). Borsley and Stephens (1989: sect. 6) argue that (finite) auxiliaries or verbs are in I=T. (10)
O lenn al levr n' emañ ket Tom. Breton
PRT read the book Neg is Neg Tom
'Tom is not reading the book.'
'Reading the book, Tom is not.'
To conclude, a hallmark of V2 languages is root clauses where C is filled by a finite V, and Spec-of-C is also filled. The hallmark of LVM languages is (a subset of) main clauses with a C filled with the nonfinite V, and a Spec-of-C that is empty of phonological material.

2.2. Nonroot Clauses

Germanic embedded V2 is not homogeneous, which has attracted much attention (Iatridou and Kroch, 1992, for an overview). In standard Dutch, V2 is limited to main clauses. In German and mainland Scandinavian, it is possible in a limited range of subordinate clauses. In Icelandic and Yiddish, it is acceptable in a wide range that includes adjunct and subject clauses. For some, including Vikner (1991), embedded V2 results from CP-recursion as in (11a): C1 takes CP2 as complement, Spec2 holds a phrase, and C2 holds the embedded tensed V. For others, including Diesing (1990), embedded V2 can also result from V in I, and any type of phrase in Spec-IP, as in (11b).

(11) a. V [CP1 C1 [CP2 Xmaxi [C' [C2 Vj] [IP ... tj ... ti ... ]]]]
b. V [CP C [IP Xmaxi [I' [I Vj] [VP ... tj ... ti ... ]]]]

Iatridou and Kroch (1992) relate the restricted embedded V2 of mainland Scandinavian to CP-recursion licensed by a governing V as in (11a). Unrestricted embedded V2, as in Yiddish and Icelandic, involves IP as in (11b).

Embedded LVM also exists but is not homogeneous. There are two groups of languages. One includes Serbo-Croatian and Czech and resembles standard Dutch, with LVM only in main clauses; in these languages, untensed Vs do not precede tensed Aux in subordinate clauses. A second type resembles mainland Scandinavian and includes Old Spanish. Lema and Rivero (1991:254-257) show that this language allows LVM in the complement of bridge Vs, and argue that this results from CP-recursion: the untensed V fronts to C2 in a structure like (11a).
I know of no language with LVM in a wide range of subordinate clauses, so the third group with unrestricted embedded LVM corresponding to Icelandic and Yiddish seems not to exist. If the landing site of LVM is C, this situation can be accounted for. On this view, LVM is restricted to the type of embedded clause with CP-recursion, which contains a landing site for the untensed V. Then, the absence of embedded LVM in Serbo-Croatian or Czech means that there is no
CP-recursion as in standard Dutch, and restricted embedded LVM in Old Spanish indicates that there is CP-recursion, as in mainland Scandinavian or German. In sum, some languages show LVM in declarative complements embedded under bridge verbs, which fits well with the idea that V raises to C in LVM, and other languages disallow embedded declaratives with LVM because they disallow CP-recursion. Breton and Bulgarian lack LVM embedded under bridge Vs, but display interrogative complements with this process, as in (12).

(12) a. N' ouzon ket [ha lennet en deus Tom al levr]. Breton
Neg know+1S Neg [Q read 3S have Tom the book]
'I do not know whether Tom has read the book.'
b. Ne znam [prochel li e Petur knigata]. Bulgarian
Neg know+1S [read Q have+3S Peter book+the]
'I do not know whether Peter has read the book.'

These patterns may look surprising, as embedded V2 is not usually restricted to interrogative complements, and most Germanic languages disallow V2 under question Vs. I discuss this in section 3, and argue that the hypothesis that LVM is a process that satisfies a PF interface condition of a functional category can provide an account of these data.
As LVM satisfies an external condition and is not a checking operation, (a) it can apply in the PF branch and (b) have an output that does not establish a checking configuration in the sense of Chomsky (1995). However, LVM is a hierarchical not a linear process, and comparable to the operations called stylistic in Chomsky (1995) and Chomsky and Lasnik (1977), and the PF-driven rules in Zubizarreta (1995) and Reinhart (1995). In other words, LVM applies in PF but has "syntactic" and not "phonological" characteristics. With this proposal in mind, let us look at differences between LVM and V2 beginning with V-in-C. My claim is that LVM resembles V2 in thatfiniteness— or T—constitutes a trigger for both processes: the untensed V raises to C to satisfy
a condition of T in TP. The differences are (a) that LVM is not a checking operation, and (b) that LVM can establish a Head-Complement configuration, and thus is not limited to the checking configuration required of syntactic Move. Consider the analysis proposed for (13), which repeats (9) with TP replacing IP.

(13) a. *En deus lennet Tom al levr.
3S have read Tom the book
b. Lennet en deus Tom al levr.
c. [CP [C' [C Vi] [TP Aux [VP ... ti ... ]]]]

The auxiliary en deus (T) imposes the interface condition (2) that mentions the structure in (1a): it must appear in PF in the complement of a head that is visible. This PF condition can be met in a variety of ways examined in section 3, but the LVM process satisfies it: it fills C with V, which makes TP the complement of an overt C. On this view, LVM results in the Head-Complement configuration depicted in (1a), which is similar to the output of Merge in the computation, and contrasts with the output of Move. That is, LVM does not establish a checking configuration between the verb and the auxiliary containing T, in contrast to what is required for syntactic Move. In my view, LVM can establish the output in (1a) because it does not check features, and in Rivero (in prep.) I develop the idea that this output corresponds to the case where the category affected by the process (V) keeps on projecting once it raises, and the category that serves as target/attractor in the movement (Aux) does not further project. By contrast, in the case of syntactic Move, the target projects and the moved category does not, and this ensures that a checking configuration is established where formal features can be checked in the sense of Chomsky (1995). The availability of two configurations to license T in PF suggests that interface conditions use the same structures as the conditions of the internal system.
On the one hand, the interface condition in (3) is based on the configuration used to license (i.e., check) formal features in the system, which most often results from Move and not Merge. The interface condition in (2), on the other hand, is based on the Head-Complement configuration, which in the computation is used for Theta relations and Selection and results from Merge. In sum, PF offers a choice because the two configurations for internal requirements are also used for interface or external requirements.

The hypothesis that LVM satisfies an interface condition on T based on the Head-Complement configuration derives a major syntactic difference between yes-no questions in LVM and V2 languages. We saw that in V2 languages finite Auxiliaries can be string-initial in yes-no questions. LVM languages offer several syntactic strategies for such questions, but disallow string-initial Auxiliaries, as Slovak illustrates. In Slovak, LVM applies in the same way in declaratives and yes-no questions.
(14) a. *Si napisal list? Slovak
have+2S written letter
b. Napisal si list?
written have+2S letter
'Have you written the letter?'

The difference between German and Slovak results from the PF condition on T. Fronting the untensed V to C is independent of illocutionary factors, but ensures that at PF the tensed Aux heads the Complement of an overt C. Yes-no questions may contain an unpronounced operator in Spec-of-C, but I show in section 3 that condition (2) can be satisfied only if overt material and not unpronounced material fills Spec-of-C.

The second aspect of LVM is that Spec-of-C must be phonologically empty if the nonfinite V is in C. This means that V-raising in V2 applies in all main clauses, while LVM operates only in a subset of them, and is in complementary distribution with movement to Spec-of-C. This difference follows from the proposal that LVM is a PF process that applies to satisfy an interface condition, and not to check formal features. LVM becomes unnecessary and in fact is blocked if Topicalization or wh-movement establish a configuration that is a legitimate PF object that satisfies the requirements of T. Under the standard assumption that Topicalization/wh-movement apply for feature checking, these rules are obligatory when the appropriate formal feature is present in the structure; thus they block LVM, which is not triggered by a feature. LVM is triggered by the need to satisfy an interface condition of T, plays no internal role in the system, and will not operate once feature-checking frontings succeed in establishing the configuration that satisfies T's requirement. In this way LVM has a "last resort" flavor in the sense of Chomsky (1991) and Epstein (1992), but at the same time is a process that violates the Last Resort Principle in the sense of Chomsky (1995). In section 3, I argue that this difference as to feature checking between LVM in PF and syntactic rules accounts for contrasts as to second- and third-position effects in Breton.
The core idea is that feature-checking rules each check a (different) formal feature and can co-occur, and the auxiliary in T may end in third position. By contrast, LVM is a process which usually leaves an auxiliary in second position; it applies in PF only if fronting in the syntax does not establish the configuration to license T. Chart (15) summarizes the differences between LVM and V2 languages. First, with LVM finiteness is located in T and imposes a PF interface condition, but with V2 finiteness is located in C and imposes an internal condition. V raises to C in both V2 and LVM, but in V2 finite raising is a checking or syntactic operation, and in LVM nonfinite raising is a PF or stylistic rule with a configurational output that moves V to C to satisfy the external condition of T in TP. Second, V-to-C in V2 combines with other feature-checking processes that fill Spec-of-C, with each rule playing a different role in the system. By contrast, LVM is incompatible with
feature-checking operations that fill Spec-of-C in the syntax, given that those rules establish the appropriate configuration to license T in PF.
(15)
                                  LVM languages          V2 languages
  T in Complement of visible C    Yes                    No
  V-to-C trigger                  PF condition           Last Resort
  Content of C in PF              Untensed V             Tensed V
  Content of Spec-of-C in PF      Empty (if C filled)    Usually filled
2.4. Differences in LVM Languages

LVM languages are not identical, and two differences related to T deserve mention in this section. The first is that Breton is VtensedSO like the Celtic languages (Anderson and Chung, 1977; Borsley and Stephens, 1989; Schafer, 1992), while other LVM languages are not. The contrast is observed in embedded clauses, where the rigid VtensedSO of Breton contrasts with flexible order in Slavic, including SVO as in Bulgarian (16b).

(16) a. Mona a lavar [e oar Yann ar respont]. Breton
Mona PRT says PCL knows Yann the answer
'Mona says that Yann knows the answer.' (Schafer, 1992: (4))
b. Petur znaex [che decata vidjaxa knigata]. Bulgarian
Peter knew that children+the saw book+the
'Peter knew that the children saw the book.'

Clause-initial finite Vs characterize VSO languages, as illustrated with Welsh (17). Breton belongs to the VSO type, so it may appear surprising that tensed items cannot be first in root clauses in this language, as illustrated in (18).

(17)
Gwelodd Emrys ddraig. Welsh
saw Emrys dragon
'Emrys saw the dragon.'
(18)
*Lenn Anna al levr. Breton
reads Anna the book
'*Anna reads the book.'
I attribute the Breton restriction to the PF requirement on T, which is absent in other VtensedSO Celtic languages: TP must be the Complement of a visible head. For the moment visible is equivalent to overt. As a result, VtensedSO order is reserved in Breton for environments where this PF condition is satisfied, which include the embedded clause in (16a). In brief, Breton is V-initial and the Slavic languages are not, but the restriction on T makes the Celtic language resemble the Slavic languages.
A second difference concerns finite verbs versus auxiliaries. We just saw that Breton disallows sentence-initial auxiliaries and verbs: (13a) and (18). In West and South Slavic, by contrast, auxiliaries traditionally known as clitics cannot be sentence-initial (14), but tensed verbs and lexical auxiliaries can be, which is discussed in section 4:

(19)
Vidjaxa (decata) knigata (decata).
saw (children+the) book+the
'The children saw the book.'
Bulgarian
I attribute this contrast to the PF licensing system for T summarized in (20).

(20) Tense-licensing in PF in LVM Languages

                V-raising to T        TP as Complement
  Language      V        Aux          V        Aux
  Breton        No       No           Yes      Yes
  Slavic        Yes      No           No       Yes
In Breton T is licensed in the Head-Complement configuration in almost all cases. In Slavic, this structure is reserved for auxiliaries, and the Checking configuration in (1b) is operative with verbs. On this view, Breton T must appear in PF in a complement, which means that finite verbs and auxiliaries cannot be sentence-initial. In Slavic, T in verbs is licensed in the configuration in (1b), which means that tensed Vs can be sentence-initial as in (19), and hence need not be in a projection that is a complement.
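As a purely illustrative aside, not part of the original analysis, the parametric options summarized in chart (20) can be encoded as a small lookup table. The function name and the string labels "checking" (V-raising to T, (1b)) and "complement" (TP as Complement of a visible C, (1a)) are hypothetical choices of this sketch:

```python
# Toy encoding of chart (20): for each language, which PF configurations
# license T on a given tense bearer ("V" or "Aux").
# "checking"   = V-raising to T, the Checking configuration (1b)
# "complement" = TP as the Complement of a visible C, configuration (1a)
LICENSING = {
    "Breton": {"V": {"complement"}, "Aux": {"complement"}},
    "Slavic": {"V": {"checking"},   "Aux": {"complement"}},
}

def t_licensed(language, tense_bearer, configuration):
    """True if T on `tense_bearer` is licensed in `configuration`."""
    return configuration in LICENSING[language][tense_bearer]

# Breton finite verbs cannot rely on the Checking configuration,
# hence cannot be sentence-initial, cf. (18):
assert not t_licensed("Breton", "V", "checking")
# Slavic tensed verbs can be sentence-initial, cf. (19):
assert t_licensed("Slavic", "V", "checking")
```

The point of the table form is that the Breton/Slavic contrast reduces to a single cell: only the verb row differs between the two language types.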
3. BRETON

In this section, I argue that Breton T is licensed mainly via the internal domain condition in (2) and hence in the configuration in (1a), which is the cause of second- and third-position effects with most tensed items. The Checking configuration in (1b) and (3) is used with a few Vs, leading to some first-position effects. The preference for condition (2) and LVM in PF makes Breton contrast with other Celtic languages and resemble the Slavic languages.

3.1. The H-Internal Domain and Second Position

The H-internal domain condition in (2) accounts for the second position of most tensed items in Breton. This principle requires that tensed root clauses have a layer above TP, which is identified here with CP, and frontings in syntax or PF ensure the projection of this level.
First consider perfect auxiliaries. Affirmative root constructions may satisfy the H-internal domain condition by a phrase in Spec-of-C, or a V that fills C through LVM.3

(21) a. Al levr en deus lennet Tom. Breton
b. [CP NPi [C' [C Ø] [TP Aux [VP NP V ti ]]]]
c. Lennet en deus Tom al levr.
d. [CP [C' [C Vj] [TP Aux [VP NP tj NP]]]]
'Tom has read the book.'
If a phrase moves in the overt syntax to Spec-of-C, as in (21a), it signals that C must be projected before Spell-Out, which makes the position visible in PF. If it is assumed that a covert/LF movement raises an abstract operator in yes-no questions, under our approach this will not make C visible. If V moves to C as in (21c-d), this head contains overt material.

Now consider synthetic Vs, as in (18), repeated in (22a). With just a tensed V, the option with V in C fails to occur. If V moves to C, it does not head a complement and violates the internal domain condition. A V in T is also impossible with no overt material in CP, as TP is not a complement, or complements a C that is not visible in PF. The derivation with fronting of a phrase, as in (23), complies with the requirement on C.

(22) a. *Lenn Anna al levr. Breton
b. *[CP [C' [C Vi] [TP ti NP NP]]]
c. *[TP Vi [VP NP ti NP]]

(23) a. Al levr a lenn Anna.
the book PCL read+PRESENT Anna
'Anna reads the book.'
b. [CP [Spec-C NPj] [C' [C Ø] [TP a Vk [VP NP tk tj ]]]]

(22a) is deviant, and Stump (1984:298) notes that sentences with sentence-initial particles are also ill formed: *A lenn Anna al levr. Following Stephens (1982), Stump (1984, 1989), and Schafer (1992), I assume that the particle is in T, not in C (contra Hendrick, 1991), which means that sentences with initial particles are ruled out for the same reason as (22): they lack a filled=overt C.

Consider ober 'do' as tense carrier (Borsley, Rivero, and Stephens, 1986; Schafer, 1994; Stephens, 1982; Wojcik, 1976). Ober is inflected for Tense and takes a VP complement, with V raising to C by LVM, as in (24a). The untensed V is the visible head required by the H-internal domain condition to license T.

(24) a. Lenn a ra Anna al levr. Breton
read PCL do+PRESENT Anna the book
'Anna reads the book.'
b. [CP [C' [C Vi] [TP a Aux [VP NP ti NP]]]]
306
Maria-Luisa Rivero
Thus, Breton sentences may come in analytic or synthetic versions. In the synthetic construction in (23), the lexical V is inflected for Tense, and Xmax-fronting to Spec-of-C applies and makes C visible. In the analytic construction in (24), Aux is inflected for Tense, and the lexical V is the head of its nonfinite complement. In this situation, LVM and Xmax-fronting, as in Al levr a ra Anna lenn, are the two alternative options to license T. Consider now negation. Initial ne in Breton ne ... ket counts for first position, as in (25):

(25) a. N' en deus ket lennet Tom al levr. Breton
     neg 3S have neg read Tom the book
     'Tom has not read the book.'
     b. Ne lenn ket Anna al levr.
     neg reads neg Anna the book
     'Anna does not read the book.'

Here Breton resembles other Celtic languages and some of the Slavic languages, as in (26).

(26) a. Nid ydyw Megan ddim yn cysgu. Welsh
     neg is Megan neg in sleep
     'Megan is not sleeping.'
     b. Ne sum prochel (az) knigata. Bulgarian
     neg have+1S read (I) book+the
     'I have not read the book.'

For some, Celtic Neg occupies C (Chung and McCloskey, 1987; Rouveret, 1991), an idea adopted for Breton in Hendrick (1988, 1991) and Borsley, Rivero, and Stephens (1996). Given this assumption, Neg licenses T as the overt head in C that takes TP as complement, as in (27).

(27) [CP [C Ne] [TP [T lenn] [ket Anna al levr]]]
Now consider subordinate clauses, as in (28). These do not contain overt complementizers, have the Aux in first position, and disallow LVM.

(28) a. Lavaret he deus Anna [en deus lennet Tom al levr.] Breton
     said 3S have Anna 3S have read Tom the book
     'Anna said that Tom had read the book.'
     b. V [CP [c' [C° 0] [TP en deus [VP lennet Tom al levr]]]]

If such clauses are CPs with a null C, as in (28b), C is visible in PF due to selection. That is, the main V contains a feature encoding the type of clause it selects (i.e., declarative), and a similar feature is found in the embedded C. When V is combined with a clause via Merge, the feature in V and the one in C must match, and this information is available in the remainder of the derivation including PF, and
Finiteness and Second Position
307
makes C visible. In (28), then, TP is the complement of a nonovert but visible C. If V in fact takes a TP complement and there is no CP, V is the visible head that licenses T in PF. The Internal Domain Condition is an interface principle for Tense, as the following cases demonstrate. First, the LVM construction in (29) shows that nonfinite auxiliaries can be initial, so there is no prohibition against auxiliaries in first position.

(29) Bet am eus kavet al levr. Breton
     had 1S have found the book
     'I have found the book.'

Second, imperatives lack T (Beukema and Coopmans, 1989; Zanuttini, 1991; Rivero, 1994a), show Agr (Person and Number), and can be initial as in (30). This shows that verbs with Agr and no T need not be in a complement.

(30) Sent ouzh da vamm! Breton
     obey+2S to your mother! (Schafer, 1992: (38))
     'Obey your mother!'

Third, finite Vs show T and Agr with null subjects, but only T with overt subjects (Anderson, 1982; Borsley and Stephens, 1989; Hendrick, 1988; Schafer, 1992; Stump, 1984, 1989). Regardless of Agr, however, finite verbs must appear in a complement, as in (31), and cannot be initial. This shows that the Internal Domain Condition is sensitive to T and disregards Agr.

(31) a. Levriou a lennont. (Stump, 1984)
     books PCL read+3P
     'They read books.'
     b. Levriou a lenn (*lennont) ar vugale.
     books PCL read+3S (*read+3P) the children
     'The children read books.'

In sum, in Breton T is licensed when TP is the complement of a visible head. A head is visible (a) if it is filled overtly, as in LVM, (b) if its Spec is filled overtly, as in Topicalization, or (c) if its projection is selected, if we assume that embedded declaratives have a null C.

3.2. The H-Internal Domain and Third Position

Two types of third-position effects result from the internal domain condition: (a) when feature-checking rules co-occur, or (b) when an initial constituent is in a projection that does not have TP as its internal domain.
Let us begin with checking operations by recalling that LVM applies to license
T, and will not operate when a Spec-of-C is filled in the syntax: LVM does not co-occur with Topicalization. Checking operations must apply obligatorily to check features, so they may combine in ways that leave the auxiliary in third position, which can be illustrated with Neg-fronting and Topicalization. Under the assumption that Neg originates within the clause (Borsley et al., 1996), Neg-fronting operates to check a strong [+neg] feature in C. The standard assumption is that Topicalization checks a Top/Focus feature in Spec-of-C. When these two rules combine, as in (32), Aux is in third position.

(32) a. Al levr n' en deus ket lennet Tom. Breton
     the book neg 3S have neg read Tom
     'Tom has not read the book.'
     b. [CP Al levr [c' [C° n'] [TP en deus ket lennet Tom]]]

The purpose of LVM in PF is to make C visible. Raising Neg to C in the syntax also has this effect, which is why LVM does not combine with Neg-fronting (Stephens, 1982), as in (33). In other words, LVM is usually seen with an auxiliary in second position. (33)
*Lennet n' en deus ket Tom al levr. Breton
read neg 3S have neg Tom the book
One exception to this is the auxiliary in third position in left-dislocations and yes-no questions. First consider left-dislocated phrases, which do not usually "count" for first position, in contrast with topicalized phrases, which usually do. This characteristic is illustrated in Breton (34): the dislocated phrase is followed by a V fronted by LVM and the auxiliary in third position. That topicalized phrases differ is seen in (35): they disallow LVM, or count as first position:

(34) a. Yann, roet meus al levr dezhan. Breton
     Yann, given 1S+have the book to+him (Schafer, 1992: (44b))
     'Yann, I've given the book to him.'
     b. [TOPP NP [CP [c' [C° Vi] [TP Aux [VP ti NP PP]]]]]

(35) a. *Al levr lennet en deus Tom. Breton
     the book read 3S have Tom
     b. *[CP NPk [c' [C° Vi] [TP Aux ti ... tk ]]]

I account for this contrast using the Left Dislocation analysis of Chomsky (1977), updated in Lasnik and Saito (1992:76ff): such constructions contain a projection called here TOP(ic)P, with the base-generated dislocated phrase as Spec and a null head. (36)
[TOPP Xmax [TOP' [TOP 0] [CP [c' C° TP]]]]
On this analysis, C prevents TOP from having TP as its internal domain. TP is in the complement domain of TOP, but not its minimal complement. In (34), then,
LVM places V in C to license T. Topicalization, by contrast, is a movement to Spec-of-C that establishes in the syntax the internal domain for T, and so blocks LVM. Now let us consider yes-no questions such as (37), where the question particle ha is followed by sonjal 'think' fronted by LVM, the particle a, and Aux 'do', which means that they are similar to Left-Dislocations. Wh-phrases are like topic phrases and incompatible with LVM, as in (38) (ha is not used in wh-questions).

(37) Ha sonjal a raint er bleuniou? (Desbordes, 1983:84)
     Q think PCL do+FUT+3PL the flowers
     'Will they think of the flowers?'

(38) a. Piv en deus lennet al levr? Breton
     who 3S have read the book?
     'Who has read the book?'
     b. *Piv lennet en deus al levr?

Ha heads a phrase dubbed Q(uestion)P, as in (39), and takes a CP-complement that Borsley et al. (1996) assume it does not L-mark.
Alternatively, ha as a functional head does not select a specific type of complement. This means that the question particle can be merged in the computation with a complement that does not share an equivalent feature standing for, roughly, illocutionary force: ha is [+Q], but the C that projects CP in (39) does not contain this feature. The assumed lack of selection between ha and C prevents C from being visible in PF unless either it or its Spec is filled overtly. In structures such as (39), then, LVM fills C so that TP can be the complement of a visible head. We noted in section 2 that embedded LVM is found in ha-interrogative complements, as in (12), repeated now as (40a). Such complements also allow Topicalization, as in (40b).

(40) a. N' ouzon ket [ha lennet en deus Tom al levr.] Breton
     neg know+1S neg [Q read 3S have Tom the book]
     b. N' ouzon ket [ha al levr en deus lennet Tom.]
     neg know+1S neg [Q the book 3S have read Tom]
     'I do not know whether Tom has read the book.'
The analysis given for main clauses can account for these embedded questions. If ha is merged with an unselected CP-complement, Xmax-fronting to Spec-of-C in the syntax (Topicalization) and LVM to C in PF are two procedures that make C visible to license T in PF. Thus, the interface condition on T that mentions the Head-Complement configuration is ultimately responsible for LVM in clauses that contain the question particle ha, whether they are embedded or not, and accounts for verb fronting to C in syntactic environments where, as noted in section 2, V2 phenomena are usually absent in Germanic.4 In sum, third-position effects in Breton result from checking operations that combine, as when Topicalization co-occurs with Neg-fronting. These two operations block LVM in PF. Third-position effects also result when a first constituent is in a projection that does not have TP as its internal domain, as with dislocations and ha-questions, in a structural situation that allows LVM.

3.3. The H-Checking Domain and First Position

In Breton root clauses, first-position effects with tensed Vs are lexically determined. Aspectual mont 'go' and eman 'be' are the two auxiliaries that can be sentence-initial, or head clauses that are not complements. Eman takes either a PP or a progressive complement, as in (41).

(41) Eman Yann o lenn al levr. Breton
     is Yann PROG read the book
     'Yann is reading the book.'

I suggest that these entries have a feature that allows them to satisfy all the requirements of T, including its PF-condition. One way to implement this idea is that mont and eman raise from an internal position in the clause, adjoin to T, and satisfy the PF-interface condition of T via the H-Checking Domain Condition, so they can be sentence-initial.

3.4. Breton Clitic Pronouns

In some languages, an internal domain condition similar to (2) serves to license functional categories like D (clitic pronouns); this gives rise to pronouns in second position, as in Slavic.
In Breton, condition (2) is restricted to T and does not apply to D. Breton clitic pronouns do not impose an interface condition, but adopt the characteristics of the verb they are attached to. Consider (42).

(42) a. E desket en deus Yann. Breton
     him taught 3S have Yann
     'Yann has taught him.'
     b. [CP [c' [C° [V° CL [V° V]]i] [TP Aux [VP NP ti]]]]
c. *E deskas Yann. Breton
     him taught Yann
     '*Yann taught him.'
d. *[TP [T° [V° CL [V° V]]i] [VP NP ti]]
When clitic pronouns are attached to untensed Vs, as in (42a), they can be initial, appearing in a projection that is not a complement. This sentence involves LVM of the untensed V with the pronoun. This verb does not contain T, and since the pronoun imposes no interface condition of its own, the two share the initial position. By contrast, when clitics attach to tensed Vs, they must be within a complement projection and cannot be initial: (42c). The finite verb deskas hosts a preceding clitic and contains T, which means that it must head a complement; it does not, so the sentence is deviant. In brief, clitic pronouns can be initial only when attached to nonfinite verbs. Breton shows a dichotomy between T and D that has no counterpart in Slavic. In Breton, T must be in an internal domain, but D need not be, which means that Wackernagel phenomena arise with finite verbs and auxiliaries, but not with clitic pronouns. In Slavic, Wackernagel phenomena appear with some finite auxiliaries and with clitic pronouns. To bring these Breton and Slavic phenomena under a common umbrella, I suggest that the fundamental idea is that certain functional categories (as opposed to clitics) must satisfy PF interface conditions, and hence that this variation can be attributed to functional categories. This means that I do not emphasize the notion "clitic" when discussing second-position phenomena. Under my view, Breton and Slavic share the licensing system for the functional head T but not the licensing system for functional D, and I maintain that tensed Aux in Slavic or Breton can be considered "clitics" because they contain a T that imposes a bare output condition, while untensed Aux are not clitics because they lack the relevant T. On this view, which allows one to unify the Breton and Slavic phenomena, "clitic" becomes a derivative, not a primitive, notion.
If "clitic" was the crucial notion, the curious conclusion would be that all Breton tensed Vs and Aux are "second-" position clitics, and that only those Breton pronouns that are attached to tensed Vs are second-position clitics. As stated, my proposal is that T in Breton and Slavic must satisfy similar interface conditions, and those resemble the ones imposed by D in Slavic. 3.5. Summary Breton is head-initial, which means that no constituent precedes T within TP, but the PF interface condition called the H-Internal Domain Condition requires that TP be the complement of a visible C. This provides a unitary account for the "second-" and "third-" position of finite heads in Breton root clauses. We have seen that the finite auxiliary/verb in TP is in second position in (a) affirmative w/z-questions, which have the w/i-phrase in Spec-C, (b) affirmative topicalizations
with the topic in Spec-C, (c) negative clauses with ne in C, and (d) LVM constructions with the untensed V in C. Finite auxiliaries/verbs are third when feature-checking rules co-occur, as in (e) negative wh-questions, with the wh-phrase in Spec-C and ne in C, and (f) negative topicalizations, with the same characteristics. Third position is also possible with an initial constituent above CP, as in (g) dislocations, with the dislocated phrase in TOPP, and (h) questions with ha, which resemble dislocations. The verbs in first position are mont and eman, and they license T in PF via the Checking Configuration, or (3).
4. SOUTH AND WEST SLAVIC

The aim of this section is to establish that South and West Slavic shares with Breton the PF licensing system for T. Slavic languages that participate in this system, which need not be identical in other respects, include Bulgarian, Czech, Serbo-Croatian, and Slovak. Polish participates in this licensing system in a way that differs, and is discussed as one example of parametric variation on the Head-Complement and Checking configurations. We have seen that in Breton, T is licensed via the H-Internal Domain Condition, with the exception of mont and eman, which invoke the H-Checking Domain Condition. In this section, I argue that Slavic uses these two licensing modes, but for different lexical items, which leads to quantitative differences. The H-Checking Domain Condition and (tensed) V-to-T apply with verbs and lexical auxiliaries, and the H-Internal Domain Condition affects the Slavic Aux traditionally labeled clitics, and triggers LVM (but not in Polish). Slavic LVM has the characteristics discussed for Breton: it moves the untensed V to C to make TP the internal domain of a visible C, which licenses T. As a result, Slavic shares with Breton Wackernagel phenomena for tensed Aux. I mentioned earlier that Breton and Slavic contrast as to systems for the functional category D (clitic pronouns), which will not be discussed.

4.1. Verbs versus Auxiliaries

T-licensing conditions distinguish Vs and Aux in Slavic. Slavic Vs need not head complements, as in (16a), repeated as (43). (43)
Vidjaxa decata knigata. Bulgarian
saw children+the book+the
'The children saw the book.'
T is satisfied via the H-checking Domain Condition as in (44) through (tensed) V-to-T. Slavic Vs differ from most Breton Vs, but not from mont and eman, which use (44), so the difference reduces to lexical variation:
(44)
[TP [T° Vi [T° T]] [VP ... ti ...]]
VSO in questions with dali in C (Rudin, 1986; Rivero, 1993b) shows that V is not in C:

(45) a. Dali vidjaxa decata knigata?
     Q saw children+the book+the
     'Did the children see the book?'
     b. [CP [c' [C° dali] [TP Vi [VP NP ti NP]]]]

According to Lema and Rivero (1990) and later work, Auxiliaries fall into two classes. When Aux are positionally free, they resemble Vs in various ways and belong to the lexical class, a good example being the modal equivalent to must. By contrast, Aux that are positionally restricted do not resemble Vs in the same way, and show properties of temporal affixes, so they are dubbed functional. In this chapter, I propose that in Slavic the functional class uses the internal domain condition to license T, while the lexical class is identical to V and uses the checking configuration (I offer no explanation as to why this is so). Arguments for the lexical/functional dichotomy are given for Slavic and Old Romance LVM-languages in Rivero (1994b), and are repeated here only partially, with Czech as the case in point. Let us begin by introducing these two classes in view of positional restrictions. The Czech perfect Aux in (46) is functional, and must head a complement. The future Aux in (47) is lexical, is free from this restriction, and behaves like a tensed V. (46) a.
Tehdy jsem pisal knihy. Czech
     then have+1S written books
     'Then I {have written/was writing} books.'
a'. [CP Tehdy [C° 0] [TP jsem pisal knihy]]
b. Pisal jsem knihy tehdy.
     'I {have written/was writing} books then.'
b'. [CP [C° pisali] [TP jsem [VP ti knihy]]]
c. *Jsem pisal knihy tehdy.
d. *Tehdy pisal jsem knihy.
(47) a. Budu pisat knihy.
     will+1S write books
     'I {will write/will be writing} books.'
     b. [TP Budu [VP pisat knihy]]

Similarities with Breton arise with Slavic functional Aux, but not with lexical Aux or ordinary Vs. Slavic is similar to Breton in that auxiliaries must be in the complement of a visible head: (46). Slavic shares strategies with Breton to place tensed Aux in a complement. That is, Xmax may fill the Spec of the superordinate category, as in (46a), or an X° may fill the head of that category, as in (46b), through LVM. Slavic is like Breton in that LVM fails to apply if Spec-of-C is filled: (46d). In
Slavic, LVM is also absent from embedded clauses, which usually have a C that is overt, as in Slovak . . . chi si napisal list '. . . if he has written the letter.' In brief, Breton and Slavic are parallel in that they share the H-Internal Domain Condition for T, but contrast because they use it for different lexical items. In addition, Breton and Slavic are similar in that they share the LVM process. Example (48) serves to show that the H-Internal Domain Condition mentions Tense and not Agr, as established for Breton in section 3.

(48) Bil sam chel knigata. Bulgarian
     had have+1S read book+the
     'I have read the book.' (Lit. 'I have had read the book.')

In this sentence, an auxiliary in participle form is fronted by LVM. This auxiliary shows Agr (masculine, singular) but lacks T, and can head a root projection, unlike sam, which shows both Agr and T. So I conclude that the H-Internal Domain Condition is sensitive to T, not Agr, and that auxiliary be/have must appear in an internal domain when finite (sam), but not when nonfinite (bil). In traditional terms, the Slavic perfect is a "clitic" only when tensed, not when untensed, which means that the bare-output condition that this auxiliary is subject to resides in T. In my analysis, positional restrictions, or the inability to license T, are a sign of the functional nature of some Aux, but additional factors reviewed next separate functional from lexical Aux. This dichotomy has more predictive power than the traditional division between clitic and nonclitic Aux, and offers the advantage of capturing similarities between Breton and Slavic through the idea that functional categories such as T (and not necessarily clitics) display interface conditions. My approach does not deny the distinction between clitics and nonclitics, but considers it one manifestation of the more fundamental contrast between functional and lexical categories. For Lema and Rivero (1991), lexical and not functional Aux license VP-Preposing.
Thus, VP-preposing is fine with the Czech future Aux but not with perfect Aux (Rivero, 1991: (4)): (49) a.
[Kupovat knihy] budu.
     buy+INF books will+1S
     'Buy books I will.'
b. *[Koupil knihy] jsem.
     bought books have+1S
     '*Bought books I have.'
Analyses for Slavic VP-Preposing await development, but the idea that lexical Aux may establish a Theta-role-type relation with the VP-complement could explain why this complement shows extraction properties similar to those of an NP-object.
Another difference is that functional Aux precedes lexical Aux when the two combine (Rivero, 1994b). In (50), the perfect precedes the modal, which is usual in Slavic. (50)
Tehdy jsem musel chtit vedet. Czech
then have+1S must want know
'Then I must have wished to know.'
If functional categories form extended projections with lexical categories, as in Grimshaw (1991), this order follows if the Modal is lexical and the Perfect functional. Modals are not second-position items, and allow VP-Preposing (Rivero, 1991: n. 4), so these different properties coexist. In Czech (as in Slovak and Polish), the position of Neg establishes an interesting contrast between lexical and functional Aux (Rivero, 1991): (51) a.
Tehdy jsem ne-pisal knihy. Czech
     then have+1S not-written books
     'Then I have not written books.'
b. *Ne jsem pisal knihy tehdy.
c. Ne-pisal jsem knihy tehdy.
d. *Jsem ne-pisal knihy tehdy.
(52) a. Ne budu pisat knihy.
     neg will+1S write books
     'I will not write books.'
     b. *Budu ne-pisat knihy.

Examples (51a) and (52b) show that ne follows the perfect Aux and cannot attach to this item. In (51c), LVM moves the untensed V to C with ne 'not' attached to it; therefore, ne fails to immediately precede the functional Aux. By contrast, ne immediately precedes the future Aux in (52a), and tensed Vs, as in (53). If the future Aux belongs to the lexical class, this distribution is principled.

(53) Ne-pishem knihy. Czech
     neg-write+PRES+1S books
     'I am not writing books.'

The traditional view that attributes positional restrictions to clitics also establishes two classes of auxiliaries, but cannot account for word-order differences other than second position. Thus the functional/lexical distinction is superior to the clitic/nonclitic view, both from the contrastive perspective that seeks to relate Breton and Slavic, and from a point of view internal to Slavic. I propose that tensed lexical Aux are merged in the lexical layer of the clause like ordinary Vs, and raise to T like them. This raising establishes a checking configuration that licenses T in PF under the Checking Domain Condition. This
accounts for why the Czech future Aux escapes positional restrictions and patterns like a V when negated. Recall that for Rivero (1991), Czech T takes NegP as complement, and Neg takes VP as complement: (54)
[TP T [NegP Neg [VP V]]]
The ordinary V amalgamates with ne, and the complex raises to T, resulting in ne+V in (53). The future Aux is lexical, generated under V with a VP-complement (not represented), so it amalgamates with ne to form the complex that raises to T, which results in ne+Aux in (52a). On this view, tensed functional Aux are items merged in T; they take a VP-complement headed by V, and disallow V-raising to T. No V or V-like Aux raises to the functional Aux from the VP, so no H-checking domain for T is established. Thus, this case involves the same H-Internal Domain Condition as Breton: Aux must be the head of a complement of a C visible in PF. The Czech perfect Aux in (54) is generated under T with NegP as complement, and is positionally restricted, appearing only in an internal domain. In root clauses, fronting of a constituent must apply so that T can be licensed, as in (51a) and (51c). In sum, Slavic and Breton root sentences in the perfect contain a T that must be satisfied via the H-Internal Domain Condition, so they look similar. However, Breton and Slavic root patterns with Verbs and Modals differ, since these fall under the H-Internal Domain Condition in Breton, and under the H-Checking Domain Condition with raising to T in Slavic.

4.2. Parametric Variation in Slavic: Polish

We saw above that the two options for licensing T in PF are the source of the parametric variation that distinguishes Breton from the LVM Slavic languages as to verbs and auxiliaries. The same dichotomy is the source of the parametric variation that differentiates Polish from the LVM languages of the Slavic family and from Breton at the same time. To see how this works, first recall that Borsley and Rivero (1994) argue that Polish lacks LVM and displays raising of a nonfinite V to a finite Aux, resulting in a morphological complex. That is, Polish has Incorporation as in Baker (1988).
What I suggest here is that this process is "stylistic" and thus applies to establish an H-Checking Domain to license T in PF, and not to check features, which is why it can be optional in some varieties of this language. In the other Slavic languages examined above and in Breton, Incorporation is unavailable as a licensing option in PF, which leads to a type of parametric variation attributable to the functional category T. Consider here sentences with perfect auxiliaries as in (55), where Polish looks identical to Bulgarian, Czech, and Breton. A functional Aux when finite cannot head a projection that is not a complement, and Wh-movement to Spec-of-C licenses the structure in PF. In other words, the Head-Complement structure is used to license T in PF in a way that is reminiscent of the LVM languages. (55) a.
Kiedy-s widzial ten film? Polish
     when-have+2S seen this film?
     'When have you seen this film?'
b. [CP kiedyi [c' [C° 0] [TP [T° s] [VP widzial ten film] ti]]]
c. *S widzial ten film.
     have+2S seen this film
     '*You have seen this film.'
d. *[TP [T° s] [VP widzial ten film]]
In addition, certain varieties of Polish have an optional incorporation process, with the Participle moving to the tensed Aux, as in (56); this process also serves to license T in PF, as in (56c-d). This is where the differences with the LVM Slavic languages arise.

(56) a. Kiedy widzial-es ten film?
     when seen-have+2S this film
     b. [CP kiedyi [c' [C° 0] [TP [T° widzialk-es] [VP tk ten film] ti]]]
     c. Widzial-es ten film.
     seen-have+2S this film
     'You have seen this film.'
     d. [TP [T° widzialk-es] [VP tk ten film]]

If we think of the T-node as the locus for a Tense affix, as in Chomsky (1981), the ordinary inflected V results from V-raising to the affix in T, reminiscent of English I am (not) in Paris. In such a situation, V is usually precluded from moving to a T that is filled by a lexical item, as in I will (not) be in Paris. From this perspective, Polish is a language that allows V-raising whether T is filled by an affix or by a full word. Thus, the untensed V raises to T if this node contains a functional auxiliary, as in (56). Incorporation leads to a distributional difference with the LVM languages. Recall that sequences formed by wh-phrase(s), a Participle, and a tensed Aux are ungrammatical in Slavic languages with LVM, and in Breton: Bulgarian *Kakvo kupil e? versus Kakvo e kupil? 'What has he bought?'. The reason is that LVM does not co-occur with syntactic movement to Spec-of-C, as the participle will only move to C if the licensing configuration for T has not been established in the syntax. In Polish, by contrast, the assumption is that the Participle raises to T and not to C, which explains why incorporated forms do not display the root
characteristics of LVM constructions, and may appear in all types of embedded clauses, including relatives (Borsley and Rivero, 1994). The next question is why Incorporation applies in Polish. Sequences such as (55a) indicate that in some variants the process need not always apply, which suggests that it is not a checking operation. My proposal thus is that Polish uses the H-Internal Domain and the H-Checking Domain Conditions as two alternative ways to satisfy the PF requirements of T in functional Aux. If V raises to Aux, the process establishes the H-checking domain where T is licensed, as in (56c-d), and this makes Polish contrast with the other Slavic languages and with Breton, which necessarily appeal to the H-Internal Domain Condition in the context of functional Aux. Polish functional Aux that are not targeted by V-incorporation are licensed by the H-Internal Domain Condition, as in (55a), similar to what happens in Breton and the other Slavic languages. As a consequence of this double choice, the Polish perfect Aux is positionally restricted in the absence of incorporation, as in (55c-d) (i.e., it cannot be sentence-initial). In this case, Polish auxiliaries most closely resemble Bulgarian auxiliaries, which cannot be first, but need not be second or adjacent to C. However, if incorporation applies as in (56b), the complex has the properties of the ordinary tensed V in Slavic: it can head a projection that is not a complement, as in (56c-d). Finally, Polish does not differ from other Slavic languages when it comes to true verbs and lexical auxiliaries; these use the checking configuration to license T in PF. On this analysis, Polish uses, under slightly different circumstances, the two PF licensing principles for T that are also found in Breton and the other Slavic languages.
This provides additional support for the idea that the Head-Complement and the Checking configurations are available in Universal Grammar to satisfy PF interface conditions, not just internal conditions. The chart in (57) summarizes the proposal by adding Polish to the chart in (20). (57)
Tense-licensing in PF

                      Checking Domain        Internal Domain
Language              V         Aux          V         Aux
Breton                No        No           Yes       Yes
Slavic LVM lgs        Yes       No           No        Yes
Polish                Yes       Yes          No        Yes
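For readers who want to cross-check the typology against the prose, the chart in (57) can be rendered as a small lookup table. This is a purely illustrative sketch: the language labels and Yes/No values come directly from (57), while the function name and data structure are my own scaffolding, not part of the analysis.

```python
# Illustrative encoding of chart (57): which PF configuration licenses T
# for verbs (V) and functional auxiliaries (Aux) in each language type.
# The True/False values transcribe the Yes/No cells of the chart.
CHART_57 = {
    "Breton": {
        ("checking", "V"): False, ("checking", "Aux"): False,
        ("internal", "V"): True,  ("internal", "Aux"): True,
    },
    "Slavic LVM lgs": {
        ("checking", "V"): True,  ("checking", "Aux"): False,
        ("internal", "V"): False, ("internal", "Aux"): True,
    },
    "Polish": {
        ("checking", "V"): True,  ("checking", "Aux"): True,
        ("internal", "V"): False, ("internal", "Aux"): True,
    },
}

def licenses_T(language: str, configuration: str, category: str) -> bool:
    """True if `configuration` ('checking' or 'internal') licenses T in PF
    for `category` ('V' or 'Aux') in `language`, per chart (57)."""
    return CHART_57[language][(configuration, category)]

# All three language types license functional Aux via the Internal Domain;
# Polish alone adds the Checking Domain for Aux (Incorporation).
assert licenses_T("Breton", "internal", "Aux")
assert not licenses_T("Slavic LVM lgs", "checking", "Aux")
assert licenses_T("Polish", "checking", "Aux")
```

Note that the chart abstracts away from lexical exceptions discussed in the text, such as Breton mont and eman, which license T via the Checking Domain despite the "No" in the Breton V cell.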
5. SUMMARY AND CONCLUSIONS

In LVM-languages, T is a functional category that imposes a PF-interface condition. This condition can be satisfied in two structural ways based on sisterhood. TP can be the sister of the head that licenses its T (i.e., the licensing configuration is the Head-Complement structure). Alternatively, T can be the sister of the licensing head (i.e., the licensing structure is a Checking Configuration). In addition, these languages share a verb-fronting process dubbed LVM, with an output that establishes a Head-Complement structure, which is reminiscent of computational Merge, not Move. This process applies in PF to satisfy the interface condition of T and not to check features: V becomes the sister of TP when it moves to C. Parametric differences between LVM languages result from the interaction of the two structural options to license T in PF and entries in the lexicon. In Breton, T is most often licensed when TP is a sister or complement, but with a few verbs, T is licensed by being itself a sister. A consequence of this is that Breton is a VSO language where V1 patterns are almost nonexistent in root clauses, and second-position restrictions on finite heads are pervasive. In Slavic, the option to license T when TP is a complement/sister is found with the functional auxiliaries traditionally called clitics, while with verbs T is licensed by being itself a sister, so second-position effects on finiteness are less pervasive than in Breton. Besides their interface requirements on T, Slavic and Breton are characterized by an LVM process. This process applies in PF to satisfy T, establishes a Head-Complement configuration reminiscent of Merge in the computation, and does not check formal features. The same PF system to license T serves to distinguish Polish from the LVM languages, including Breton. Polish uses Incorporation of V to Aux to license T in PF, hence a Checking Configuration, which distinguishes it from the other languages. The Head-Complement configuration where TP is the sister of C is also used to license T in Polish. This language lacks LVM, so the Head-Complement option is used not when V is in C but when a phrase is in Spec-of-C.
NOTES

1 The first version of this paper dates from 1993; this updated version owes much to helpful comments by R. D. Borsley, P. Hirschbühler, M. Suñer, A. Terzi, and two anonymous reviewers for the present volume. Unless otherwise indicated, Breton examples are from Borsley, Rivero, and Stephens (1996). I owe thanks to many colleagues and friends for data and discussion through the years: for Breton, R. Borsley and J. Stephens; for Bulgarian, G. Alexandrova, O. Arnaudova, M. Dimitrova-Vulchanova, and E. Savov; for Czech, F. Bakes and J. Sedivy; for Polish, R. Borsley, E. Jaworska, and J. Witkos; for Serbo-Croatian, W. Browne, A. Donskov, D. Stojanovic (formerly Kudra), and L. Progovac; for Slovak, H. Briestenska. I acknowledge support from SSHRCC Grants 410-91-0178 and 410-94-0401 and the Eurotyp Project of the ESF.

2 This process, known as Long Head Movement, first proposed in Rivero (1994a [written
320
Maria-Luisa Rivero
in 1988]), has attracted much attention, leading to a variety of counterproposals whose discussion falls well beyond the scope of this chapter. Rivero (1996) lists some of the existing alternatives, and sketches a critique of (pure) Morphological Merger/Prosodic Inversion analyses, as proposed most notably in Halpern (1992, 1995); the basic idea is that LVM can account for phenomena that fall outside the scope of linear operations like Prosodic Inversion/Morphological Merger (and see note 4). Rivero (in prep) discusses LVM in the context of stylistic rules affecting verbs, which are the hierarchical movements in the PF branch with a Head-Complement or a Checking Configuration output that have in common the application to satisfy interface conditions and not to check formal features.

3 For Schafer (1994), Breton is both an LVM and a V2 language, albeit in different constructions. Schafer views LVM essentially as in Borsley, Rivero, and Stephens (1996), not as a feature-checking operation. By contrast, for Topicalizations, Wh-questions, and negative sentences she proposes a V2 analysis with tensed V-to-(I-to)-C for feature checking. In my analysis, finite Vs and Aux are in T in all constructions, and finite raising to C is not viable; if it applied in the syntax, T would not head a complement at PF, and its interface condition would not be satisfied. On this view, Breton cannot be a V2 language, interpreting V2 to imply finite raising to C for feature checking.

4 In Bulgarian, verb raising in PF applies with li in main and embedded questions, as in (i), (ii), and (iii), which repeats (12b).

(i) Vidjaxme li knigata? (Bulgarian)
    saw+1P Q book+the
    'Did we see the book?'

(ii) Prochel li e knigata?
     read Q have+3S book+the
     'Has he read the book?'

(iii) Ne znam [prochel li e Petur knigata].
      Neg know+1S [read Q have+3S Peter book+the]
      'I do not know whether Peter has read the book.'
In my view, functional li (=Q) is in C and imposes an interface, not a feature-checking requirement: it must have overt material in its checking domain. This PF requirement triggers stylistic V-fronting in both root and nonroot clauses: verbs adjoin to li to license it and thus come to precede it. LVM has the same external distribution in Bulgarian and Breton in yes-no questions, but Breton ha precedes the fronted V as in ha lennet en deus because LVM applies in the CP-complement of ha to satisfy the interface condition of T. That is, T and not Q triggers LVM in Breton in these questions, another parametric contrast between these two languages explored in Rivero (in prep). Rivero (1996) uses embedded interrogatives as in (iii) to argue against Prosodic Inversion/Morphological Merger. Examples of type (iv) from Borsley, Rivero, and Stephens (1996:66) also favor a hierarchical process, and argue against a linear operation. Prosodic Inversion, for instance, inverts a string-initial Aux with a following prosodic word, and wrongly predicts that the auxiliary should follow the first verb in these sequences:

(iv) a. Lennet ha komprenet en deus Yann al levr. (Breton)
        read and understood 3S have Yann the book
        'Yann has read and understood the book.'
Finiteness and Second Position
321
     b. Vidjal i prochel e knigata. (Bulgarian)
        seen and read have+3S book+the
        'He has seen and read the book.'
REFERENCES Anderson, S. (1982). Where's morphology? Linguistic Inquiry, 13, 571-612. Anderson, S., and Chung, S. (1977). On grammatical relations and clause structure in verb-initial languages. In P. Cole and J. Sadock (Eds.), Syntax and semantics 8: Grammatical relations (pp. 1-26). New York: Academic Press. Baker, M. C. (1988). Incorporation. Chicago, IL: University of Chicago Press. den Besten, H. (1977). On the interaction of Root transformations and Lexical Deletive rules. Published in 1983 in W. Abraham (Ed.), On the formal syntax of the Westgermania (pp. 47-131). Amsterdam: John Benjamins. Beukema, F., and Coopmans, P. (1989). A Government-Binding perspective on the imperative in English. Journal of Linguistics, 25, 417-436. Borsley, R. D., and Rivero, M. L. (1994). Clitic auxiliaries and incorporation in Polish. Natural Language and Linguistic Theory, 12, 373-422. Borsley, R., Rivero, M. L., and Stephens, J. (1996). Long Head Movement in Breton. In R. Borsley and I. Roberts (Eds.), The syntax of the Celtic languages (pp. 53-74). Cambridge: Cambridge University Press. Borsley, R. D., and Stephens, J. (1989). Agreement and the position of subjects in Breton. Natural Language and Linguistic Theory, 7, 407-428. Branigan, P. (1996). Verb-second and the A-bar syntax of subjects. Studia Linguistica, 50, 50-79. Chomsky, N. (1977). On Wh-movement. In P. Culicover et al. (Eds.), Formal syntax (pp. 71-132). New York: Holt, Rinehart and Winston. Chomsky, N. (1981). Lectures on Government and Binding. Dordrecht: Foris. Chomsky, N. (1991). Some notes on economy of derivations and representations. In R. Freidin (Ed.), Principles and parameters in comparative grammar (pp. 417-454). Cambridge, MA: MIT Press. Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: MIT Press. Chomsky, N., and Lasnik, H. (1977). Filters and control. Linguistic Inquiry, 8, 425-504. Chung, S., and McCloskey, J. (1987). Government, barriers and small clauses in modern Irish.
Linguistic Inquiry, 18, 173-237. Desbordes, Y. (1983). Petite grammaire du Breton moderne. Mouladuriou hor Yezh. Diesing, M. (1990). Verb movement and the subject position in Yiddish. Natural Language and Linguistic Theory, 8, 41-79. Epstein, S. D. (1992). Derivational constraints on A'-chain formation. Linguistic Inquiry, 23, 235-261. Grimshaw, J. (1991). Extended projections. Unpublished manuscript, Brandeis University. Halpern, A. (1992). Topics in the placement and morphology of Clitics. Ph.D. Dissertation, Stanford University. Revised version published as (Halpern 1995).
Halpern, A. (1995). On the placement and morphology of Clitics. Stanford, CA: CSLI Publications. Hendrick, R. (1988). Anaphora in Celtic and Universal Grammar. Dordrecht: Kluwer. Hendrick, R. (1991). The morphosyntax of aspect. Lingua, 85, 171-210. Iatridou, S., and Kroch, A. (1992). The licensing of CP-recursion and its relevance to the Germanic verb second phenomenon. Working Papers in Scandinavian Syntax, 50, 1-24. Lasnik, H., and Saito, M. (1992). Move alpha. Conditions on its application and output. Cambridge, MA: MIT Press. Lema, J., and Rivero, M. L. (1990). Long Head Movement: ECP vs. HMC. NELS 20, 333-347. GLSA, University of Massachusetts, Amherst, MA. Lema, J., and Rivero, M. L. (1991). Types of verbal movement in Old Spanish: Modals, futures, and perfects. Probus, 3, 237-278. Reinhart, T. (1995). Interface strategies. OTS Working Paper, Utrecht. Rivero, M. L. (1991). Long Head Movement and negation: Serbo-Croatian vs. Slovak and Czech. The Linguistic Review, 8, 319-351. Rivero, M. L. (1993a). Long Head Movement vs. V2, and Null Subjects in Old Romance. Lingua, 89, 113-141. Rivero, M. L. (1993b). Bulgarian and Serbo-Croatian yes-no questions: V0-raising to li vs. li-Hopping. Linguistic Inquiry, 24, 567-575. Rivero, M. L. (1994a). Clause structure and V-movement in the languages of the Balkans. Natural Language and Linguistic Theory, 12, 63-120. Rivero, M. L. (1994b). Auxiliares funcionales y auxiliares léxicos. In V. Demonte (Ed.), Gramática del Español (pp. 107-138). Publicaciones de la Nueva Revista de Filología Hispánica VI. CELL, El Colegio de México, Mexico. Rivero, M. L. (1996). Verb Movement and Economy: Last Resort. Papers from the First Conference on Formal Approaches to South Slavic Languages, Plovdiv, October 1995. University of Trondheim Working Papers in Linguistics 28, 211-228. Revised version to appear in Benjamins volume. Rivero, M. L. (in prep). Verb syntax and interface conditions. New York: Oxford University Press. Rouveret, A. (1991). Functional categories and agreement. The Linguistic Review, 8, 353-387. Rudin, C. (1986). Aspects of Bulgarian syntax: Complementizers and Wh constructions. Columbus, OH: Slavica Publishers. Rudin, C. (1988). On multiple questions and multiple Wh-fronting. Natural Language and Linguistic Theory, 6, 445-501. Schafer, R. J. (1992). Negation and verb second in Breton. Working Paper 92-02, Syntax Research Center, University of California, Santa Cruz. Revised version published in NLLT in 1994. Schafer, R. J. (1994). Nonfinite predicate initial constructions in Breton. Ph.D. Dissertation, University of California, Santa Cruz. Stephens, J. (1982). Word order in Breton. Ph.D. Dissertation, University of London. Stump, G. (1984). Agreement vs. Incorporation in Breton. Natural Language and Linguistic Theory, 2, 289-348.
Stump, G. (1989). Further remarks on Breton Agreement. Natural Language and Linguistic Theory, 7, 429-472. Vikner, S. (1991). Verb movement and the licensing of NP-positions in the Germanic languages. Unpublished manuscript, University of Stuttgart. Published by Oxford University Press in 1995. Wojcik, R. (1976). Verb-fronting and auxiliary do in Breton. NELS, 6, 259-278. Zanuttini, R. (1991). Syntactic properties of sentential negation: A comparative study of Romance languages. Ph.D. Dissertation, University of Pennsylvania. Zubizarreta, M. L. (1995). Prosody, focus, and word order. Ms., University of Southern California. Revised version published by MIT Press.
FRENCH WORD ORDER AND LEXICAL WEIGHT

ANNE ABEILLÉ* and DANIÈLE GODARD†
*IUF, Université Paris 7, UFRL, Paris, France
†CNRS, Université Lille 3, Villeneuve d'Ascq, France
1. INTRODUCTION1

As usual with complex phenomena, progress in the comprehension of word order can only be made by isolating and studying each factor in turn. We concentrate our attention here on the syntactic constraints governing the order of complements and adjuncts in French, leaving aside discursive, pragmatic, and stylistic factors. Accordingly, the grammatical judgments we provide are to be taken with an unmarked intonation, some of the sentences given as ungrammatical here being acceptable with a special prosodic pattern. The study of word order requires great attention to the detail of the data. Nevertheless, we think it possible to arrive at generalizations that are both empirically accurate and theoretically interesting. Recently, the relation between constituency and word order has been taken up with two questions: can word order be reduced to the hierarchical structure (Kayne, 1994; Cinque, 1977), or does it constitute a separate component (Gazdar et al., 1985; Pollard and Sag, 1987)? And, in the second case, do the constituency and the ordering domains coincide, or does word order have a domain of its own, and, if so, how is it related to constituency (Reape, 1994; Kathol, 1995)? The word-order facts we look at are not readily amenable to structural distinctions, and point to the existence of a separate word-order component,
but do not seriously challenge the view that the constituency and the word-order domain coincide. Our main finding is classificatory: we bring to light a new syntactic factor that plays a role in word order, building on suggestions in Sadler and Arnold (1993, 1994) for the English NP, and Sells (1994) for certain Korean facts. We show that certain constituents, which consist of a word, obey much stricter constraints than their phrasal counterparts or other such constituents. Roughly, they must occur first in the phrase or adjacent to the head. This suggests a weight constraint symmetrical to the well-known heaviness constraint, which tends to order heavy elements last in their domain. Leaving heavy constituents aside, we contrast "light" constituents with ordinary "middle-weight" ones, using a two-value (lite vs. nonlite) feature WEIGHT, which characterizes both lexical items (they can be lite, nonlite, or unspecified) and phrases (usually nonlite). Adopting the Head-driven Phrase Structure Grammar framework (HPSG, Pollard and Sag, 1987, 1994), we formalize order rules as constraints on the daughters in a phrasal type.2 In this framework, we build on our empirical findings to propose a mixed theory of word order, which results from the interplay of the grammatical function and the weight of the daughters. We begin with an examination of the order of complements in the VP, showing a systematic difference between bare complements and the others (section 2), which we describe using the WEIGHT (WGT) feature in conjunction with phrasal constraints and LP rules for French (section 3). We then apply the theory to account for the position of adjectives in the NP (section 4). Finally, we go back to the adverbs in the VP, to give a fuller account of ordering in French (section 5).
2. THE ORDER OF COMPLEMENTS IN THE VP

We contrast phrasal complements (which we call "nonlite," anticipating the weight feature), which occur freely to the right of the head in French, with bare complements (called "lite"), which must precede phrasal complements and are strictly ordered among themselves.

2.1. Free Order among Phrasal Complements

As has often been observed, complements in French are not ordered with respect to one another (leaving discursive factors aside).3 An indirect object may precede or follow a direct object (1); a predicative adjective may precede or follow a direct object (2):

(1) Paul donne un livre à son fils / donne à son fils un livre.
    'Paul gives a book to his son.'
French Word Order and Lexical Weight
327
(2) Cette musique rend mon fils fou de joie / rend fou de joie mon fils.
    'This music makes my son really happy.' (lit: crazy of joy)

Similarly, a sentential or infinitival complement may precede or follow a nominal complement, with some preference for the second position due to heaviness:

(3) Paul dit à Marie de venir / dit de venir à Marie.
    'Paul says to Marie to come.'

(4) Paul dit à sa fille qu'il fait beau / dit qu'il fait beau à sa fille.
    'Paul tells his daughter that it is nice weather.'

The same mobility can be observed with complements of nouns:

(5) La destruction de Rome par les Barbares / par les Barbares de Rome.
    'The destruction of Rome by the Barbarians.'

(6) La volonté de lutter de Jean / de Jean de lutter.
    'Jean's wish to fight.' (lit: The wish of fight(ing) of Jean)

2.2. Lite Complements before Nonlite Complements

Bare proper names and predicative adjectives have the same mobility as phrasal complements:

(7) a. Paul présente Géraldine à son fils / à son fils Géraldine.
       'Paul introduces Geraldine to his son.'
    b. Cette musique rend mon fils fou / rend fou mon fils.
       'This music makes my son really happy.'

On the other hand, bare common nouns exhibit ordering constraints not observed with phrasal complements. First, they precede phrasal complements. Light verbs provide numerous instances of bare nominal complements, which invariably occur immediately after the verbal head:

(8) a. La course donne soif à Jean / *donne à Jean soif.
       'The race makes Jean thirsty.' (lit: gives thirst to Jean)
    b. Ce livre fait plaisir à Marie / *fait à Marie plaisir.
       'This book gives pleasure to Marie.' (lit: makes pleasure to Marie)

However, when the same N has a complement or a determiner, it becomes as free as a phrasal complement:

(9) a. La course donne une grande soif à Jean / donne à Jean une grande soif.
       'The race makes Jean very thirsty.' (lit: gives a great thirst to Jean)
    b. Ce livre fait le plaisir de sa vie à Marie / fait à Marie le plaisir de sa vie.
       'This book gives the pleasure of her life to Marie.'
328
Anne Abeille and Daniele Godard
Modification by an adverb or conjunction of these N has a similar effect:4

(10) La course donne [vraiment soif] à Jean / donne à Jean [vraiment soif].
     'The race makes Jean really thirsty.' (lit: gives really thirst to Jean)

(11) La marche donnera [faim ou soif] à Marie / ?donnera à Marie [faim ou soif].
     'A walk will make Marie hungry or thirsty.' (lit: will-give hunger or thirst to Marie)

(12) La vitesse fait [peur et plaisir] à Marie / fait à Marie [peur et plaisir].
     'Speed gives fear and pleasure to Marie.'

The same observation extends to another case of bare complements, the past participle in tense auxiliary constructions and the infinitive in causative constructions. We analyze tense auxiliaries and faire as the head of a flat VP, which takes as complements the participle or the infinitive and its complements (cf. Abeillé et al., 1997). The tree structure representations of these constructions are given in (13), where the function of the daughters is represented as an annotation on the branches:

(13) [tree structures for the auxiliary and causative constructions; figure not reproduced]
In this analysis, the auxiliary or the causative faire is the morphosyntactic head (H) of the construction, which inherits all the complements (C) of the bare participle or infinitive. Like other bare complements, it must precede all other nonlite complements.

(14) a. Paul a acheté des pommes / *a des pommes acheté.
        'Paul has bought apples.'
     b. Cette musique fait pleurer mon fils / *fait mon fils pleurer.
        'This music makes my son cry.' (lit: makes cry my son)

However, unlike the N in light verb constructions, these verbal complements must precede the other complements even when modified or conjoined:

(15) a. Paul a [acheté et mangé] des pommes / *a des pommes [acheté et mangé].
        'Paul has bought and eaten apples.'
     b. Paul fait [beaucoup rire] son fils / *fait son fils [beaucoup rire].
        'Paul makes his son laugh a lot.'

As explained below, this difference between N and V does not depend on the category but on the requirement made by the predicate of which they are a complement. To account for the difference between (10)-(12) and (15), and in light of additional data on adjectives and adverbs (see sections 4 and 5), we will analyze
coordination or modification of lite categories as potentially ambiguous between lite and nonlite.

2.3. Rigid Ordering of Lite Complements

Unlike phrases, the bare complements mentioned above are rigidly ordered in the following way (leaving bare quantifiers aside):5

(16) Head < Past Part < V[inf] < Bare Noun

The past participle must precede the other lite complements. It precedes the bare N in (17) and the bare V[inf] in (18):

(17) La course a donné soif à Marie / *La course a soif donné à Marie.
     'The race has made Marie thirsty.' (lit: has given thirst to Marie)

(18) Paul a fait tomber le vase / *a tomber fait le vase.
     'Paul made the vase fall.' (lit: has made fall the vase)

Similarly, the V[inf] precedes the lite nominal complements:

(19) Le Président fera rendre hommage aux victimes / *fera hommage rendre aux victimes.
     'The President will make one pay tribute to the victims.' (lit: will-make pay tribute to the victims)
3. A FEATURE-BASED TREATMENT

Before presenting our analysis with the feature WEIGHT, we briefly show why alternative analyses, based on morphological incorporation or on syntactic distinctions completely independent from word-order properties, are inappropriate.

3.1. Alternative Analyses

The existence of bare complements has seldom been recognized as a syntactically relevant phenomenon. Some analyses have been proposed to deal with them in the morphology. Auxiliary constructions are traditionally handled in the same chapter as verbal inflection in descriptive or school grammars (e.g., Bescherelle, 1980; Grevisse and Goosse, 1988); there are also attempts to account for the position of the infinitive in causative constructions by postulating morphologically complex predicates (e.g., Zubizarreta, 1985). However, a morphologically based solution is not consistent with the data, because adverbs and PPs, which do not belong to the same word as the verbal head, can always occur between the head and the bare complements:6

(20) a. Paul a évidemment acheté des pommes.
        'Paul has of course bought apples.'
     b. La musique fait depuis toujours pleurer mon fils.
        'Music always makes my son cry a lot.' (lit: music makes always a lot cry my son)
     c. Le livre fera sans doute plaisir à Marie.
        'The book will no doubt give pleasure to Marie.'

If the past participle, the infinitive, or the bare noun in (20) were part of the same word as the head V, so would the adverb; it is not clear how such a proposal could be justified. As an alternative, one might think that some categories are adjoined to the V rather than attached at the same level as the regular complements. But many light V constructions (faire plaisir and faire un grand plaisir, rendre hommage and rendre un vibrant hommage, avoir faim and avoir une faim de loup) do not specify whether the complement is a bare N or an NP (with a determiner). The complementation of such light Vs would be radically different depending on whether the N has or doesn't have a determiner. While not impossible, this structural difference would require independent justification.7 Another hypothesis is to use categorial distinctions. Distinguishing between V and VP (or S) complements could account for the contrast between (14) and (4)-(6). One could simply say that V complements, but not VP (or S) complements, must precede other complements. But a similar distinction is more problematic for nominal complements. One could contrast bare nouns as NPs with "maximal" nominal phrases or proper names as DPs, only the second being referential (e.g., Abney, 1987; Longobardi, 1994), and we would simply say that NPs must precede DPs in French. But if bare soif is an NP, one cannot see how the adjunction of an adverbial modifier (vraiment soif) would turn it into a DP; analogously, it is difficult to have coordinations of nominal complements ("NPs") such as (11)-(12) recategorized as DPs. A category-based account will have even more difficulty accounting for the potentially ambiguous behavior of certain modified or conjoined phrases (see sections 4 and 5).
A feature-based account seems more appropriate for this kind of underspecification. Another categorial distinction would make use of bar-level distinctions. This is Sells's proposal to account for similar word-order restrictions in Korean, where certain bare complements and adverbs resist scrambling and must immediately precede the head (Sells, 1994). Assuming a binary phrase structure for Korean, Sells contrasts X0 categories, which must combine with an X0 head, with X1 and X2 categories that can combine with an X1 head; only the phrases with an X1 head can scramble. The analysis can be summarized as follows: (i) words, rather than maximal projections only, can be complements or modifiers; (ii) certain words, but not all, are prevented from projecting X1 or X2 phrases by themselves;
(iii) certain syntactic phrases must be defined as X0 categories (the negation-verb syntactic combination, for instance), while others are X1 or X2.

The effects of this proposal are very similar to what we also want for the French data. However, we find that X-bar theory is not the most appropriate tool. Proposals (ii) and (iii) represent a real difficulty for a bar-level representation, particularly when adverbs are taken into account. The word-order phenomena under investigation reflect properties of the lexical items; because they cannot be reduced to valence requirements, and because the combinatorics is not different when the phrase behaves ambiguously and when it behaves only as a usual maximal projection, a bar-level distinction is not appropriate. Anticipating the following discussion, certain adverbs in the VP must be adjacent to the lexical head, like common nouns, and others are more mobile, like proper names or maximal projections (see section 5). While we might associate the difference between common nouns and proper names with the fact that the second but not the first is valence saturated, this does not make sense with adverbs. Turning to the ambiguous phrases (vraiment soif, faim et soif), it is impossible to get both X0 and X1 or X2. Again, as soon as the need for underspecification and sharing of values is recognized, a feature-based approach is more appropriate than one based on distinct categories. Bratt (1990) uses two features to get three levels of structure. Analyzing the sequence made of a causative verb and its infinitival verb complement in French (faire rire in Paul fait rire son fils 'Paul makes his son laugh,' lit: makes laugh his son) as a verbal complex, she notes this category with the two features [LEX±] and [PHRAS±].8 While a word usually is [LEX+, PHRAS-] and a (usual) phrase [LEX-, PHRAS+], this complex is [LEX-, PHRAS-].
Proper names could be specified as [LEX+, PHRAS+] in the lexicon (her suggestion); our problematic combinations (vraiment soif, faim et soif) could then be underspecified ([LEX-, PHRAS±]), and the ordering constraints would say that [PHRAS-] constituents come before [PHRAS+] constituents. However, the empirical justification for PHRAS is not clear, as soon as some words (proper names, but also most adverbs) have to be [PHRAS+], while some syntactic combinations are [PHRAS-], and others would have to be ambiguous. We conclude that, in the same way as word-order phenomena are not reducible to matters of constituency, the appropriate notation for them requires the use of a feature that is not reducible to other independent features.

3.2. The Feature WEIGHT

In a way analogous to the heaviness constraint (which says that heavy phrases tend to come last, cf. Wasow, 1996), we propose that a constraint holds for lightweight words or phrases, which tend to come first in the phrase (just before or just
after the lexical head). We call them "lite" to make the point that lite is not just the contrary of heavy, the usual phrases being in fact "middle-weight." Lite constituents cluster with the head V. Ignoring heaviness phenomena here, we speak of a contrast between lite and nonlite constituents. The feature WEIGHT, present both in the lexicon and in phrases, aims at capturing a general theory of word order. First, not all lexical items have the same weight value: they may be [WGT lite], [WGT nonlite], or unspecified (with a general constraint that words are not heavy). Thus, we distinguish between common nouns, which are lite, and proper names, which are nonlite. Usually, predicates require their arguments to be nonlite; however, light verbs may allow (or require) that they be lite or unspecified. Second, while most phrases are nonlite, we allow certain phrases to be lite, such as acheté et lu in (15a) or (23):

(23) a. Paul a acheté et lu La Recherche.
     b. *Paul a La Recherche acheté et lu.
        'Paul has bought and read the Recherche.'

In (23) the coordination of participles is lite, because tense auxiliaries obligatorily take a lite V complement, that is, a participle which is unsaturated for all of its subcategorized complements. This subcategorization is represented in (24) as the value of the syntactic attribute ARG-ST, whose first element corresponds to the subject and the others to complements; the identity of the integers means identity of the value for the lists (which is left unspecified), and ⊕ is the concatenation of lists (Abeillé and Godard, 1994; Abeillé et al., 1997):
(24) avoir: [ARG-ST <[1], V[WGT lite, ARG-ST <[1]> ⊕ [2]]> ⊕ [2]]
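The argument composition at work here, where the auxiliary inherits the unsaturated complement list of its lite participle, can be sketched as plain list concatenation. This is a minimal illustration under the flat-VP analysis, not the authors' feature-structure formalism; the function name `avoir_arg_st` and the dictionary encoding are hypothetical.

```python
# Hypothetical sketch of argument composition for the auxiliary 'avoir':
# ARG-ST <[1], V[WGT lite, ARG-ST <[1]> + [2]]> + [2].
# The participle is encoded as a dict with its weight and its own ARG-ST list
# (subject first, then the complements it subcategorizes for).
def avoir_arg_st(participle):
    # Tense auxiliaries obligatorily take a lite V complement (cf. (23b)).
    assert participle["weight"] == "lite"
    subject, *comps = participle["arg_st"]
    # Shared subject [1], the lite participle itself, then the inherited
    # complement list [2] of the participle.
    return [subject, participle] + comps

# 'Paul a acheté des pommes': the auxiliary inherits the object 'des pommes'.
achete = {"weight": "lite", "arg_st": ["NP[Paul]", "NP[des pommes]"]}
arg_st = avoir_arg_st(achete)
```

Here `arg_st` lists the subject, the participle, and the inherited object, which is why the object surfaces as a sister of the auxiliary in the flat VP.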
The first complement of the auxiliary is the lite participle, and the second is identified with the list of complements that this participle itself subcategorizes for. Accordingly, the conjunction acheté et lu must be lite when it is a complement of the auxiliary. Sentence (23b) is out because the [WGT lite] constraint on the coordination of past participles conflicts with the constraint that orders lite complements before nonlite ones.9 The question that must be raised, then, is whether we can or should dispense with head-only phrases. Given that the occurrence of lite and nonlite arguments depends on the subcategorization of predicates, which does not say whether they are words or phrases, do we need to build, or do we have arguments against building, a head-only phrase? It turns out that we can dispense with head-only phrases, at least regarding the data under consideration here. Since the weight distinction is what counts for subcategorization as well as word order, we get the right results if we accept combining words in the syntax. On the other hand, we have no argument which shows the head-only phrase to be inconsistent with our findings. The head-only phrase can give the right results if its description is identical to that of the head, in particular regarding weight and valence, and if syntax combines only phrases. In this chapter, we will explore a representation that does not use head-
only phrases, in order to keep constituency as simple as possible. The reader should keep in mind that this is a matter of representation, and can replace our representation combining words by head-only phrases, if it suits his or her taste better.

3.3. Liteness in Phrasal Descriptions

The basic idea of the HPSG representation of linguistic expressions, or signs, is that all signs can be classified into types (noted in italics), which are associated with feature structures meeting certain constraints (Pollard and Sag, 1987; Sag, 1997). Signs divide into words (the unit for syntax) and complex constituents (phrases), which have daughters (hence the attribute DTRS). We examine here the consequences of the proposed WEIGHT feature for the representation of the relevant constituents. Let us first present the organization of phrases we assume:

(25) [hierarchy of phrase types; figure not reproduced]
This hierarchy is identical to that in Sag (1997), except for the hd-marker-phrase and the hd-adj-comp-phrase, which we propose for French, containing the complements and the adjuncts at the same time.10 As regards weight, we propose a general constraint, such that all head-nexus-phrases are nonlite:

(26) head-nexus-phrase → [WEIGHT nonlite]
In order to account for lite phrases, illustrated in (15a) and (23) by the coordination of participle complements, we propose the following constraints on head-adjunct phrases and coordinated phrases:11

(27) [weight constraints on head-adjunct and coordinated phrases; figure not reproduced]

Constraints (27a) and (27b) allow such phrases to be lite if all the daughters are lite. The daughters are not required to have the same weight ([1], [2], [n] may
be different); however, the values can unify only if they are identical. Accordingly, the first disjunct in the value for the phrase is equivalent to lite if the daughters are all lite, and to nonlite if they are all nonlite; since unification fails if the daughters do not have the same weight value, the value for the phrase in this case is given by the second disjunct (nonlite). Because both words and phrases can be lite or nonlite, the introduction of the WEIGHT feature leads to a more complex classification of signs, cross-classifying them for the two dimensions of weight and phrasality:

(28) The hierarchy of signs and the feature WEIGHT [figure not reproduced]
In the lexicon, nouns and adverbs are unambiguously specified as lite or nonlite: all proper names are nonlite, and all common nouns lite in French. Most adverbs are nonlite, while some are lite (see section 5). Verbs and adjectives can be underspecified for weight: most verbs are underspecified and can behave either as lite or nonlite. Adjectives may be lite, nonlite, or underspecified, depending on their pre- or postnominal position in the NP (see section 4). Words that are underspecified for weight are specified in context, given the constraint on weight in the phrase in which they appear. As an example, we represent in (29a) the analysis of the sentence Paul viendra according to the hypotheses presented in this section, and we give in (29b) the description of the head-subject phrase to which it corresponds:12
French Word Order and Lexical Weight
Although somewhat unusual in phrase structure frameworks, the representation in (29b) is perfectly in keeping with the formal apparatus for categories in HPSG. The notation "VP" has no theoretical status in this framework; it is an abbreviation for a phrasal constituent whose lexical head is a V, which is (normally) saturated for its complements but is missing a subject. Similarly, an "NP" abbreviates a phrase whose head is a lexical N and which is saturated for its complements and specifier; it is nonlite and also "maximal," to use the usual parlance, while the VP is not maximal, since the verb is considered the head of the sentence. Thus, if one does not want to use head-only phrases, the only phrase in the sentence Paul viendra is a hd-subj-phrase. There is no VP, because the verb has no complement and we have no head-only schema. There is no NP either, since the subject is a proper name. Both the subject and the head are nonlite words: the verb viendra is nonlite because most V's are lexically underspecified for weight and the constraint on hd-subj-phrases requires the head to be nonlite. The subject daughter is not so constrained and can be lite (as in Hommage sera rendu aux victimes 'tribute will be paid to the victims'); it is nonlite in (29b) because proper names are lexically nonlite.

Two Linear Precedence constraints are associated with phrasal descriptions; they make use of the function of the daughters and are independent of weight ('<' means 'precedes'):

(30) a. hd-nexus-ph: Non-Hd-Dtrs / < Head-Dtr
b. hd-comp-adj-ph: Head-Dtr < Non-Hd-Dtrs
Constraint (30a) states a default relation (noted /) on head-nexus phrases; it orders markers, fillers, specifiers, and subjects (all "nonhead daughters") before the head. Constraint (30b) is more specific and overrides the default; it orders complements and adjuncts (as "nonhead daughters") after the head in head-complement-adjunct phrases. Premodifiers only occur in the head-adjunct phrase [see (66)]. We now turn to the weight feature, which plays a role in ordering nonhead daughters among themselves.

3.4. Weight and the Order of Complements in the VP

We are now in a position to state the Linear Precedence constraints responsible for the generalizations concerning French word order that we have uncovered:
(31) Generalizations concerning word order
(i) There is free order among nonlite complements.
(ii) Lite complements precede nonlite complements.
(iii) Lite complements are ordered among themselves.

No LP rules are needed to account for (i): only constraints on order have to be specified, not freedom of order. The LP constraints responsible for generalizations (ii) and (iii) are given in (32):

(32) hd-comp-adj-ph (preliminary version)
a. [lite] < [nonlite]
b. [COMPS <..., [1], ...>] < [1]
These constraints are added to (30b). Constraint (32a) orders lite complements (or adjuncts) before nonlite complements (or adjuncts), as well as the lite head before the nonlite nonhead daughters. Constraint (32b) says that a daughter must follow the daughter that subcategorizes for it, whether the latter is the head or a complement. Thus the nominal complements (lite or nonlite) of the past participle in compound tense constructions must follow the past participle, which is itself a complement and which subcategorizes for them. Let us now see how these rules give the right results for the data mentioned in section 2, beginning with the bare complements, which are [WGT lite]. Note that such complements, which are freely allowed by phrasal descriptions, are constrained by the specific verb requirements: the verb itself says whether it can take [WGT lite] complements, or whether they must be [WGT nonlite] (the default case). In (33), where H stands for Head and C for Complement, the lite complement soif precedes the nonlite one à Marie, as required by (32a), and the lite complement soif follows the lite complement donné, which subcategorizes for it, as required by (32b):

(33) a. a in (33c,d,e): [COMPS <donné[lite], soif[lite], à Marie[nonlite]>]
b. donné in (33c,d,e): [COMPS <soif[lite], à Marie[nonlite]>]
c. la course a donné soif à Marie
H[lite] C[lite] C[lite] C[nonlite]
d. * la course a donné à Marie soif
H[lite] C[lite] C[nonlite] C[lite]
e. * la course a soif donné à Marie
H[lite] C[lite] C[lite] C[nonlite]
The tense auxiliary avoir is the head and inherits the complements of the participle, as seen in (24), instantiated in (33): it must precede the participle and the other complements according to (30b). Both nominal complements are on the complement list of the past participle and must follow it according to (32b). A similar situation holds for causative constructions (Abeille et al., 1997): the
causative head takes as complements the lexical infinitive, the causee, as well as the complements subcategorized by the infinitive (34a);13 the data in (34c) receive a parallel explanation, given the description of fait in (34b):

(34) a. faire: [COMPS <V[inf, COMPS [1]], Xi> ⊕ [1]]
b. fait in (34c): [COMPS <V[inf, lite], NP[à, nonlite], N[lite]>]
c. Le bruit fait avoir peur à mon fils / * fait avoir à mon fils peur.
'The noise frightens my son.' (lit: makes have fright to my son)
d. Paul fait laver le chien à Marie / fait laver à Marie le chien.
'Paul makes Marie wash the dog.' (lit: makes wash the dog to Marie / to Marie the dog)

The N peur, being lite, must occur before the nonlite complement à mon fils in (34c), while the two nonlite complements in (34d) are unordered with respect to each other. We now turn to those phrases which behave as lite or nonlite. In (35a), the complement vraiment soif, where the bare N soif is modified by a lite adverb, occurs before the nonlite complement, and can be either lite or nonlite; in (35b), it must be nonlite since it follows the nonlite complement à Paul.

(35) a. la course donne vraiment soif à Paul
H[lite] C C[nonlite]
b. la course donne à Paul vraiment soif
H[lite] C[nonlite] C[nonlite]
'Running makes Paul thirsty.' (lit: gives Paul (really) thirst)

The light verb donner selects for a nominal complement without specifying its WGT feature, thus allowing both the lite complements soif or vraiment soif and the nonlite complements une grande soif or, again, vraiment soif. We find a similar situation with conjoined lexical complements: faim et soif can be either lite or nonlite in la course donne faim et soif à Paul ('Running makes Paul hungry and thirsty', lit: gives hunger and thirst) but must be nonlite in la course donne à Paul faim et soif, since the sequence follows the nonlite complement à Paul. The coordination of lite verbal complements is also underspecified for WGT.
But as a verbal complement of the tense auxiliary, or of the causative verb, it is contextually constrained to be lite [see (24)]. (36b) is excluded by constraints (32a) and (32b). Note that the environments to which the constraints apply may overlap.

(36) a. Paul a acheté et lu ton dernier livre
H[lite] C[lite] C[nonlite]
b. * Paul a ton dernier livre acheté et lu
H[lite] C[nonlite] C[lite]
'Paul has bought and read your latest book.'
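The interaction of the function-based constraint (30b) with the weight and subcategorization constraints (32a,b) can be checked mechanically over daughter sequences. The sketch below is our own encoding (the daughter tuples and all names are illustrative assumptions, not the authors' notation); it reproduces the judgments in (33).

```python
def lp_ok(daughters):
    """Check the preliminary LP constraints (30b), (32a), (32b).

    daughters: surface-ordered list of (name, function, weight, subcat),
    where function is "H" or "C", weight is "lite"/"nonlite", and subcat
    lists the names of sisters this daughter subcategorizes for.
    """
    for i, (n1, f1, w1, sub1) in enumerate(daughters):
        for j, (n2, f2, w2, sub2) in enumerate(daughters):
            if j <= i:
                continue
            # (30b): the head precedes all nonhead daughters.
            if f2 == "H" and f1 != "H":
                return False
            # (32a): lite daughters precede nonlite ones.
            if w1 == "nonlite" and w2 == "lite":
                return False
            # (32b): a subcategorized daughter follows its subcategorizer.
            if n1 in sub2:
                return False
    return True

# (33c): la course a donné soif à Marie -- well-formed.
ok = lp_ok([
    ("a", "H", "lite", ["donne", "soif", "a-Marie"]),
    ("donne", "C", "lite", ["soif", "a-Marie"]),
    ("soif", "C", "lite", []),
    ("a-Marie", "C", "nonlite", []),
])
# (33e): *la course a soif donné à Marie -- soif precedes donné,
# which subcategorizes for it, violating (32b).
bad = lp_ok([
    ("a", "H", "lite", ["donne", "soif", "a-Marie"]),
    ("soif", "C", "lite", []),
    ("donne", "C", "lite", ["soif", "a-Marie"]),
    ("a-Marie", "C", "nonlite", []),
])
```

The same checker rules out a nonlite complement preceding a lite one, the violation of (32a) seen in (36b).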
4. THE POSITION OF ADJECTIVES IN THE NP

We now apply our approach to the problem of adjunct ordering, considering first the position of adjectives in the NP. We show that an approach in terms of syntactic weight is superior to two analyses proposed for similar phenomena in English.

4.1. A Lexically Constrained Distribution

What determines the relative positions of the modifying adjective and the head N is a long-standing problem in French grammar (e.g., Forsgren, 1978; Wilmet, 1980). While we do not deny the role of stylistic and possibly semantic factors in a full account of the relative order of the A and the N, we choose again to concentrate on purely syntactic constraints. From this point of view, we distinguish three lexical classes of adjectives in French:

4.1.1. PRENOMINAL-ONLY ADJECTIVES

These are certain nonintersecting14 adjectives such as grand1 ('great'), which contrasts with grand2 ('big, tall'), ancien1 ('former'), futur1 ('future', 'to-be'), soi-disant ('would-be'), faux1 ('fake, forged').

(37) a. Un grand homme / Un homme grand.
'A great man' / 'A tall man.'
b. Un faux coupable / * Un coupable faux.
'A fake culprit' / 'A culprit fake.'

4.1.2. POSTNOMINAL-ONLY ADJECTIVES

These are denominal adjectives15 such as français ('French'), présidentiel ('presidential'), régional ('regional'), which alternate with a de N complement (français = de France); adjectives derived from participles such as aîné ('first-born'), séduit ('seduced'), abandonné ('deserted'), attendu ('expected'); adjectives denoting colors (vert, 'green,' etc.); and some adjectives denoting forms (carré, 'square,' rond, 'round'):

(38) a. Son fils aîné / * son aîné fils.
'Her son first-born' / 'Her first-born son.'
b. Les exportations françaises / * les françaises exportations.
'The exports French' / 'The French exports.'

4.1.3. PRE- OR POSTNOMINAL ADJECTIVES

Most adjectives belong to this class in French (Wilmet, 1980); examples are given in (39):

(39) a. Un agréable voyage / un voyage agréable
'A pleasant trip' / 'A trip pleasant'
b. Les nombreux arguments de Paul / les arguments nombreux de Paul.
'Paul's numerous arguments' (lit: the numerous arguments / the arguments numerous of P.)

There is no clear semantic counterpart to this syntactic behavior. We consider the position of bare adjectives to follow from a syntactic property present in the lexical description, which we note with the feature WEIGHT: prenominal adjectives are lite and postnominal ones are nonlite, while indifferent ones are underspecified in the lexicon and contextually analyzed as either lite or nonlite. French differs from English in that most adjectives are underspecified in French, while most are lite in English.

4.2. Syntactic Constraints on Adjective Position

4.2.1. PRENOMINAL ADJECTIVES

Before the N, adjectives have the same behavior in French as in English (Sadler and Arnold, 1993, 1994). First, they cannot have complements (Blinkenberg, 1928):

(40)
Une longue (*de 2 mètres) table / Une table longue de 2 mètres.
'A long table' / 'A table 2 meters long.'
(41)
Une facile (*à remporter) victoire / Une victoire facile à remporter.
'An easy victory' / 'A victory easy to obtain.'
Second, they cannot have phrasal modifiers:16 (42)
Cette étrange (*à vos yeux) décision / Cette décision étrange à vos yeux.
'This strange decision' / 'This decision strange in your eyes.'
They can be modified or conjoined (with lite modifiers or conjuncts) and still appear prenominally:

(43) a. Une trop facile victoire
'A too easy victory'
b. Une très / plus / si longue table
'A very/more/so long table'
c. Une étrange et agréable aventure
'A strange and pleasant adventure'

These observations apply to all adjectives, whether they are lite or nonlite in the lexicon. A very interesting property of lexically lite adjectives is that modification or coordination enables them to appear postnominally, with exactly the same meaning (Blinkenberg, 1928):

(44) a. Des faux coupables / * Des coupables faux
'Fake culprits'
b. Des vrais coupables / * Des coupables vrais
'Real culprits'
c. Des [vrais ou faux] coupables / Des coupables [vrais ou faux]
'Real or fake culprits' / 'Culprits real or fake'

(45) a. Les anciens sénateurs / * Les sénateurs anciens
'The former senators'
b. Les actuels sénateurs / ? Les sénateurs actuels
'The present senators'
c. Les [anciens ou actuels] sénateurs / Les sénateurs [anciens ou actuels]
'The former or present senators' / 'The senators former or present'

These properties follow from our analysis if prenominal A's are lite: adjectives with their complements are nonlite (the value for head-nexus-phrases, cf. (26)); conjunctions of lite expressions are either lite or nonlite [cf. (27b)], hence the data in (43c), (44), (45); modification of a lite constituent by a lite adjunct may be lite [cf. (27a)], hence the data in (43a,b). Not only are prenominal adjectives themselves lite, but they can only modify a lite head N. Clearly, they cannot modify an NP, since they must follow the determiner and do not have wide scope over a coordination of NPs. Nor can they modify a sequence made of an N and its complement(s). We illustrate this point with determinerless nominal sequences allowed as complements of prepositions.

(46) a. Il a tourné la page sans grande peine de coeur ou haine de soi
'He changed his lifestyle without much heartbreak or self-hate.'
b. C'est un endroit idéal pour vaillants pêcheurs de truite et amoureux de la nature.
'It is the ideal place for daring trout fishers and nature lovers.'

In (46a) the only interpretation is that the peine de coeur ('heartbreak') is big, not the self-hate. In (46b) only the trout fishers are supposed to be daring, not the lovers of nature. A discussion of the NP structure is clearly outside the scope of this chapter.17 We interpret (46a,b) by saying that the prenominal adjectives adjoin lower than the complement of the noun, as shown in (47):
We conclude that the prenominal A's can only modify a lite head N.
4.2.2. POSTNOMINAL ADJECTIVES
As we have seen, postnominal adjectives can have complements or phrasal modifiers. Moreover, they may permute with the complements of the N: (48)
Les exportations françaises de fromage / Les exportations de fromage françaises.
'The French exports of cheese' (lit: the exports French of cheese / of cheese French)
(49)
Un livre intéressant sur les Indiens / Un livre sur les Indiens intéressant.
'An interesting book about Indians' (lit: a book interesting about Indians)
This shows that postnominal adjectives can appear as sisters of the complements of the noun, and hence as sisters of the head N itself.18 This is not their only structural position. They also occur to the right of an NP, as shown by their possibly taking wide scope over a coordination of NPs: (50)
Des chantiers d'autoroutes et des projets de zones industrielles importants
'Important creations of highways and projects of industrial parks'
The NP in (50) can denote a plural entity made up of several creations of highways and several projects of industrial parks which are all important. Postnominal A's must be nonlite (they can have complements), and they modify either a lite N (occurring at the same level as the complements) or a nonlite NP.

4.2.3. WEIGHT AND CONSTITUENCY

Summarizing, we constrain prenominal and postnominal A's to be respectively lite and nonlite. When the A is bare, the weight value comes from the lexicon; with an adjectival phrase, it comes both from the lexicon and from the weight value of the phrase. Note that prenominal adjectives can be quite long: une incroyable et fatigante mésaventure ('an incredible and tiring misadventure'), as noted particularly in Wilmet (1980) and Miller et al. (1992). We now exemplify the structures. Adjectives in the NP are adjuncts, where adjunct is a grammatical function. In French, they are allowed by two phrasal descriptions, the head-adjunct phrase and the head-complement-adjunct phrase, which we give in (51). The first one has only two daughters, the head and the adjunct; the second allows the complements and the adjuncts at the same level. Both are necessary in the NP as well as in the VP domain. Note that, in keeping with our representational choice (see section 3.2), we constrain the hd-comp-adj-phrase to contain at least one complement daughter, with the value "nonempty list" (nelist): this implements the idea that there is no head-only phrase (which would be the case if the complements were optional). "o" notes the shuffle of the complement and adjunct lists, and "ss" stands for synsem:19
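The shuffle operator "o" of (51b) interleaves the complement and adjunct lists while preserving the relative order within each list. A minimal sketch (our own illustration; the function name is an assumption) shows the operation on plain Python lists:

```python
def shuffle(xs, ys):
    """Yield every interleaving of xs and ys that preserves the
    internal order of each list (the shuffle "o" of (51b))."""
    if not xs:
        yield list(ys)
        return
    if not ys:
        yield list(xs)
        return
    # Either the first element of xs or the first element of ys
    # comes next; recurse on the remainder in both cases.
    for rest in shuffle(xs[1:], ys):
        yield [xs[0]] + rest
    for rest in shuffle(xs, ys[1:]):
        yield [ys[0]] + rest

# One complement and one adjunct interleave in two ways, cf. the two
# orders in (48).
orders = list(shuffle(["de fromage"], ["francaises"]))
```

With two complements and one adjunct, the adjunct can surface in any of three positions, which is how the schema lets adjuncts occur among the complements.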
Although weight is relevant for the hd-adj-ph (the weight value of the hd-adj-ph is a function of the weight of its daughters), the daughters themselves are not constrained; but the adjunct daughters in the second schema are constrained to be nonlite, and the head to be lite. Adjuncts have a MOD feature whose value is identified with the synsem (the syntactic and semantic description) of the head. Lite adjectives are [MOD noun[lite]], so that they only combine with a lexical N (or a conjunction of lexical Ns). On the other hand, nonlite adjectives are [MOD noun], so that they combine with a nominal category of any weight; accordingly, they occur at the same level as complements, where they (may) modify a lite N, or they are adjoined to the (nonlite) NP. We first exemplify prenominal adjectives, allowed by description (51a):
Postnominal adjectives are allowed by either description in (51). For the hd-comp-adj-phrase (51b) to apply, we need at least one complement at the same level as the adjunct. This is the case in (48), (49), but not in (41), (42):
The hd-comp-adj-phrase type correctly allows the structure in (53): the daughters of the phrase labeled N[nonlite] are the head, a complement, and a nonlite adjunct (by lexical specification), and the head is lite. Since a nonlite A may modify a lite or a nonlite N, françaises correctly modifies the lite N exportations. We finally exemplify postnominal adjectives allowed by description (51a), which doesn't specify whether the head or the adjunct is lite or nonlite:20
The hd-adj-phrase allows the combination of the lite head N with the nonlite adjunct in une victoire facile (54a), as well as that of the nonlite NP with the nonlite adjunct in (54b).

4.2.4. LINEAR PRECEDENCE CONSTRAINTS

We don't yet have a full account of the position of the adjective in the NP. We have proposed two sorts of constraints for the hd-comp-adj-phrase: function-based ((30), (32b)) and weight-based ((32a)). Nothing more needs to be added for postnominal adjectives occurring in the hd-comp-adj-phrase: they must follow the head according to the constraint in (30b), which says that the head daughter precedes the others. On the basis of the preceding discussion, weight appears to be the determining factor in the hd-adj-phrase. We propose the following (preliminary) constraint:

(55)
hd-adj-ph (preliminary version)
Non-Hd-Dtr[lite] < Head-Dtr < Non-Hd-Dtr[nonlite]
LP constraint (55) obligatorily orders lite adjectives before the head N. It also obligatorily orders all nonlite adjectives (lexically specified as nonlite, or A's with complements, or A's with nonlite modifiers), as well as modifying PPs and relative clauses which are also nonlite, after the nominal head.
The adjective facile is underspecified for the feature WGT in the lexicon. Because it precedes the head victoire in (56a), it is contextually specified as lite by LP constraint (55), which excludes a prenominal nonlite A. Conversely, it is contextually specified as nonlite in (56b), where it follows the head, since LP constraint (55) prevents lite adjuncts from occurring after the head. For the same reason, the adjectival phrase in (56c,d), which is nonlite, must occur after the head:

(56) a. Une facile victoire
ADJUNCT[lite] H[lite]
b. Une victoire facile
H[lite] ADJUNCT[nonlite]
c. Une victoire facile à remporter
H[lite] ADJUNCT[nonlite]
d. * Une facile à remporter victoire
ADJUNCT[nonlite] H[lite]
We now illustrate the proposal with adjectives specified as lite in the lexicon:

(57) a. Des faux coupables
ADJUNCT[lite] H[lite]
b. * Des coupables faux
H[lite] ADJUNCT[lite]
c. Des vrais ou faux coupables
ADJUNCT[lite] H[lite]
d. Des coupables vrais ou faux
H[lite] ADJUNCT[nonlite]
Being lite, the adjectives vrai and faux must precede the head, according to (55). However, according to (27b), the conjunction vrai ou faux is underspecified and, like the lexically underspecified A facile, may occur either before the head (as a lite adjunct) or after the head (as a nonlite adjunct). On the other hand, the postnominal-only adjective vert is specified as nonlite in the lexicon and, according to (55), must occur after the head:

(58) a. Un carré vert
H[lite] ADJUNCT[nonlite]
b. * Un vert carré
ADJUNCT[nonlite] H[lite]
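LP constraint (55), combined with lexical weight specifications, yields the placement possibilities illustrated in (56)-(58): lite adjectives are prenominal only, nonlite ones postnominal only, and underspecified ones are resolved by position. A minimal sketch (our own encoding; the three-entry lexicon is an illustrative assumption):

```python
# Hypothetical mini-lexicon: faux is lite (prenominal-only), vert is
# nonlite (postnominal-only), facile is underspecified (None).
LEXICON = {"faux": "lite", "vert": "nonlite", "facile": None}

def np_orders(adjective):
    """Return the grammatical adjective-noun orders licensed by (55)."""
    weight = LEXICON[adjective]
    orders = []
    if weight in ("lite", None):
        orders.append("prenominal")    # (55): Non-Hd-Dtr[lite] < Head-Dtr
    if weight in ("nonlite", None):
        orders.append("postnominal")   # (55): Head-Dtr < Non-Hd-Dtr[nonlite]
    return orders
```

An underspecified adjective such as facile is contextually specified as lite when prenominal and as nonlite when postnominal, which is why it appears in both positions with the same lexical entry.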
4.2.5. ALTERNATIVE APPROACHES

Let us now compare our analysis with two alternatives which have been proposed for English NPs, where the constraint on prenominal adjectives resembles the French constraint. As is well known, bare adjectives generally occur in prenominal position (59a,b), but adjectives with a complement (59c,d) or a postmodifier (59e,f) must follow the head N:21

(59) a. A proud / happy man.
b. ?* A man proud / happy.
c. * A proud of himself man.
d. A man proud of himself.
e. * A happier than you man.
f. A man happier than you.

Two proposals have been made to account for these data, which we examine now.

4.2.5.1. Williams' Head Final Filter

Williams (1982) proposes a head-final filter (HFF) to account for the contrast illustrated in (59): the adjectival phrase must end with the adjectival head in order for the AP to precede the head. This proposal, which he extends to German as well, presents a number of difficulties. First, there is a well-known counterexample to the generalization, since an adjective modified by enough can be prenominal although it is not head final (a fair enough proposal). Second, the HFF covers only a part of the data which we account for. For example, it says nothing about coordination. One should add that both conjuncts have to be head final for the coordination to be prenominal, since big and tall is prenominal, while bigger than you and tall is not. More importantly, it has nothing to say about the difference between lite and nonlite head-final phrases. Only certain premodifiers are allowed with a prenominal adjective, both in English and in French, although the details of the data differ. Degree modifiers are good in French in general, but a subclass of them cannot modify a prenominal adjective in English, and manner or point-of-view adverbs are in general bad in French:

(60) a. Une (très) importante décision.
b. A (very) important decision
c. *A so important decision / A decision so important
d. *Une politiquement importante décision / Une décision politiquement importante.
'A politically important decision' (lit: a decision politically important)

By contrast, the facts illustrated in (60) follow from our analysis: since a head-adjunct phrase (such as Adverb-Adjective) with a lite head can be lite or nonlite depending on the weight of the adverb, we predict that très importante (or very important), with a lite adjunct, can be lite, while politiquement importante (or so important), with a nonlite adjunct, cannot.22 More generally, the HFF aims at explaining why certain adjectival phrases must occur postnominally but doesn't address the distribution of adjectives before or after the head N in a general way (even in English, there are A's which
must occur postnominally, as in the president elect, the heir apparent; cf. Quirk et al., 1972, §5.18), and is disconnected from other generalizations concerning word order. In our analysis, the HFF would follow from more general constraints on syntactic weight.

4.2.5.2. Sadler and Arnold's LEX Feature

A more ambitious approach to the problem of the adjective in the English NP has been taken by Sadler and Arnold (1993, 1994). Their analysis is based on the binary feature LEX, which works in the following way: (a) words are [LEX+]; (b) certain phrases are [LEX+], in which [LEX+] elements are conjoined, or a [LEX+] adjunct modifies a [LEX+] head; (c) there is an "agreement" of LEX features in the head-adjunct phrase, so that [LEX+] adjectives can only modify [LEX+] Ns, while APs ([LEX-]) modify NPs ([LEX-]). The generalization concerning word order is simply that [LEX+] adjectives precede, while [LEX-] adjectives follow, the head N. We retain the basic idea of Sadler and Arnold, in that we make the order of expressions depend on a syntactic feature, in conjunction with combinatorial properties of adjuncts and heads. However, we cannot adopt their system, for the following reasons. First, items in the lexicon are not uniformly [LEX+], since we have to distinguish among adjectives and among adverbs in French. Adopting the LEX feature thus becomes counterintuitive: not only must [LEX+] phrases be distinguished from [LEX-] phrases, but also [LEX+] words from [LEX-] words in the lexicon. Second, we would be forced to analyze some modified or conjoined [LEX+] heads as unspecified for the LEX feature rather than [LEX+], since they occur both to the left and to the right of the head N.23 Third, as postnominal adjectives are at the same level as complements in French, we would allow a [LEX-] phrase to modify a [LEX+] head, which shows the absence of agreement in the LEX features of the head and the adjunct.
Finally, the binary feature LEX is not appropriate for an analysis of the placement of the adjective as part of a general theory of word order.
5. ORDERING ADVERBS IN THE VP

5.1. Adverb Classification

We finally consider adverbs in the VP, showing how the syntactic weight feature plays a crucial explanatory role in their ordering. Although an in-depth study of adverbs is clearly outside the scope of this chapter, we briefly present our classification, in order to properly circumscribe the role of "liteness." Leaving aside semantic aspects, we cross-classify adverbs along three different dimensions: their
adjunction sites, their value for weight, and their function (complements or adjuncts in the VP).24 Abeillé and Godard (1997a) propose that adverbs occur in two different structures: (a) they are adjoined to a verbal category, conforming to the head-adj-phrase constraint (51a) here, and (b) they occur among complements and at the same level, conforming to the hd-comp-adj-phrase constraint in (51b). There are three adverb classes, depending on which verbal category (the S, the VP, or the lexical V) they adjoin to: S-adverbs adjoin to all of them, VP-adverbs adjoin to some VPs as well as to the lexical V, and V-adverbs adjoin to the lexical V only. Although there is some connection between these classes and the semantics of the adverb (since it must have scope at least over the category it adjoins to), there is no simple relationship between the two behaviors. We will examine here V-adverbs, which are mostly scalar and quantity adverbs, such as bien, beaucoup, mal, peu, à peine, plus, trop ('well, a lot, badly, a little, barely, more, too much'). All adverbs occur to the right of the finite V, independently of their possible adjunction site in a head-adjunct phrase.25 However, some are constrained to immediately follow the head V (and precede the complements), while others are mobile and freely interspersed among the complements. This distinction resembles that of lite versus nonlite complements, all the more so because constrained adverbs are bare words. There is some overlap between the distinction based on adjunction sites and weight: all V-adverbs are lite. However, the two factors do not coincide: not all lite adverbs are V-adverbs. While most S-adverbs are nonlite, jamais ('never') and soudain ('suddenly'), which are S-adverbs, are lite; and if VP-manner adverbs (attentivement, bruyamment) are nonlite, strictly negative adverbs (pas, plus, point), which are also VP-adverbs (they adjoin to infinitival VPs), are all lite.
As with adjectives, there is no one-to-one relationship between length and "liteness" at the lexical level: although many lite adverbs are monosyllabic, this is not the case for all of them (e.g., jamais, 'never,' toujours, 'always,' beaucoup, 'a lot,' à peine, 'barely'), and a few nonlite adverbs may be monosyllabic (là 'there'). A third distinction is relevant: adverbs in the VP may have the function of adjuncts or complements, although most of them are adjuncts. Certain adverbs are subcategorized, for instance when they alternate with locative PPs (ici 'here', là 'there'); moreover, we consider negative adverbs (pas, plus, point, jamais), as well as other lite adverbs, to be included among the complements: of finite Vs for the former, of all Vs for the latter (Abeillé and Godard, 1997; Kim and Sag, 1995). Our classification resembles that in Cinque (1997), although there are also differences. Using two criteria, namely where the adverbs can occur and whether they are characterized by ordering constraints, Cinque distinguishes three classes of adverbs (and PPs): the "higher adverbs," which may occur before the subject and are ordered among themselves; the "pre-VP adverbs," which occur around the V and are also ordered; and the "circumstantial adverbs" (denoting time, location,
cause, manner, etc.), which may occur high in the sentence but are not strictly ordered. The "pre-VP" adverbs roughly correspond to our V-adverbs plus the negation.26 Apart from differences due to the framework, there are two main differences. First, we leave aside the ordering of adverbs among themselves. While we agree with Cinque that it may well reflect semantic properties, we do not make the hypothesis that there is a one-to-one relationship between syntactic ordering and semantic scope. Thus, while S-adverbs tend to be ordered among themselves, even in the VP, they are not totally so, and, although circumstantial adverbs tend to be free, they are not totally so either.27 Second, we cross-classify the adverbs rather than try to make the different criteria converge towards homogeneous classes. This is what allows us to bring to light the role of liteness, since lite adverbs belong to otherwise different syntactic classes, and there are other lite categories besides adverbs. Restricting our attention to the class of lite V-adverbs and contrasting them with nonlite adverbs, we show how the use of the WEIGHT feature illuminates adverb position and allows most of the LP rules already defined to apply to adverbs.28
5.2. Freedom of Nonlite Adverbs

We first check that some adverbs to the right of the head verb are mobile and may permute with its complements. This is true for a manner adverb such as gentiment, and for a subcategorized adverb such as locative là:29

(61) a. Paul a gentiment lu ce livre à sa grand-mère.
'Paul has kindly read this book to his grandmother.'
b. Paul a lu gentiment ce livre à sa grand-mère.
c. Paul a lu ce livre gentiment à sa grand-mère.
d. Paul a lu ce livre à sa grand-mère gentiment.

(62) a. Paul a rangé là le livre pour sa grand-mère.
'Paul has put the book there for his grandmother.'
b. Paul a rangé le livre là pour sa grand-mère.
c. Paul a rangé le livre pour sa grand-mère là.
d. * Paul a là rangé le livre pour sa grand-mère.
We consider these adverbs to be specified as [WGT nonlite] in the lexicon. Being nonlite complements, subcategorized locative adverbs such as là must follow lite complements (62d) but are not ordered with respect to other nonlite complements (62a-c). This follows directly from LP constraints (30b) and (32a). However, according to constraint (32a), nonlite adjuncts should follow lite complements (e.g., past participles), which is not what we observe in (61a). To account for the difference between adverbs and other nonhead daughters, we introduce the feature [ADV±] and restrict constraint (32a) to apply to [ADV-] nonlite daughters:30

(63)
(final version) hd-comp-adj-ph
a. [lite] < [nonlite, ADV-]
b. [COMPS ⟨... [1] ...⟩] < [1]
Constraint (63a) allows all adverbs and PPs (lite or nonlite), which are [ADV+], to occur before lite constituents, but the possibilities are further restricted by constraint (30b), which says that the head comes first, and by (63b), which says that subcategorized constituents follow the predicate; accordingly, only nonlite adjuncts may come before the past participle, hence the difference between (61a) and (62d). Contrary to what constraint (55) leads one to expect, in the hd-adj-phrase nonlite adjuncts are not ordered with respect to the verbal head: they can precede or follow a VP (64) or an S (65):

(64) a. Paul, probablement, viendra à Paris. / viendra à Paris probablement.
        'Paul probably will come to Paris.'
     b. Attentivement examiné, le tissu révèle des imperfections. / Examiné attentivement . . .
        'Carefully examined, the material shows defects.'

(65) a. Administrativement, le problème est compliqué. / Le problème est compliqué, administrativement.
        'Administratively, the problem is complicated.'
     b. Bientôt, Paul sera là. / Paul sera là bientôt.
        'Soon, Paul will be here.'

Bearing in mind that nonlite adjuncts must follow the noun, we conclude that the order of the nonlite adjunct and the head in the hd-adj-phrase depends on whether the head is nominal or not. We reformulate the right part of constraint (55) so that it applies to nominal heads only:

(66)
(final version) hd-adj-ph
a. Non-Hd-Dtr [lite] < Head-Dtr
b. Head-Dtr [lite or noun] < Non-Hd-Dtr [nonlite]
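Read procedurally, constraint (66) licenses or rejects the surface order of the two daughters of a hd-adj-phrase. The following sketch is our own illustration, not part of the chapter's HPSG formalization; the attribute names (role, weight, noun) are simplified stand-ins for the features involved:

```python
# A sketch of constraint (66) over a hd-adj-phrase with two daughters.
# Each daughter: {'role': 'head'|'adj', 'weight': 'lite'|'nonlite', 'noun': bool}

def hd_adj_order_ok(first, second):
    """True iff the order <first, second> violates neither (66a) nor (66b)."""
    # (66a) Non-Hd-Dtr[lite] < Head-Dtr: a lite adjunct may not follow the head
    if second['role'] == 'adj' and second['weight'] == 'lite':
        return False
    # (66b) Head-Dtr[lite or noun] < Non-Hd-Dtr[nonlite]:
    # a nonlite adjunct may not precede a lite or nominal head
    if (first['role'] == 'adj' and first['weight'] == 'nonlite'
            and second['role'] == 'head'
            and (second['weight'] == 'lite' or second['noun'])):
        return False
    return True

bien = {'role': 'adj', 'weight': 'lite', 'noun': False}
lire = {'role': 'head', 'weight': 'lite', 'noun': False}
probablement = {'role': 'adj', 'weight': 'nonlite', 'noun': False}
viendra_a_paris = {'role': 'head', 'weight': 'nonlite', 'noun': False}

print(hd_adj_order_ok(bien, lire))                     # True:  bien lire
print(hd_adj_order_ok(lire, bien))                     # False: * lire bien (as adjunct)
print(hd_adj_order_ok(probablement, viendra_a_paris))  # True: nonlite adjunct is free
print(hd_adj_order_ok(viendra_a_paris, probablement))  # True
```

Note how the last two calls reproduce the freedom of nonlite adjuncts around a nonverbal, nonlite head seen in (64)-(65): neither clause of (66) applies, so both orders go through.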
5.3. Lite V-Adverbs

Contrasting with the preceding class are V-adverbs, to which we now turn. We show first that they can only adjoin to a lite V[nonfin]. They must follow a finite V (67a), but can occur either to the right or to the left of an infinitival (67b). A priori, they could be adjoined to the VP[inf] or to the lexical V[inf]. The fact that they cannot have wide scope over a conjunction of VPs indicates that they adjoin to the lexical V only:
Anne Abeillé and Danièle Godard
(67) a. * Bien Jean lisait le texte. / * Jean bien lisait le texte. / Jean lisait bien le texte.
        'Jean read the text well.'
     b. Jean voulait [bien lire le texte] / [lire bien le texte].
        'Jean wanted to read the text well.' (lit.: to well read the text)
     c. Jean voulait [bien lire le texte et l'expliquer aux élèves].
        'Jean wanted to read the text well and explain it to the students.'
     d. Paul voulait très bien lire le texte.
        'Paul wanted to read the text very well.' (lit.: wanted to very well read the text)

(67c) cannot convey that Jean wanted to explain the text well to the students, only that he wanted to have a good comprehension of the text.31 To account for the distribution to the left of V, we propose that these adverbs are lexically specified as modifying lite V[nonfin] (where nonfinite forms include infinitives and participles). Turning to their distribution to the right of V, unlike the adverbs examined in section 5.2, their position to the right of the (finite or infinitival) V is restricted; they must occur before the other complements, both nonlite (68) and lite (69):32

(68) a. Paul se souvient peu de sa jeunesse / ?? se souvient de sa jeunesse peu.
        'Paul doesn't remember his youth much' (lit.: remembers not-much his youth).
     b. Paul promet de travailler mieux en classe / ?? de travailler en classe mieux.
        'Paul promises to work better in class.'

(69)
Paul rendra bien hommage aux victimes / * Paul rendra hommage bien aux victimes.
'Paul will well pay tribute to the victims.'
Moreover, they are free when they are modified (70) or have a complement (71):

(70) Paul lira très bien Corneille / lira Corneille très bien.
     'Paul will read Corneille very well.'

(71) Paul rendra mieux que toi hommage à Corneille / rendra hommage mieux que toi à Corneille / rendra hommage à Corneille mieux que toi.
     'Paul will pay tribute to Corneille better than you.'

These properties are of course reminiscent of the behavior of lite adjectives in the NP and of bare N complements in the VP. We conclude that V-adverbs are lexically specified as lite. The next question is that of their function. Accepting their adjunct status to the left of the lite V, we propose that they are complements to the right of the V, at least when they are lite.33 The argument relies on the contrasts illustrated in (67).
If it were possible for a lite adjunct to occur indifferently to the left or the right of the V, it would be difficult to explain why it can occur both to the right and to the left of an infinitival (67b), but only to the right of a finite V (67a). We get this intricate pattern of occurrence with the following analysis: (a) these adverbs adjoin to a nonfinite (infinitival or participial) lite V only; (b) as lite adjuncts, they occur to the left of the lite V, not to its right; (c) if lite, they are included among the complements by a Lexical Rule (LR), which applies to finite and infinitival V.34 The LR including V-adverbs among the complements is given in (72):

(72)
V-ADVERB COMPLEMENT INSERTION LEXICAL RULE
This LR takes as input a verb expecting a number of arguments ([1] corresponds to the subject and [2] to the complements) and returns a verb with a V-adverb added as the first complement. The verbal content is modified in the output: it is the same as that of the adverb ([5]), which takes (the content of) this verb as an argument ([4]). Although the inserted adverb behaves syntactically as a complement, it still behaves semantically as a functor. The adverb description is not modified; its MOD value only serves to circumscribe the class of verbs taking the adverb as a complement and to instantiate the new content of the verb with the adverb as complement.35 The phrase structure for (69), for example, is given in (73a), and that for (67b) in (73b):
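The effect of rule (72) can also be pictured procedurally. The sketch below is our own illustration, not the authors' AVM: lexical entries are dicts, the attribute names (form, subj, comps, content, rel) are simplified stand-ins for the HPSG features, and the output content is the adverb's relation applied to the verb's content, mirroring tags [4] and [5]:

```python
# Sketch of the V-adverb complement insertion LR (72), in simplified form.

def v_adverb_insertion(verb, adverb):
    """Map a finite or infinitival verb entry to one with a lite V-adverb
    as its first complement; the adverb's semantics takes the verb's
    content as its argument (syntactic complement, semantic functor)."""
    if verb['form'] not in ('finite', 'inf'):
        # the rule does not apply to past participles (cf. fn. 34)
        raise ValueError('LR applies to finite and infinitival verbs only')
    return {
        'form': verb['form'],
        'subj': verb['subj'],
        'comps': [adverb] + verb['comps'],  # adverb added as first complement
        'content': {'rel': adverb['rel'], 'arg': verb['content']},
    }

rendre = {'form': 'finite', 'subj': ['NP[Paul]'],
          'comps': ['N[hommage]', 'PP[aux victimes]'],
          'content': 'rendre-rel'}
bien = {'cat': 'adv', 'weight': 'lite', 'rel': 'bien-rel'}

out = v_adverb_insertion(rendre, bien)
# out['comps'] now starts with the adverb, followed by the original
# complements; out['content'] is the adverb relation over the verb relation.
print(out['comps'])
print(out['content'])
```

Applying the function to a past-participle entry raises an error, matching the restriction discussed in fn. 34.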
Let us finally turn to the ordering constraints dealing with V-adverbs, which can be either adjuncts or complements (see LR (72)). The ordering constraints
associated with the hd-adj-phrase are in (66). V-adverbs, as lite adjuncts, precede the head (66a); this allows bien lire le texte (67b). As nonlite adjuncts, they must follow the lite head, because only lite adjuncts may precede a lite head. This accounts for the unacceptability of (74):

(74)
* Paul essayait de [mieux que toi] travailler.
'Paul tried to work better than you.' (lit.: to better than you work)
The ordering constraints associated with the hd-comp-adj-phrase are given in (63). The first one, according to which lite constituents precede all nonlite nonadverbial ones, makes sense of the behavior of V-adverbs in (68), where they must precede the nonlite complements. Moreover, since lite adverbs modified by a lite adverb can be either lite or nonlite, such phrases are allowed (as lite) to the left of the V[inf], as in (67d), on the one hand, and they are mobile in the VP (70), on the other. However, something more has to be said about (69), which shows that these lite adverbs must precede other lite complements. Using again the feature [ADV±], we add an LP constraint that orders nonhead daughters which are both [ADV+] and lite before other lite nonhead daughters:36

(75) hd-comp-adj-ph
[lite, ADV+] < Non-Hd-Dtr [lite, ADV-]
We now illustrate our analysis; the adverb is an adjunct in (76) and a complement in (77):

(76) a. Bien déclamer Corneille
        ADJUNCT[lite] H[lite, nonfinite] COMP[nonlite]
     b. Vraiment bien déclamer Corneille
        ADJUNCT[lite] H[lite, nonfinite] COMP[nonlite]
     c. * Déclamer Corneille bien
        H[lite] COMP[nonlite] ADJUNCT[lite]
     e. * Bien déclame Corneille
        ADJUNCT[lite] H[lite, finite] COMP[nonlite]
     f. Déclamer Corneille mieux que toi
        H[lite] COMP[nonlite] ADJUNCT[nonlite]
     g. Rend hommage vraiment bien à Corneille
        H[lite] COMP[lite] ADJUNCT[nonlite] COMP[nonlite]

(77) a. Il rend bien hommage à Corneille
        H[lite, finite] COMP[ADV+, lite] COMP[ADV-, lite] COMP[nonlite]
     b. * Il rend hommage bien à Corneille
        H[lite, finite] COMP[ADV-, lite] COMP[ADV+, lite] COMP[nonlite]
     c. * Déclamer Corneille bien
        H[lite, inf] COMP[nonlite] COMP[lite]
5.4. Summarizing LP Constraints

At this point, it is useful to consider the full set of LP constraints which we propose in this chapter:

I. Function-based LP constraints
   hd-nexus-ph
   1. Non-Hd-Dtr < Head-Dtr
   hd-comp-adj-ph
   2. Head-Dtr < Non-Hd-Dtr

II. Weight-based LP constraints
   hd-comp-adj-ph
   3. [lite] < [nonlite, ADV-]
   4. [lite, ADV+] < [lite, ADV-]

III. Mixed LP constraints
   hd-comp-adj-ph
   5. [lite, COMPS ⟨... [1] ...⟩] < [1]
   hd-adj-ph
   6. Non-Hd-Dtr [lite] < Head-Dtr
   7. Head-Dtr [lite or noun] < Non-Hd-Dtr [nonlite]

To understand how the LP rules work, it is necessary to remember that word order is assumed to be free when no constraint is stated. In addition, all LP rules must be compatible with each other, but some may overlap, when a more specific constraint comes on top of another (as is the case with LP5 and LP2). The above constraints embody a mixed theory of syntactic word order, according to which ordering results from the combination of function and weight factors. There are also some differences based on syntactic category (the head value), but they play a minor, additional role.37 While the function-based constraint (LP2) stands alone in the hd-comp-adj-phrase, there is no such simple generalization in the hd-adj-phrase, where the two factors are intermingled. It is interesting to note that LP2, which formalizes the idea that French is a head-initial language, in fact overrides the general ordering for hd-nexus-phrases. It has a counterpart in the hd-adj-phrase, in that only lite adjuncts are allowed to precede the lite head. Clearly, the head-initial property in French characterizes these specific phrase types. While the constraints making use of the weight factor accurately account for the intricacies of the data, they also do justice to the two intuitions with which we started: lite constituents precede nonlite ones, and lite constituents are ordered among themselves.
The first intuition is embodied by LP3 for the hd-comp-adj-phrase: only nonlite adverbs, which can occur anywhere after the (lite) head (see LP2), blur the picture. Both LP rules for the hd-adj-phrase make use of the linear privilege of lite constituents over nonlite ones, although they also take other factors (function and category) into account at the same time. Most of the other rules (LP2, LP3, LP4, LP6) ensure a precise order of lite constituents, although they may achieve another effect at the same time. Abstracting away from further constraints (imposed by the category and the finiteness of the head), we have the following order:

(79)
French lite cluster: Adjunct[lite] < Head[lite] < Complements[lite] < Complements[nonlite]
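The interaction of LP2-LP4 can be made concrete with a small executable sketch. This is our own recasting, not the chapter's formalization: LP5 is omitted, and the attribute names (role, weight, adv) are simplified stand-ins for the HPSG features. Each LP rule "A < B" is a pair of predicates, and a rule is violated when a B-daughter precedes an A-daughter among sisters:

```python
# LP rules as pairs (A, B) meaning "A < B": a rule is violated when a
# B-daughter precedes an A-daughter in the left-to-right daughter list.

def lp_ok(daughters, rules):
    for i, earlier in enumerate(daughters):
        for later in daughters[i + 1:]:
            for a_side, b_side in rules:
                if b_side(earlier) and a_side(later):
                    return False
    return True

# LP2-LP4 for the head-complement-adjunct phrase (LP5 omitted for brevity)
HD_COMP_ADJ_RULES = [
    # LP2: Head-Dtr < Non-Hd-Dtr
    (lambda d: d['role'] == 'head', lambda d: d['role'] != 'head'),
    # LP3: [lite] < [nonlite, ADV-]
    (lambda d: d['weight'] == 'lite',
     lambda d: d['weight'] == 'nonlite' and not d['adv']),
    # LP4: [lite, ADV+] < [lite, ADV-], restricted to nonhead daughters, cf. (75)
    (lambda d: d['weight'] == 'lite' and d['adv'],
     lambda d: d['role'] != 'head' and d['weight'] == 'lite' and not d['adv']),
]

def dtr(role, weight, adv=False):
    return {'role': role, 'weight': weight, 'adv': adv}

# (69) Paul rendra bien hommage aux victimes
good = [dtr('head', 'lite'), dtr('comp', 'lite', adv=True),
        dtr('comp', 'lite'), dtr('comp', 'nonlite')]
# * Paul rendra hommage bien aux victimes: bien follows a lite [ADV-] comp
bad = [dtr('head', 'lite'), dtr('comp', 'lite'),
       dtr('comp', 'lite', adv=True), dtr('comp', 'nonlite')]

print(lp_ok(good, HD_COMP_ADJ_RULES))  # True
print(lp_ok(bad, HD_COMP_ADJ_RULES))   # False
```

On this encoding, the contrast in (69) falls out of LP4 alone: in the ill-formed order, the lite [ADV-] complement hommage precedes the lite [ADV+] adverb bien.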
6. CONCLUSION

Taking French as an example, we have shown that word order cannot simply be deduced from, or reduced to, questions of constituency. We draw attention to the importance of the distinction between lite and nonlite constituents, which is relevant both for lexical items and phrases, and is different from the usual distinction between heavy and nonheavy constituents. Most of the facts that we discuss are new. Roughly put, lite constituents come before the others and are ordered among themselves. Given that lexical heads come first in the head-complement-adjunct phrase, this creates the impression of a lite cluster around the head. On this empirical basis, we propose a theory of word order which is formalized as constraints on the daughters of phrasal types and makes use both of function (Head vs. Nonhead) and weight (lite vs. nonlite) distinctions. Our proposal raises some unanswered questions. Similar constraints have been studied for Korean (Sells, 1994) and for English modifying adjectives (Sadler and Arnold, 1993, 1994). This is an indication that the "liteness" factor should be recognized as part of grammar and justifies further work. In particular, since word order (as syntactically determined) presumably always results from the interplay of different factors, it would be interesting to know what other factors hide or reveal the liteness factor. The fact that nonlite constituents are free in the French VP, for instance, brings it to light, while the more rigid ordering of complements and adjuncts in the English VP tends to hide it. The distinction between lite and nonlite words crosses traditional syntactic categories. Is it otherwise motivated? A semantic factor could make sense for nouns, since the divide separates common nouns from proper names. However, a semantic distinction would not be easy to justify for adjectives or adverbs, where the specific behavior of items cannot be fully predicted.
Related but distinct questions are: Is there a reason why the distribution of lexical (attributive) adjectives is so different in English, where most adjectives are lite, and in French, where most of them are unspecified for the feature? And is there a tendency for a given syntactic category to induce a certain weight, at least within a given language? A different line of explanation involves language evolution or language processing. Diachronically and typologically, our lite elements are an intermediate step between syntax and morphologization. Indeed, there is a stage in the evolution from Latin to French when personal pronouns were not cliticized (they had their own stress) but could not be separated from the verb, which makes them good candidates for being lite. Examples of nominal compounds, as well as of incorporation, also come to mind (see the incorporation of certain bare adverbs into the Greek verb; Rivero, 1992). From a synchronic point of view, there are possible explanations in terms of ease of parsing or production, since these items tend to enter into more or less fixed collocations or to form complex semantic predicates with the head. The same arguments which motivate the role of heaviness in word
order as a factor that facilitates parsing (Frazier and Fodor, 1978) or production (Wasow, 1996) might be made to explain why this class of lite elements clusters around the head.
NOTES

1. Previous versions of this chapter have been presented at the 3rd International HPSG Conference (Marseille, May 1996), at the University of Paris 7 (June 1996), at the Bangor Conference on Syntactic Categories (June 1996), at the University of Pennsylvania (October 1996), at Stanford University (January 1997), at SOAS (April 1997), and at ESSLLI in Aix (August 1997). We thank audiences at these events for their comments, and, in particular, D. Arnold, J. Bresnan, R. Borsley, A. Copestake, D. Flickinger, G. Green, E. Hinrichs, S. Kahane, A. Kathol, S. Lappin, D. Meurers, P. Miller, C. Pollard, F. Popowich, R. Kempson, L. Sadler, I. Sag, P. Sells, and P. Thibaut, as well as the anonymous reviewers for this book. This work was done while D. Godard was at the University of Paris 7 (CNRS). It is part of a larger project on French syntax undertaken in collaboration with Ivan Sag, to whom special thanks are due.
2. We follow a suggestion by E. Hinrichs and D. Meurers (p.c.). We remain informal in this chapter. In a fuller presentation of word order phenomena, the Linear Precedence constraints should be seen as constraining the value of a word order domain feature (à la Reape, 1994) associated with signs.
3. Extracted or cliticized arguments are not analyzed here as complements (see Miller, 1991, and Miller and Sag, 1997, on clitics; Pollard and Sag, 1994, Bouma et al., 1997, and Sag, 1997, on extraction; and Abeillé et al., 1997, for a general presentation in French).
4. We follow Gross (1975) in allowing adverbs such as vraiment or très as modifiers of the N in light verb constructions.
5. Bare quantifiers are another case of lite complements (Abeillé and Godard, 1997):

(i)
Paul passe tout à son fils / ? passe à son fils tout.
'Paul forgives everything to his son.'
6. Morphological incorporation of adverbs in French has yet to be argued for convincingly.
7. Note that the bare N can allow the passive:
(i)
Hommage sera enfin rendu aux victimes. ('tribute will finally be paid to the victims')
(ii)

Un vibrant hommage sera enfin rendu aux victimes.

8. Pollard and Sag (1987) and Sadler and Arnold (1993, 1994) also use the LEX feature to account for ordering observations. Pollard and Sag appeal to it in connection with the verbal particle. Given the lack of mobility of the particle co-occurring with a pronominal accusative complement (John looked it up / * John looked up it), as opposed to its mobility with a nominal one (look up the answer / look the answer up), and the fact that such pronouns also resist dative shift (They gave it to Mary / * They gave Mary it), we are tempted
to analyze personal pronouns as lite complements. See section 4.5 for a discussion of Sadler and Arnold's proposal.
9. Note that in our perspective, lite phrases are associated with argument structures, which can no longer be the sole attribute of words (contra Pollard and Sag, 1994; cf. Pollard and Calcagno, 1997).
10. See Kasper (1994) for the same proposal for German, and Pollard and Sag (1987:165) for English. We rely on "minimal recursion semantics" to give the right semantics for this flat structure (Copestake et al., 1997).
11. See also section 4 for further examples of modified or conjoined phrases specified as lite.
12. The fact that we don't require a lexical V or N to always be dominated by a phrase is reminiscent of Categorial Grammar and Dependency Grammar. As mentioned earlier, we could also have nonlite NP and VP here dominating nonlite N and V.
13. We ignore the different realizations of the causee, depending on the transitivity of the V[inf].
14. Grammar books also include cardinal and ordinal numbers, or indefinites (certain(s) 'some'), as prenominal adjectives. We consider their classification as modifying adjectives to be uncertain.
15. Some postnominal adjectives may occur prenominally in highly marked (literary) constructions, which we analyze as borrowings from an older system: son blanc manteau ('its white coat'), les vertes frondaisons ('the green foliage'), la royale aventure de la maison de Savoie ('the royal adventure of the Savoie House').
16. The same constraint holds for all prenominal modifiers (adjectives or nouns) in English (cf. Pollard and Sag, 1987:73): The [toxic waste dump] management / * The [dump for toxic waste] management.
17. We are keeping the traditional analysis where the determiner combines with an N saturated for its complements. See Miller (1991) for a proposal that some determiners combine with a lite head N (unsaturated for its complements).
18. We cannot enter into a detailed discussion of all the other alternatives here, but we see no reason to assume that the adjective is adjoined to the head noun and then possibly "extraposed" to the right of the complements.
19. The shuffle operation takes two lists and gives a third list which respects the ordering relations of both. List([X]) denotes a list whose elements all have property X.
20. In fact, two analyses are allowed for une victoire facile, since facile, as a postnominal A, can modify an N or an NP: [[une victoire] [facile]] or [une [victoire facile]]. This is necessary independently of our weight-based analysis.
21. The only apparent exceptions to this generalization are some prenominal lexicalized expressions (a God-is-dead philosophy, an easy-to-please guest) and measure adjectival phrases usually written with a hyphen (a two-meter long table), which could be analyzed as lite (compounds).
22. The positioning of such adverbs is not captured by (55). Anticipating section 5, we say that a nonlite adverb can occur either before or after a nonlite head (une décision politiquement importante / importante politiquement); cf. (66).
23. This is also necessary in English to explain why such phrases occur after the N: her so beautiful daughter, a daughter so beautiful.
24. Semantic factors may have the effect of further restricting the occurrence of certain adverbs.
25. Schlyter (1974) was the first to note that modal adverbs (évidemment 'evidently') occur in the VP in French.
26. Postverbal adverbs can only be called "pre-VP" in an approach which condones V movement. While the negation adjoins to the infinitival VP (it is a VP adverb in our classification), it is also, like adverbs in general, a postverbal adverb at the same level as the complements (although only when the V is finite). Since it is lite, it behaves like the V-adverbs in postverbal position.
27. For instance, the evaluative and modal adverbs (both "higher adverbs") are not ordered among themselves, although the evaluative one has semantic scope over the second (Abeillé and Godard, 1994):

(i)
Paul arrivera probablement malheureusement en retard. ('Paul will probably unfortunately arrive late')
(ii)
Paul arrivera malheureusement probablement en retard.
On the other hand, a time adverb such as immédiatement must precede a manner adverb (both "circumstantial adverbs"):
Paul a immédiatement bruyamment contre-attaqué. ('Paul has immediately loudly counter-attacked')
(iv)
* Paul a bruyamment immédiatement contre-attaqué.
28. We leave aside lite negative adverbs and also the class of so-called "adverbial" adjectives (coûter cher = 'to be expensive', lit.: to cost expensive; cf. Grevisse and Goosse, 1988, §926):
(i)
Paul ne voit jamais son père / * ne voit son père jamais. ('Paul never sees his father')
(ii)
Paul a payé cher cette erreur / ?? a payé cette erreur cher. ('Paul paid a heavy price for this mistake')
29. The data concerning certain S-adverbs are difficult. Some speakers only accept them before the complements.
30. Only [ADV+] categories may modify a V. Although all adverbs are [ADV+], the feature [ADV±] cannot be replaced by mentioning the category adverb: it is relevant for other categories such as Ns and As. Bare Q complements (which are lite N[ADV+]), but not bare A complements (which are lite A[ADV-]), occur before the participle (Abeillé and Godard, 1997b): Paul a tout lu ('Paul has read everything', lit.: has all read), * Paul a cher payé son erreur ('Paul has paid a heavy price for his mistake', lit.: has costly paid). In a fuller account of word order, (63b) would also mention the feature [ADV-] of [1] on the right-hand side.
31. In addition, these adverbs also occur to the left of past participles, where they don't have wide scope over a conjunction of participial phrases, but cannot occur to the right of the participle (if they are bare):
(i)
Pourtant (très) bien parti, il n'a pas fini glorieusement la course.
'Although he started well (lit.: although well started), he did not finish the race in glory.'
(ii)
? Bien parti dans la première course et arrivé dans la seconde, il ne nous a pas fait honte.
'Having started well (lit.: well started) in the first race and arrived in the second one, he did not disgrace us.'
(iii)
Parti *(vraiment) bien dans la première course, il nous a fait honneur.
'(Having) started *(really) well in the first race, he did us credit.'
Thus, they adjoin to lite nonfinite V in general, but are not the complement of lite V[ppart].
32. The reader should note that heavily stressed lexical adverbs are treated as nonlite; accordingly, the examples in (68) are better with a stress on the adverb. The properties of lite adverbs (their behavior when conjoined, modified, or stressed) are also noted independently in Cinque (1997).
33. There is an alternative analysis: lite adverbs would form a lite phrase with the lite V, whether they are on the left or the right of the V. We do not propose this because of (a) the contrast between (67) and examples (ii) and (iii) in fn. 31, which would remain unexplained, and (b) NCC facts, which indicate that NCC only conjoins sisters in French (Abeillé, in press); these adverbs can occur in NCC: Paul déclame bien Corneille et mal Racine ('Paul recites Corneille well and Racine badly').
34. But not to past participles. By restricting the LR to insertion of lite adverbs into the complement list, we eschew ambiguity for the nonlite adverbs of this class (with modification or complementation), which are adjuncts to the right of the V; we also get the right facts for participles (cf. fn. 31): given that the LR fails to apply to them, participles can only combine with V-adverbs as adjuncts; when these are on the right of the past participle, they must be nonlite, given LP (55) for the hd-adj-phrase and the constraint on adjuncts in the hd-comp-adj-phrase (51b).
35. Note that this LR is not isolated if an LR adding negative adverbs to the complement list of (finite) verbs is justified (Abeillé and Godard, 1997a; Kim and Sag, 1995). Although the two rules cannot be collapsed, they belong to the same family, reinforcing each other's plausibility.
36. Some speakers accept lite adverbs after past participles, which we analyze as lite ([ADV-]) complements (a bien lu Proust vs. % a lu bien Proust). Such speakers don't have the LP constraint (75).
37. This is the case with the introduction of the feature [ADV±] in the constraint associated with the hd-comp-adj-phrase, but also with the mention of "noun" in LP7.
REFERENCES

Abeillé, A. (in press). Non elliptical coordination of non-major constituents in French. Unpublished manuscript, University of Paris 7.
Abeillé, A., and Godard, D. (1994). The complementation of tense auxiliaries in French. WCCFL 13, 157-172. Stanford: CSLI Publications.
Abeillé, A., and Godard, D. (1997a). The syntax of French negative adverbs. In D. Forget, P. Hirschbühler, F. Martineau, and M-L. Rivero (Eds.), Negation and polarity, syntax and semantics (pp. 1-27). Amsterdam: J. Benjamins.
Abeillé, A., and Godard, D. (1997b). A lexical account of quantifier floating. In A. Kathol, J-P. Koenig, and G. Webelhuth (Eds.), Lexical and constructional aspects of linguistic explanation (pp. 81-96). Stanford: CSLI.
Abeillé, A., Godard, D., Miller, P., and Sag, I. A. (1997). Bounded dependencies in French. In S. Balari and L. Dini (Eds.), Romance in HPSG (pp. 1-54). Stanford: CSLI Publications.
Abney, S. (1987). The English noun phrase in its sentential aspect. PhD dissertation, MIT, Cambridge, MA.
Bescherelle (1980). L'Art de conjuguer. Paris: Hatier.
Blinkenberg, A. (1928). L'Ordre des mots en français moderne. Copenhague: Munksgaard.
Bouma, G., Malouf, R., and Sag, I. A. (1997). Satisfying constraints on adjunction and extraction. Unpublished manuscript, Stanford University.
Bratt, E. (1990). The French causative construction in HPSG. Unpublished manuscript, Stanford University.
Cinque, G. (1997). Adverbs and functional heads: A cross-linguistic perspective. Oxford: Oxford University Press.
Copestake, A., Flickinger, D., and Sag, I. A. (1997). Minimal recursion semantics: An introduction. Unpublished manuscript, Stanford University.
Forsgren, M. (1978). La Place de l'adjectif épithète en français contemporain, étude quantitative et sémantique. Stockholm: Almqvist and Wiksell.
Frazier, L., and Fodor, J. D. (1978). The sausage machine: A new two-stage parsing model. Cognition, 6, 291-325.
Gazdar, G., Klein, E., Pullum, G., and Sag, I. A. (1985). Generalized phrase structure grammar. Cambridge: Cambridge University Press.
Grevisse, M., and Goosse, A. (1988). Le Bon usage (12th ed.). Liège: Duculot.
Gross, M. (1975). Méthodes en syntaxe. Paris: Hermann.
Kasper, R. (1994). Adjuncts in the Mittelfeld. In J. Nerbonne, K. Netter, and C. Pollard (Eds.), German in HPSG (pp. 39-69). Stanford: CSLI Publications, distrib. University of Chicago Press.
Kathol, A. (1995). Linearization-based German syntax. PhD thesis, Ohio State University.
Kayne, R. (1994).
The antisymmetry of syntax. Cambridge, MA: MIT Press.
Kim, J-B., and Sag, I. A. (1995). The parametric variation of French and English negation. WCCFL 14, 303-317. Stanford: CSLI Publications.
Longobardi, G. (1994). Reference and proper names. Linguistic Inquiry, 25(4), 609-666.
Miller, P. (1991). Clitics and constituents in phrase structure grammar. PhD thesis, Utrecht University (published by Garland, New York, 1992).
Miller, P., Pullum, G., and Zwicky, A. (1992). Le principe d'inaccessibilité de la phonologie par la syntaxe: trois contre-exemples apparents en français. Lingvisticae Investigationes, 16(2), 317-344.
Miller, P., and Sag, I. A. (1997). French clitics without clitics or movement. Natural Language and Linguistic Theory, 15, 573-639.
Pollard, C., and Sag, I. A. (1987). Information-based syntax and semantics, vol. 1. Stanford: CSLI, distrib. University of Chicago Press.
Pollard, C., and Sag, I. A. (1994). Head-driven phrase structure grammar. Stanford: CSLI and Chicago: University of Chicago Press.
Pollard, C., and Calcagno, M. (1997). Argument structure, structural case, and French causatives. Paper presented at the 4th HPSG Conference, Cornell.
Quirk, R., Greenbaum, S., Leech, G., and Svartvik, J. (1972). A grammar of contemporary English. London: Longman.
Reape, M. (1994). Domain union and word order variation in German. In J. Nerbonne, K. Netter, and C. Pollard (Eds.), German in HPSG (pp. 151-197). Stanford: CSLI Publications.
Rivero, M-L. (1992). Adverb incorporation and the syntax of adverbs in modern Greek. Linguistics and Philosophy, 15, 289-331.
Sadler, L., and Arnold, D. (1993). Premodifying adjectives in HPSG. Technical report, University of Essex.
Sadler, L., and Arnold, D. (1994). Prenominal adjectives and the phrasal/lexical distinction. Journal of Linguistics, 30, 187-226.
Sag, I. A. (1997). English relative clause constructions. Journal of Linguistics, 33(2), 431-484.
Schlyter, S. (1974). Une hiérarchie d'adverbes en français et leurs distributions. In C. Rohrer and N. Ruwet (Eds.), Actes du colloque franco-allemand de grammaire transformationnelle (vol. 2, pp. 76-86). Tübingen: Niemeyer.
Sells, P. (1994). Sub-phrasal syntax in Korean. Language Research (Seoul), 30, 351-386.
Wasow, T. (1996). Remarks on grammatical weight. Unpublished manuscript, Stanford University.
Williams, E. (1982). Another argument that passive is transformational. Linguistic Inquiry, 160-162.
Wilmet, M. (1980). Antéposition et postposition de l'épithète qualificative en français contemporain. Travaux de Linguistique, 7, 179-201.
Zubizarreta, M-L. (1985). The relation between morphophonology and morphosyntax: The case of Romance causatives. Linguistic Inquiry, 16(2), 247-289.
INDEX
A
Accidence, 37
Accomplishments, 240
Achievements, 240
Acooli (language), adjectives in, 232
Adjacency, 85-87
Adjective position, syntactic constraints on, 339-341
Adjectives
  attributive, 127
  postnominal, 341
  postnominal-only, 338
  prenominal, 339-340
  prenominal-only, 338
  prototypical, 229
Adverbs
  nonlite, 348-349
  ordering in VP, 346-352
Agrammatic aphasia, 54, 55
Agreement phenomena, 94-97
Anaphora, 204, 269, 276, 277
Aphasic breakdown, 54
Arabic
  cross-clausal wh-licensing, 285
  resumptive pronoun strategy, 277
  wh-questions, 281, 282, 290
Attributive adjective, 127
Auxiliaries
  scope restrictions on, 186-187
  subject-auxiliary inversion, 194-201
  T-licensing conditions, 312
  without lexical rules, 167-211
Axiom step, 266

B
Bare complements, 329
Basic transition rules, 266-268
Basque
  morphemes, 119
  nominalized clauses, 111-112
Breton, 303, 311-312
  clitic pronouns, 310-311
  H-internal domain and second position, 304-307
  H-internal domain and third position, 307-310
  T-licensing conditions, 313
Bulgarian, 295, 303, 304, 312, 316
C
Categorial prototypicality, 222 Categories, 133 fuzzy categories, 222-224, 242-245 Clausal constructions, with nominal properties, 101, 102, 104
361
362 Clauses, finite and nonfinite, 99 Clitic pronouns, Breton, 310-311 Clitics, 8 Closed classes, 39-40 Complementizer, 7, 10-15, 30 Completion step, 267 Construction, 149 "Contentive" expressions, 37 Copulas, 246 Correlation-only Prototypicality, 230-231 Cross-clausal w/z-licensing, 285 Crossover, 253-254, 272 relative clauses and, 273-278 Crossover restriction, 272-278 Cut-off Point Prototypicality, 230-233 Czech, 299-300, 312-316
D
Declarative units, 263 Deep anaphora, 204 Deontic modality, 213 Derived nominal, 127 Determiner, 7, 22-24, 30 Diachrony, 48-49 Dimensions, 264 Direct Mapping Prototypicality, 230 Discrepancy (TODO), 264 Diyari (language), 68 DLC Feature Introduction Conditions (DLC-FTC), 144 Dual lexical category, 144 Dutch, nonroot clauses, 299 Dynamic modality, 213
E
E-categories, defining, 60-63 Ecological niche, 149 E-language, 39, 57-59, 71 Elimination step, 267 Ellipsis extraction and, 209-211 postauxiliary, 204-211, 218 Ellipsis constraint, formulating, 206-209 Embedded LVM, 299-300 Endocentric headed phrases, 149 Endocentricity, 140 English attributive adjective, 127 auxiliaries without lexical rules, 167-211 idiom chunks, 238-240 measure verbs, 234 nominal phrase, 101, 104-105 poss-ing construction, 101, 104-106, 108, 118-119, 120, 123, 127 progressive aspect, 231-232 pronouns, 277-278 verbal alternations, 233 Epistemic modality, 213 E-projections, 64-66 Exocentric nonheaded phrases, 149 Extraction, ellipsis and, 209-211
F
Fake NP Squish, 224, 235, 238 Feature checking LF interface and, 80-82 morphology and, 82-85 under adjacency and VSO clause structure, 79-98 Feature dimensions, of task state, 264 fin-head-subj-cx constructions, 151 Finite clauses, 99 Fluent aphasia, 54 Free order, among phrasal complements, 327-328 French pre- or postnominal adjectives, 338 word order and lexical weight, 325-354 Functional categories, 7-9, 37 FUNCTIONAL categories, 66-70 Functional expressions, 38 categorizing, 59-70 characterizing, 39-51 "Functional-lexical" distinction, 37-39, 49-51 categorizing functional expressions, 59-70 characterizing functional expressions, 39-51 closed versus open, 39-40 homonymy, 48-49 I-language and E-language, 39, 57-59, 71 phonology, 40-41 polysemy, 48-49 psycholinguistic evidence, 51-59 semantics, 44-48 syntax, 41-44
Functional Parameterization Hypothesis, 8 Functional Word Category (FWC), 9 classes of function words, 25-28 as closed classes, 28-30 Fuzzy categories, 222-224, 242-245
G Gaelic, subject positions in Scottish Gaelic, 79-80, 87-97 GAP ADJUNCTION rule, 280 Gap Resolution process, 277 Generalized Phrase Structure Grammar (GPSG), 73, 123 Georgian, Determiner plus finite clause sequences, 115-116 German nonroot clauses, 299 wh-expression, 282 Goal, 264 Goal step, 266 Grammar, without Functional Word Categories (FWC), 30 Greek, Determiner plus finite clause sequences, 114-115
H
Head-adjunct phrases, 150 head-comp-cx construction, 155 Head-driven Phrase Structure Grammar (HPSG), 8, 124, 167, 246 auxiliary constructions in, 168-175 verbal gerunds as mixed categories in, 133-163 Headed phrases, 149-150 Head Feature Convention (HFC), 141 Head Feature Principle, 150, 171 Head-filler, 150 Head-Final Filter (HFF), 345 Head-nexus phrases, 150 Head/specifier constructions, 150 Hebrew, agreement effects in, 94 Hindi, wh-expression, 282 H-internal domain, Breton, 304-310 Homonymy, 48-49 HPSG. See Head-driven Phrase Structure Grammar
I
I-categories, 63-64 FUNCTIONAL, 66-70 Icelandic, nonroot clauses, 299 Idiom chunks, English, 238-240 I-language, 39, 57-59, 71 Inflection, 240-242 Inheritance hierarchy, 188-193 Introduction step, 267 Inversion, subject-auxiliary inversion, 194-201 Iraqi Arabic cross-clausal wh-licensing, 285 wh-questions, 281, 282 Irish agreement effects in, 94 progressive construction, 120-121 subject positions in, 79-80, 87-97
K
Kabardian (language), 116 Kleene star operators, 262 Korean, word-order restrictions, 326, 330
L
Labeled deductive system, 257-260 Label predicates, 261 Language acquisition/breakdown, functional-lexical distinction, 53-55 LEX feature, 346 Lexical categories, 37 Lexical expressions, 38 Lexically restricted scope restrictions, 188 Lexical weight, and word order, in French, 325-354 LF interface, feature checking and, 80-82 Linear Precedence Constraints, 201, 343-344 LINK introduction rule, 275 Lite complements, before nonlite complements, 327-328 Liteness, in phrasal descriptions, 333-334 Lite V-adverbs, 349-352 Logical language, 260-264 Long Head Movement, 319-320 Long Verb Movement (LVM) languages Breton and Slavic languages, 295-319
differences in, 303-304 nonroot clauses, 299-300 root clauses, 297 V2 languages and, 299-303 LP constraints, 353
M Masdar clause, Tabasaran language, 109-110 Measure verbs, 234 Minimal Recursion Semantics, 180 Mixed categories, verbal gerunds as in HPSG, 133-163 Mixed extended projections, 101-127 Mode of combination, 267 Morphemes, 119 Morphology, feature checking and, 82-85 Morphosyntactic mismatches, adjacency and, 85-87 Multiple-inheritance hierarchy, 149 Multiple wh-expressions, 282-283 Multiple wh-structures, 255
N Narrow scope negation, 185 Natural language, labeled deductive system, 257-260 near, 242-243 Negation, 175-194 narrow scope negation, 185 not, 175-184 Nominal agreement morphology, 108 Nominal gerund, 134 Nominalized clauses Basque, 111-112 Turkish, 101, 106-108, 120, 122, 129 Nominalized morpheme, Turkish, 119 Nominal phrase, English, 101, 104-105 Nonaffixal function expressions, 40 nonfin-head-subj-cx constructions, 151, 153 Nonfinite clauses, 99 Nonfluent aphasia, 54 Nonlite adverbs, 348-349 Nonlite complements, 326 lite complements before, 327-328 Nonprototypical verbs, 229 Nonroot clauses, 299-300
Nontheta-assigning categories, 37 not accounting for, 179-184 distribution of, 175-179 Nouniness Squish, 225, 243-245 noun-poss-cx construction, 153 Nouns pronoun as, 15-20 prototypical, 226, 228, 229 verbal gerunds as, 135-136 Null-licensing, 140
O
Obliqueness hierarchy, 201 Old Spanish, embedded LVM, 299-300 Open classes, 39-40 Optimality Theory (OT), 103
P Paradigmatic complexity, prototypicality and, 228-242 Parse, 264 Parse state, 264 Partial wh-movement, 255-256, 283-285 Passives, 234 PF licensing system, Slavic languages, 312-318 Phonology, 40-41 Phrasal complements, free order among, 327-328 Phrasal descriptions, liteness in, 333-334 Pied piping, 156-162, 244 Polish, 295, 315 copula, 246 Determiner plus finite clause sequences, 112-114 parametric variation, 316-318 PF licensing system, 312 Polysemy, 48-49 Position categories, 8, 9 poss-ing construction, 101, 104-106, 108, 118-119, 120, 123, 127 Postauxiliary ellipsis, 204-211, 218 Postnominal adjectives, 341 Postnominal-only adjectives, 338 Prediction step, 267 Prenominal adjectives, 339-340
Prenominal-only adjectives, 338 Prepositional complementizer, 14 Priming, 52 Principles and Parameters theory (P&P), 102-127 Processes, 240 Progressive aspect, English, 231-232 Pronouns, 277-278 as nouns, 15-20 Prosodic Inversion/Morphological Merger, 320 Prototype theory, 222, 224 syntactic categories and, 226-228 Prototypical adjective, 229 Prototypicality, 228 categorial, 222 Correlation-only Prototypicality, 230-231 Direct Mapping Prototypicality, 230 explanations for, 233-242 inflectional possibilities and, 240 paradigmatic complexity and, 228-242 Strong Cut-off Point Prototypicality, 230, 232 Weak Cut-off Point Prototypicality, 230, 232 Prototypical noun, 226, 228, 229 Prototypical verb, 226-227, 229, 232 Pseudogapping, 217, 218
R
Relative clauses, crossover and, 273-278 Result (DONE), 264 Root clauses, LVM and V2 languages, 297-299 Rotuman (language), adjectives in, 232
S
Satisfied Tasks, 265 Scanning step, 266 Scope restrictions on auxiliaries, 186-187 lexically restricted, 188 Scottish Gaelic, subject positions in, 79-80, 87-97 Second-position effects, Breton, 304-307 Semantics, "functional-lexical" distinction, 44-48 Semantics Principle, 170 Serbo-Croatian, 299-300, 312 Signs, 148
Slavic languages parametric variation, 316-318 PF licensing system, 312-318 Slovak, 312, 315 Spanish, nominal functional category, 110-111 Squishes, 224-226 States, 240 Strong Cut-off Point Prototypicality, 230, 232 Strong lexicalism, 140 Subcategorization, 20 Subject-auxiliary inversion, 194-201 Subject positions, in Irish and Scottish Gaelic, 79-80, 87-97 Subordination, 267 Subset Principle, 198 Substance, 37 Subword categories, 8 Surface anaphora, 204 Syntactic categories, 221 nonprototypical members of, 229 prototype theory and, 226-228 Syntax, "functional-lexical" distinction, 41-44
T Tabasaran (language), masdar clause, 109-110 Task Declaration, 264-265 Tasks in Progress, 265 Task state, feature dimensions of, 264 TFSs. See Typed feature structures there, as nonprototypical NP, 235 there clauses, 204-206 Theta-assigning categories, 37 Third-position effects, Breton, 307-310 T-licensing conditions, 312 Transcategorical construction, 133 Transformational grammar, 38, 58 Transition rules, 266-268 Transitions, 240 Tree configuration, underspecification of, 269-270 Tree Node, 264 Tree relations, 261 Turkish morphemes, 119 nominalized clauses, 101, 106-108, 120, 122, 128 Weak Cut-off Point Prototypicality, 232-233 Typed feature structures (TFSs), 148
V
V2 languages, 299-303 nonroot clauses, 299-300 root clauses, 297-299 Valence phrases, 150 Valence Principle, 171 Valency, irrelevance to classification, 20-22 Verbal alternations, English, 233 Verbal gerund phrases, subtypes of, 137-140 Verbal gerunds, 134 as mixed categories in HPSG, 133-163 as nouns, 135-136 as verbs, 136-137 Verbal prototypicality, 232 Verb-noun clause, 109-110 Verbs measure verbs, 234 nonprototypical, 229 passivizing, 234 prototypical, 226-227, 229, 232 T-licensing conditions, 312 verbal gerunds as, 136-137 VSO clause structure, 87-88
W
Weak Cut-off Point Prototypicality, 230, 232 Turkish, 232-233 Welsh, 247, 303 wh-expressions, 281 multiple, 282-283 scopal properties in parallel with quantifying expressions, 252-253 wh-in situ, 254, 279-282 wh-questions, Arabic, 281, 282, 290 Word category, 8 Word order, and lexical weight, in French, 325-354 Word priming, 52
Y
Yiddish, nonroot clauses, 299
SYNTAX AND SEMANTICS
Volume 1 edited by John P. Kimball Volume 2 edited by John P. Kimball Volume 3: Speech Acts edited by Peter Cole and Jerry L. Morgan Volume 4 edited by John P. Kimball Volume 5: Japanese Generative Grammar edited by Masayoshi Shibatani Volume 6: The Grammar of Causative Constructions edited by Masayoshi Shibatani Volume 7: Notes from the Linguistic Underground edited by James D. McCawley Volume 8: Grammatical Relations edited by Peter Cole and Jerrold M. Sadock Volume 9: Pragmatics edited by Peter Cole Volume 10: Selections from the Third Groningen Round Table edited by Frank Heny and Helmut S. Schnelle Volume 11: Presupposition edited by Choon-Kyu Oh and David S. Dinneen Volume 12: Discourse and Syntax edited by Talmy Givon Volume 13: Current Approaches to Syntax edited by Edith A. Moravcsik and Jessica R. Wirth
Volume 14: Tense and Aspect edited by Philip J. Tedeschi and Annie Zaenen Volume 15: Studies in Transitivity edited by Paul J. Hopper and Sandra A. Thompson Volume 16: The Syntax of Native American Languages edited by Eung-Do Cook and Donna B. Gerdts Volume 17: Composite Predicates in English Ray Cattell Volume 18: Diachronic Syntax: The Kartvelian Case Alice C. Harris Volume 19: The Syntax of Pronominal Clitics edited by Hagit Borer Volume 20: Discontinuous Constituency edited by Geoffrey J. Huck and Almerindo E. Ojeda Volume 21: Thematic Relations edited by Wendy Wilkins Volume 22: Structure and Case Marking in Japanese Shigeru Miyagawa Volume 23: The Syntax of the Modern Celtic Languages edited by Randall Hendrick Volume 24: Modern Icelandic Syntax edited by Joan Maling and Annie Zaenen Volume 25: Perspectives on Phrase Structure: Heads and Licensing edited by Susan D. Rothstein Volume 26: Syntax and the Lexicon edited by Tim Stowell and Eric Wehrli Volume 27: The Syntactic Structure of Hungarian edited by Ferenc Kiefer and Katalin E. Kiss Volume 28: Small Clauses edited by Anna Cardinaletti and Maria Teresa Guasti Volume 29: The Limits of Syntax edited by Peter Culicover and Louise McNally Volume 30: Complex Predicates in Nonderivational Syntax edited by Erhard Hinrichs, Andreas Kathol, and Tsuneko Nakazawa Volume 31: Sentence Processing: A Crosslinguistic Perspective edited by Dieter Hillert Volume 32: The Nature and Function of Syntactic Categories edited by Robert D. Borsley