Multifactorial Analysis in Corpus Linguistics
Open Linguistics Series
Series Editor Robin Fawcett, University of Cardiff

This series is 'open' in two related ways. First, it is not confined to works associated with any one school of linguistics. For almost two decades the series has played a significant role in establishing and maintaining the present climate of 'openness' in linguistics, and we intend to maintain this tradition. However, we particularly welcome works which explore the nature and use of language through modelling its potential for use in social contexts, or through a cognitive model of language - or indeed a combination of the two.

The series is also 'open' in the sense that it welcomes works that open out 'core' linguistics in various ways: to give a central place to the description of natural texts and the use of corpora; to encompass discourse 'above the sentence'; to relate language to other semiotic systems; to apply linguistics in fields such as education, language pathology and law; and to explore the areas that lie between linguistics and its neighbouring disciplines such as semiotics, psychology, sociology, philosophy, and cultural and literary studies.

Continuum also publishes a series that offers a forum for primarily functional descriptions of languages or parts of languages - Functional Descriptions of Language. Relations between linguistics and computing are covered in the Communication in Artificial Intelligence series. Two series, Advances in Applied Linguistics and Communication in Public Life, publish books in applied linguistics and the series Modern Pragmatics in Theory and Practice publishes both social and cognitive perspectives on the making of meaning in language use. We also publish a range of introductory textbooks on topics in linguistics, semiotics and deaf studies.

Recent titles in the series
Culturally Speaking: Managing Rapport through Talk across Cultures, Helen Spencer-Oatey (ed.)
Educating Eve: The 'Language Instinct' Debate, Geoffrey Sampson
Empirical Linguistics, Geoffrey Sampson
Genre and Institutions: Social Processes in the Workplace and School, Frances Christie and J. R. Martin (eds.)
Pedagogy and the Shaping of Consciousness: Linguistic and Social Processes, Frances Christie (ed.)
Words, Meaning and Vocabulary: An Introduction to Modern English Lexicology, Howard Jackson and Etienne Zé Amvela
Syntactic Analysis and Description: A Constructional Approach, David G. Lockwood
Relations and Functions within and around Language, Peter H. Fries, Michael Cummings, David G. Lockwood and William Spruiell (eds.)
Classroom Discourse Analysis: A Functional Perspective, Frances Christie
Working with Discourse: Meaning Beyond the Clause, J. R. Martin and David Rose
Multifactorial Analysis in Corpus Linguistics A Study of Particle Placement
Stefan Thomas Gries
Continuum
The Tower Building, 11 York Road, London, SE1 7NX
370 Lexington Avenue, New York, NY 10017-6503

This edition published by Continuum 2003
© Stefan Thomas Gries 2003

This work has been accepted as the author's PhD dissertation by the Faculty of Language Sciences, University of Hamburg, on 27 September 2000; supervisors: Klaus-Uwe Panther and Günter Radden.

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage or retrieval system, without prior permission in writing from the publisher.

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
ISBN 0-8264-6126-3 (hardback)

Library of Congress Cataloging-in-Publication Data
Gries, Stefan Thomas, 1970-
Multifactorial analysis in corpus linguistics : a study of particle placement / Stefan Thomas Gries.
p. cm. (Open linguistics series)
Originally presented as the author's PhD thesis - University of Hamburg, 2000.
Includes bibliographical references and index.
ISBN 0-8264-6126-3
1. English language - Particles. 2. English language - Word order. 3. Corpus linguistics. I. Title. II. Series.
PE1321 .G74 2002
425 - dc21
2002071250

Typeset by RefineCatch Limited, Bungay, Suffolk
Printed and bound in Great Britain by MPG Books Ltd, Bodmin, Cornwall
Contents
List of figures
List of tables
Preface
List of abbreviations

1 Introduction
  1.1 The scope of the study
  1.2 The diachronic development of phrasal verbs
  1.3 Theoretical assumptions
  1.4 Outline of the study
2 Review of literature
  2.1 Phonological variables
  2.2 Morphosyntactic variables
  2.3 Semantic variables
  2.4 Discourse-functional variables
  2.5 Other variables
  2.6 Interim summary and critical evaluation
3 Objectives of this study
4 Key notions and hypotheses
  4.1 Discourse-functional variables
  4.2 Semantic variables
  4.3 Morphosyntactic variables
  4.4 Phonological variables
  4.5 Remaining variables
  4.6 Interim summary
5 The data
  5.1 Origin of the corpus data
  5.2 Treatment of the corpus data
6 Results and discussion
  6.1 Monofactorial results
  6.2 Pair-wise comparisons: the relative strengths of variables
  6.3 Multifactorial results
  6.4 Further evaluation
7 General discussion
  7.1 Prototypes
  7.2 Variability and grammar
  7.3 Competing approaches to syntactic variation
8 The activation of constructions
  8.1 Theoretical introduction
  8.2 The relation of variables to activation
  8.3 A network of variables and weighted (causal) relations
  8.4 Interim summary
9 Conclusion and outlook
  9.1 Summary
  9.2 Outlook: implications and extensions
10 Appendices
  10.1 List of variables
  10.2 Register-dependent interaction plots
  10.3 List of TPVs
11 References

Subject index
Author index
Figures
3:1 A piece of discourse up to the point of decision for a construction
4:1 Concepts and their activation cost for the hearer in a communication situation
4:2 Determinants of processing effort and particle placement
6:1 Interaction plot: construction × REGISTER × COMPLEX
6:2 Importance of predictor variables for CART
6:3 Distribution of construction predictions relative to kinds of direct objects
7:1 Discriminant scores of sentences in relation to the prediction accuracy
7:2 Determinants of processing effort and particle placement
7:3 Possible explanation of Hawkins's findings 1
7:4 Possible explanation of Hawkins's findings 2
7:5 Possible explanation of Hawkins's findings 3
8:1 Step 1 of the generation of the utterance John picked up books/books up
8:2 Step 2 of the generation of John picked up books/books up / John lifted books
8:3 Step 3 of the generation of the utterance John picked up books/books up
8:4 Step 4 of the generation of the utterance John picked up books/books up
8:5 Step 5 of the generation of the utterance John picked books up
8:6 Step 5 of the generation of the utterance John picked books up
8:7 A network of variables with intercorrelations/association strengths
8:8 A subpart of the proposed causal activation network
10:1 Interaction plot: construction × REGISTER × COMPLEX
10:2 Interaction plot: construction × REGISTER × LENGTHW
10:3 Interaction plot: construction × REGISTER × LENGTHS
10:4 Interaction plot: construction × REGISTER × TYPE
10:5 Interaction plot: construction × REGISTER × DET
10:6 Interaction plot: construction × REGISTER × IDIOMATICITY
10:7 Interaction plot: construction × REGISTER × CONCRETE
10:8 Interaction plot: construction × REGISTER × ANIMACY
10:9 Interaction plot: construction × REGISTER × LM
10:10 Interaction plot: construction × REGISTER × ActPC
10:11 Interaction plot: construction × REGISTER × TOPM
10:12 Interaction plot: construction × REGISTER × CohPC
10:13 Interaction plot: construction × REGISTER × NM
10:14 Interaction plot: construction × REGISTER × ClusSC
10:15 Interaction plot: construction × REGISTER × TOSM
10:16 Interaction plot: construction × REGISTER × CohSC
10:17 Interaction plot: construction × REGISTER × OM
10:18 Interaction plot: construction × REGISTER × PP
Tables
1:1 Comparison of particle positions in different language stages
2:1 Entrenchment hierarchy of Gries (1999: 122-6), based on Deane (1987, 1992)
2:2 Variables that are argued to contribute to particle placement
2:3 Partial results of Peters (1999): absolute frequencies and column percentages
2:4 The entrenchment hierarchy and its constitutive subparts
2:5 Fictitious analysis of the complexity of the direct object 1
2:6 Fictitious analysis of the complexity of the direct object 2
2:7 Fictitious analysis of the complexity of the direct object 3
2:8 Fictitious analysis of the complexity of the direct object 4
5:1 Distribution of the 403 sample sentences
5:2 Encoding of NP Type of the Direct Object (TYPE)
5:3 Encoding of Determiner of the Direct Object (DET)
5:4 Encoding of Complexity of the Direct Object (COMPLEX)
5:5 Example sentence with context from the British National Corpus (File: A91)
6:1 Observed distribution of constructions relative to COMPLEX
6:2 Expected distribution of constructions relative to COMPLEX
6:3 Contributions to Chi-square for the distribution in Table 6:1
6:4 Distribution of constructions relative to LENGTHW
6:5 Distribution of constructions relative to LENGTHS
6:6 Distribution of constructions relative to TYPE
6:7 Distribution of constructions relative to DET
6:8 Distribution of constructions relative to IDIOMATICITY
6:9 Distribution of constructions relative to CONCRETE
6:10 Distribution of constructions relative to ANIMACY
6:11 Distribution of constructions relative to LM
6:12 Distribution of constructions relative to ActPC (DTLM)
6:13 Distribution of constructions relative to TOPM
6:14 Distribution of constructions relative to CohPC
6:15 Distribution of constructions relative to NM
6:16 Distribution of constructions relative to ClusSC (DTNM)
6:17 Distribution of constructions relative to TOSM
6:18 Distribution of constructions relative to CohSC
6:19 Distribution of constructions relative to OM
6:20 Distribution of constructions relative to PP
6:21 Distribution of constructions relative to PART = PREP
6:22 Observed distribution of constructions relative to disfluency (DISFLUENCY)
6:23 Distribution of constructions relative to the register (REGISTER)
6:24 Correlational strength of each variable
6:25 Variables and values/levels to be contrasted
6:26 Division of variables into two classes (dichotomization)
6:27 Distribution of constructions with COMPLEX: simple and DET: indefinite
6:28 Distribution of constructions relative to COMPLEX and LM
6:29 Strength of levels of variables in terms of construction distributions
6:30 Overall results of the first discriminant analysis
6:31 Factor loadings of the discriminant analysis for all variables
6:32 Prediction accuracies of three analyses
6:33 Overall results of the discriminant analysis for the Processing Hypothesis
6:34 Factor loadings of the discriminant analysis for the Processing Hypothesis
6:35 Classification/prediction accuracies of three analyses
6:36 Parameters and settings of the CART analysis
6:37 Cross-validated prediction accuracies of CART for split samples
6:38 Differences of average values for correct and false predictions
6:39 The effect of structural priming on particle placement
6:40 Comparison of chosen vs. non-chosen constructions (segment transitions)
7:1 Hawkins's results (1994: 181) concerning particle placement in English
8:1 Structural equation modelling results
Preface
This book grew out of my doctoral dissertation submitted to the Faculty of Language Sciences of the University of Hamburg, Germany, in April 2000. While I would like to express my gratitude to all members of the committee, some of them definitely deserve to be singled out. I thank Klaus-Uwe Panther, my main advisor, and Günter Radden for granting me the enormous intellectual freedom to approach my subject of interest from the quantitative perspective I have gradually been adopting in my research. I am also grateful to Matthias Burisch for spending so much time discussing statistical issues with me. A special word of thanks is due to Thomas Berg, whose influence, especially on the revision of the work for publication, has been tremendous. Last but not least, I am grateful to Robin Fawcett, the editor of this series, for his feedback and advice. From many other people whose feedback has contributed to this work, I would like to single out two and thank Anatol Stefanowitsch and Viola L'Hommedieu for many discussions and support during the completion of this project. I would also like to express my heartfelt appreciation to my colleagues in the IFKI at the University of Southern Denmark at Sønderborg. They created a working atmosphere that made it easy for me to devote so much effort to this work. The financial support of the department enabled me to travel a lot to broaden my horizons and get valuable feedback on various parts of this study. Finally, I thank Stefanie Wulff for her patience, her loyalty and her support during all of the stages of my work.
List of abbreviations
TPV                 transitive phrasal verb
VPC                 verb-particle construction
COMPLEX             Complexity of the direct object of the VPC
LENGTHS / LENGTHW   Length of the direct object in syllables/words
TYPE                NP type of the direct object of the VPC
DET                 Determiner of the direct object of the VPC
CONCRETE            Concreteness of the referent of the direct object of the VPC
LM                  Last mention of the referent of the direct object of the VPC
ActPC (DTLM)        Activation of the referent of the direct object of the VPC due to the preceding context (distance to last mention)
TOPM                Times of mention of the referent of the direct object of the VPC in the preceding context
CohPC               Coherence of the referent of the direct object of the VPC to the preceding context
NM                  Next mention of the referent of the direct object of the VPC
ClusSC (DTNM)       Clustering of the referent of the direct object of the VPC to the following context (distance to next mention)
TOSM                Times of mention of the referent of the direct object of the VPC in the subsequent context
CohSC               Coherence of the referent of the direct object of the VPC to the subsequent context
OM                  Times of mention of the referent of the direct object of the VPC in a window of 10 clauses before and after the VPC
PP                  Presence of a directional adverbial/PP after the VPC
PART = PREP         The particle of the VPC is identical to the preposition of a following PP
CART                Classification and Regression Trees
IAM                 Interactive Activation Model
EIC ratio           Early-Immediate-Constituent Ratio
1 Introduction
1.1 The scope of the study
A phenomenon that has attracted considerable interest in linguistics over the last decades is the existence of what Lambrecht (1994: 6) has, following Daneš (1966), referred to as allo-sentences, that is 'semantically equivalent but formally and pragmatically divergent sentence pairs'. Frequently, the phenomenon in question has also been referred to as grammatical variation (cf. Rohdenburg and Mondorf, 2003), syntactic variation (Altenberg 1982: 11-12; Bolkestein and Risselada 1987: 497-8) and configurational or permutational variation (Stucky 1987: 377-8). Depending on the nature of the formal divergence between the sentence pairs, one can distinguish instances of syntactic variation where the formal divergence results from a different linear arrangement of the constituents only such as Topicalization or Left-Dislocation (e.g. I don't like your elder sister vs. Your elder sister, I don't like) from those instances where the divergence results from a change of both specific constituents in the sentence pairs and their linear arrangement such as the Dative Alternation (e.g. He gave the book to the man vs. He gave the man the book) or the English genitive (e.g. the book of my father vs. my father's book). In this study, I will investigate a particular instance of syntactic variation of the first type, namely the word order alternation that is possible for a large group of multi-word verbs in English. Consider (1).
(1) a. Fred picked up the book.
    b. Fred picked the book up.
In (1), the alternation takes place within a verb phrase consisting of at least:
• a verb (transitive or intransitive);
• a morphologically invariant word, which will be referred to as a particle;
• a direct object noun phrase.
First of all, we need to distinguish the constructions in (1) from a superficially similar type of construction, exemplified in (2).
(2) Fred went into the forest.
Closer inspection reveals that (1a) and (2) are, in spite of their superficial similarity, in fact quite dissimilar. First, and most importantly for the present work, (2) does not allow the word order alternation this study seeks to explain:
(3) *Fred went the forest into.
Second, if the direct object is unstressed and pronominal, then the sentences of the type in (1) require the direct object to be positioned immediately after the verb (cf. (4)) whereas the sentences of the type in (2) require a VP-final position of such direct objects (cf. (5)).
(4) a. *Fred picked up it.
    b. Fred picked it up.
(5) a. Fred went into it.
    b. *Fred went it into.
Third, sentence-final particles in sentences of the first type are generally stressed (cf. (6)), whereas sentence-final particles in sentences of the second type normally bear no stress (cf. (7)).
(6) a. What did Fred pick UP?2
    b. ??What did Fred pick up?
(7) a. ??What did Fred go INTO?
    b. What did Fred go into?
Finally, particles of the kind in (1) must not be positioned sentence-initially for question formation (cf. (8)), while this is possible for particles of the kind in (2) (cf. (9)):
(8) *Up what did Fred pick?
(9) Into what did Fred go?
The substantial differences between the constructions in (1a) and (2) have been taken as indicative of the fact that what we have so far called particle does in fact not constitute a single word class - rather, it has been suggested that the particle in the sentence type in (1) is an adverb whereas the particle in (2) is a preposition. Correspondingly, what we have informally termed multi-word verbs so far is not taken to be a homogenous class either: verbs in sentences of the type in (1) (where the particle is an adverb) are commonly referred to as transitive phrasal verbs (TPVs) whereas verbs in sentences of the type in (2) (where the particle is a preposition) are commonly referred to as intransitive prepositional verbs.4 The main focus in the present work, however, is not to devise any new classificatory schemes or constructional tests such as those discussed above - I will focus on the variables that govern the word order alternation in (1), which has (even outside the Transformational-Generative paradigm) frequently been referred to as Particle Movement. In this study, I will use the theoretically more neutral term particle placement since (i) the theoretical foundations of the numerous stages of Transformational-Generative Grammar do not correspond to those of the present work (cf. section 1.3 below), and (ii) even within the Transformational-Generative paradigm, there is still considerable disagreement as to which element is being moved (particle or object or even both) and which of the two structures is basic or derived (cf., e.g., Legum 1968). Somewhat surprisingly, in more than 100 years of literature on particle placement no generally accepted names for the two possible constructions have been coined. The word order where the particle is positioned directly next to the verb will be referred to as construction₀; the word order where the particle follows the direct object will be referred to as construction₁.
(10) a. Fred picked up the book. = construction₀
     b. Fred picked the book up. = construction₁
As a cover term for both constructions, I will use the term verb-particle construction (VPC). Before we continue by outlining the theoretical assumptions on which this study is based, it is worth briefly discussing the history of TPVs in order to provide at least a cursory glance at the development of the word order alternation under consideration.
1.2 The diachronic development of phrasal verbs
In Old English (OE), the form of TPVs was different from that of present-day Modern English (ModE). Originally, what are now transitive phrasal verbs and intransitive prepositional verbs were characterized in OE by prefixation of the particle to the verb in the vast majority of cases: for instance, the ModE verb to go out was utgan in OE. In an early stage of OE (as represented by early law codes dated c.600),
the particle could not be separated from the verb (in this respect, this early stage of OE is similar to, say, fourth-century Gothic and Latin) and, according to Milliard (quoted in von Schon 1977: 10), the meanings of these verbs were restricted to literal, i.e. spatial, senses. According to Shlukhtenko (1955: 112), the complex verbs with the prefixed particle commonly bore stress on the first syllable of the verbal root, emphasizing the 'semantic character' of the verb and rendering the remaining parts of the verb less prominent (both prosodically and semantically). The inflectional suffixes of OE tended to lose stress as they were positioned word-finally. However, prefixes expressing spatial relations were invariably stressed and 'the stress of the spatial prefix was inevitably bound to enter into contradiction with the stress of the root'. Since this contradiction was fundamental, in both English and German, the bond between the verb and the prefixed particle loosened in two stages, which, taken together, have been called the First Particle Shift:
• in a first developmental step, prefixed particles with a spatial meaning gradually began to detach themselves from the root: the particle could be separated from the verb but could only occur before the verb stem of the complex verb;
• in the second stage, the particle could be separated from the root and also follow the finite verb - however, particles could not follow non-finite verb stems.
After this First Particle Shift, the OE structure of separable complex verbs was similar to that of Modern German as is shown in the first two rows of Table 1:1. From then on the particle was in late OE prose more and more frequently positioned behind the verb (especially, of course, in main clauses since these have finite verbs - in dependent clauses, pre-verbal position was used for a longer period of time). However, the development did not stop with post-position of particles behind finite verb stems because the separation of prefixed particle and verb did not remove the contradiction of two necessarily stressed items immediately following one another. Thus, in a second step a transition from prefixation to postpositional dislocation of verbal prefixes took place so that only a few centuries later (c.900), the particles could also be positioned behind non-finite verb forms. This so-called Second Particle Shift occurred in English and Norse, but not in German or Dutch, which is why the structure of non-finite phrasal verb forms in ModE differs from both OE and that of the other related languages, cf. the fourth row in Table 1:1. This development involved some type of stylistic variation of the two possible arrangements (pre- and post-verbal) and correlated with several other contemporaneous changes:
• the general decline of verbal prefixes (cf. Hiltunen 1983: 145);
• the overall tendency of English to develop into an analytic language (cf. Hiltunen 1983: 144; Kiffer 1965: 36, 135; Martin 1990/2, 11);
• the influence of Scandinavian languages;
• the introduction of so-called Echo Particles (i.e. a free adverb supplementing the semantic impact of the semantically weakened prefixed particle; cf. von Schon 1977: 32, 230-1);
• the requirement to position the heavily stressed adverbial sentence-finally in order to achieve end-focus (cf. Curme 1931; Shlukhtenko 1955).

Table 1:1 Comparison of particle positions in different language stages
Language          Finite forms (verb-second)   Non-finite forms
Old English       he eode ut                   utgan
Modern German     er ging aus                  ausgehen
Modern Dutch      uitging                      uitgaan
Modern English    he went out                  to go out

The two Particle Shifts were completed rapidly in early ME by c. 1200 to c. 1300 when the post-verbal position of the particle prevailed (cf. Hiltunen 1983: 6; von Schon 1977: 228-31). About that time, the complex verbs (e.g. bringan up and giefan up) could also take on non-spatial meanings (cf. Hiltunen 1983: 223). Until and during Shakespeare's time, phrasal verbs were frequently used in both colloquial and written language. Then, during the period from Milton to Johnson, they suffered a partial eclipse, but from 1800 onwards they were used more often and more creatively. The earliest examples of particles not only immediately following the verb but also following material intervening between verb and particle (i.e. the first cases of the alternation investigated here) can be found in texts from Alfred the Great (†899) and Abbot Ælfric († about 1020). In these cases, the particle was regularly a directional adverb, more specifically an adverb denoting a point of the compass. Later on (e.g. in Ancrene Riwle of 1240, reported by Kiffer 1965: 50), however, other particles could also occur clause-finally in less literal cases, but these cases were by far outnumbered by those where the particle followed the verb directly.
1.3 Theoretical assumptions
This study incorporates a multitude of accounts of particle placement in the last hundred years. Apart from these specific works addressing the linguistic problem of particle placement, I also rely on a variety of frameworks as a basis for my methodology7 and analysis.
Since the methodological approach to particle placement advocated in this study is radically different from most others (at least in some respect), it is necessary to illustrate the most essential assumptions this study relies on. I assume, following recent approaches within the framework of Cognitive Linguistics, that language is a cognitive faculty closely interacting with other cognitive faculties. By this I mean that language is not taken to constitute an arbitrary, autonomous and fully predictably rule-governed module; I would rather argue that language is based on the same cognitive and perceptual mechanisms that seem to underlie many cognitive capabilities. For instance, allocation of memory resources, categorization and attention are three among many processes that govern or constrain most cognitive processes of which language is just one. What is more, I take language to be shaped not only by a variety of these and other cognitive processes as such (in the sense that, e.g., our memory constrains the number of possible embeddings in a single sentence), but I am also convinced that linguistic structure is to a large degree influenced (though not fully determined) by interactional processes in which language figures as a means of communication (as discussed in many approaches that may be loosely subsumed under the cover term Functionalism; cf. Croft 1995; Lakoff 1991: 54-5; Nuyts 1995: 293). In this regard, I find plausible the claim of Construction Grammar that pragmatic information may be conventionally associated with a particular linguistic form, thereby contributing to the interpretation process of a given linguistic expression (cf. Goldberg 1995: 3-4, 67; Kay 1995: 171-2; Langacker 1987: 20).
This view of language as one among other cognitive processes has important methodological consequences for the present work: first, the present study allows for cognitively/psychologically real notions such as memory limitations, attention allocation, activation levels and prototype effects, to name but a few, as explanatory parameters. More specifically, I follow Lakoff's (1991: 54) Cognitive Commitment, i.e. the 'commitment to make one's account of human language accord with what is generally known about the mind and brain from disciplines other than linguistics'. This entails that a description and explanation of linguistic phenomena with reference to cognitively real notions as in the present work must:
1 incorporate a multitude of independent variables rather than relying on monocausal explanations in order to provide an adequate account of these phenomena;
2 allow for a certain degree of inexplicable variance in the data depending to some extent on, e.g., the subjective construals (in the sense of Langacker 1987: 138-41) of scenes by human conceptualizers and speakers.
What is more, many of the cognitive processes argued to be relevant are inherently scalar in nature, which is why descriptions and explanations in probabilistic terms and with reference to non-Aristotelian (i.e. non-clear cut) categories will be the norm rather than the exception.8 The above-mentioned consequences also have important ramifications concerning the nature of the linguistic data on which my analysis is based.
Many approaches base their analyses on intuitive and introspective data (such as inventing sentences or judging their grammaticality or acceptability). This study, however, is exclusively based on naturally occurring data from a large corpus (the British National Corpus) in order to guarantee (i) that the results are based on how speakers actually put language to use and (ii) that generally acknowledged standards of scientific research such as objectivity, reliability and validity are properly met. In addition, it will soon become obvious that the multitude of variables that needs to be considered in combination with the amount of data used render it impossible to handle the mass of data without fairly sophisticated statistical procedures; while human native speakers subconsciously somehow manage to keep track of all variables in real time, because of the cognitive restrictions just mentioned above, a human analyst simply cannot cope with all the information of twenty or even more variables both simultaneously and objectively without such techniques.
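A back-of-the-envelope count gives a rough sense of why so many variables cannot be tracked by hand. The following Python sketch is an illustration only: the function name `n_combinations` and the assumption that each of twenty variables has just two levels are hypothetical simplifications, not the coding scheme of this study; the figure of 403 sentences is the sample size named in the list of tables.

```python
from math import prod


def n_combinations(levels_per_variable):
    """Return how many distinct combinations of variable levels exist.

    levels_per_variable holds one integer per variable: the number of
    levels (values) that variable can take.
    """
    return prod(levels_per_variable)


# Hypothetical simplification: twenty variables with two levels each.
twenty_binary = [2] * 20
combos = n_combinations(twenty_binary)

# Sample size named in the list of tables (Table 5:1).
sample_size = 403

print(combos)                # 1048576 possible level combinations
print(combos > sample_size)  # far more combinations than sentences
```

Even under this deliberately small assumption, the number of possible configurations exceeds the number of analysed sentences by several orders of magnitude, which is one way of seeing why statistical procedures are needed rather than inspection of individual cases.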
The overall approach of my analysis to particle placement will adopt a psycholinguistic perspective; more specifically, I am going to investigate particle placement from the perspective of online speech production. Psycholinguistic models to which my analysis will refer are the following:
1 Processing-based accounts of constituent ordering such as those by Hawkins (1991, 1994), where the resolution of word order choices is largely dependent on the processing effort associated with the different structural options.
2 Interactive activation models (IAMs) of the sort proposed by, e.g., Dell (1986).
3 The functionally oriented Competition Model by Bates and MacWhinney (1982, 1989).
A more detailed explanation of these models and their relation to particle placement will be outlined below.
1.4 Outline of the study
On a large scale, this study is organized in the same way as most empirical studies: introduction, methods, results and discussion. Chapters 2, 3 and 4 are introductory in the sense that they provide essential background information on which the study rests, main hypotheses and terminology; Chapter 5 is concerned with the data that were analysed and the methods that were put to use; Chapters 6 and 7 present the results of the study and discuss their, in some respects, far-ranging implications; Chapter 8 provides a different perspective on particle placement and complements the previous discussion. Chapter 9 provides a conclusion. Let us look at the organization of this study in more detail.
Chapter 2 discusses previous approaches to particle placement. The focus will be on every single variable that has been argued to govern particle placement. In the course of this discussion, the theoretical orientation of the analyses will be considered, but will only be topicalized to the extent that it contributes to our understanding of how this variable influences particle placement.9 The discussion is organized into levels of linguistic research: section 2.1 deals with phonological variables; section 2.2 investigates morphosyntactic variables; section 2.3 examines some semantic variables; sections 2.4 and 2.5 respectively deal with several discourse-functional variables and some other, not so easily classifiable, variables. Each of the variables will be exemplified in some detail as the understanding of every single variable is crucial to comprehending the course of analysis to be pursued in later chapters. Finally, section 2.6 summarizes and critically evaluates the (inventory of) variables proposed so far and discusses in detail several methodological shortcomings of previous analyses (most of which pertain to analyses of other cases of syntactic variation as well).
Chapter 3 discusses the main objectives (both linguistic and methodological in nature) of this work in detail. Suffice it here to mention that the linguistic goals of this study are to describe, explain and predict particle placement; the methodological goal is to show that syntactic variation should be investigated by advanced statistical techniques that have hitherto only rarely been used for cognitive-functional analyses of syntactic variation but are much more frequently used in contemporary corpus linguistics, psycholinguistics and other scientific disciplines.10
Chapter 4 proposes a hypothesis, the so-called Processing Hypothesis, in order to explain the word order alternation at hand. This hypothesis explains the choice of construction by native speakers in terms of the amount of processing cost associated with the two constructions, thereby subsuming most of the previous variables under a single notion and excluding some variables from further consideration. Although my analysis aims at being more comprehensive than previous analyses, it follows the tradition of works by Givon and Hawkins.
Chapter 5 is devoted to methodological issues. The present work is completely corpus based, which is why I start by illustrating exactly how the corpus data came to be selected and how they were analysed with respect to the aforementioned inventory of variables. For each variable, naturally occurring sentences from the corpus data will be used to exemplify how values (for interval/ordinal variables) and levels (for categorical/nominal variables) were assigned to the variables figuring in the statistical evaluation to follow.
Chapter 6 presents and discusses the results. First, in section 6.1, the most basic (i.e. monofactorial) descriptive results (correlation coefficients and cross-tabulation) are reviewed thoroughly in order to find out which variables govern the alternation when each variable is examined in isolation.
Put differently, we are concerned with the absolute strength of each variable. Second, section 6.2 investigates oppositions between variables and values/levels of variables in order to illuminate the interrelationships between variables, i.e. their respective strengths compared to one another. In other words, here, the relative strengths of variables and values/levels are dealt with. Lastly, in section 6.3, a multifactorial analysis is discussed by means of which we can (i) assess the current state of the art by answering the question 'to what extent can the alternation be explained?', (ii) shed light on the way the constructional alternation can be explained given a particular complex speech situation and (iii) predict native speakers' choices in authentic discourse situations. Although the advent of corpus linguistics and the rapid growth of other approaches such as psycholinguistics and/or computational linguistics have increased the statistical awareness of linguists, some of the statistical procedures used below are quite advanced. I therefore devote some space in each section of Chapter 6 to a brief explanation of these procedures. Chapter 7 is devoted to the discussion of some important implications pertaining to more general linguistic questions. Section 7.1 shows how
prototypical instances of both constructions can be identified objectively on an empirical basis. I also address the question of whether a prototypical VPC (comprising both word orders) can be determined. Section 7.2 discusses how the present analysis relates to other conceptually similar analyses of variability in language (though not necessarily syntactic variation) and their importance for linguistic theories. Lastly, section 7.3 addresses the question of how my analysis relates to the conflict between syntactic and discourse-functional approaches to syntactic variation. I concentrate on comparing my analysis to another, superficially similar, analysis of particle placement in terms of processing cost (Hawkins 1991, 1994).
Chapter 8 addresses a conceptual refinement or complementation of the preceding analysis in terms of processing effort as embraced by Givon (1992b) and Hawkins (1994). I propose a means to abstract away from the processing notions introduced before in order to further integrate the present findings into recent psycholinguistic interactive activation models (IAMs) such as those proposed by McClelland and Rumelhart (1982), Rumelhart and McClelland (1981), Stemberger (1985), Dell (1986) and Bates and MacWhinney (1989). It will be shown how they go beyond a processing approach and, ultimately, seem to be the best alternative.
Chapter 9 concludes the present study: I summarize the basic findings and provide an outlook on future research. Finally, Chapters 10 and 11 contain various appendices with tables and figures of factors, register-dependent results, a list of TPVs and the list of references.
Notes
1 Note that, according to the above definition of allo-sentences, this study is restricted to truth-conditionally equivalent sentence pairs - I will not consider sentence pairs where the two word orders differ in their literal meaning, e.g. The wind blew down the chimney and The wind blew the chimney down.
2 Throughout this study, capitalization in example sentences is used to mark stressed constituents.
3 Cf. O'Dowd (1998: section 2.1) for a comprehensive discussion of a variety of preposition vs. (adverbial) particle tests, which are summed up as follows:
  The general principle underlying these tests is that in certain verb-P sequences, P constructs closely with the verb, performing an adverbial function while in others it constructs with the following NP, performing a prepositional function. (O'Dowd 1998: 14-15)
4 According to Bolinger (1971: 3), this classification appears to be the one most generally accepted; cf. among others Mitchell (1958: 106); Palmer (1988: 219ff.); Quirk et al. (1985: 1152-63). However, there exists considerable terminological profusion concerning the verb class under consideration; other terminological options are: Wortverband (Carstensen 1964), verb-particle colligation (Charlton 1990), separable verbs (Francis 1958), verb-particle construction (Fraser 1965, 1966, 1971, 1976), verb-adverb compound (Kennedy 1920), compound verbs (Kruisinga and Erades 1953), discontinuous verbs (Live 1965), verb-adverb locution (Roberts 1936) and two-word-verbs (Taba 1964). It is also worth noting that sometimes (e.g. Cowie and Mackin 1975, 1993: ix) the term TPV is restricted only to those TPV constructions where the meaning is idiomatic (as in, e.g., He gave up smoking) as opposed to those instances where the meaning is literal (such as He brought the book back), although (i) it is widely accepted that there is no clear-cut boundary between idiomatic and literal meanings, and (ii) there is also disagreement as to how intermediate stages on a continuum between idiomaticity and literalness can be objectively measured. Apart from that, there is yet another similar class of sentences, namely those exemplified by He painted the house green or He talked himself unconscious. These constructions have sometimes been considered equivalent to TPVs (cf. Bolinger 1971: 67-71) and are referred to as verb-adjective constructions (Quirk et al. 1985: 1167-8). Goldberg (1995: Chapter 3, 7-8) treats them as an extension of the so-called caused-motion construction, namely resultative constructions. Although these constructions are not of major interest here, we will briefly return to them in Chapter 4.
5 It would, of course, be much nicer to have more mnemonic terms for the two constructions, which is why one could simply coin terms such as continuous construction and discontinuous construction for construction₀ and construction₁ respectively. However, the terminological option of construction₀ and construction₁ has one very important advantage: the numbers 0 and 1 have been used in the statistical calculations in the chapters to follow, and the interpretation of correlation coefficients makes it necessary to know which construction is characterized by which value; in other words, even if I used continuous construction and discontinuous construction for the introductory parts of this study, the reader would have to know which construction is encoded by 0 and which by 1 for Chapters 6 and 7 anyway. Therefore, the terms construction₀ and construction₁ will be used throughout this study. (Note that the assignment of the values does not have any deeper conceptual import and could just as well have been the other way round. As a mnemonic help, note that the subscript in the construction's name provides the number of constituents between the verb and the particle.)
6 We will briefly return to diachronic aspects of VPCs in section 7.1 below.
7 This pattern is still present in only a limited number of ModE verbs with either a literal (e.g. to upraise and to uplift) or a more idiomatic meaning (i.e. one where the usual spatial meaning of the particle is not fully present in the meaning of the verb such as to forgive, to mislead and to understand). There are even combinations of verb and particle that exist in both orders in ModE, but the two combinations have different meanings: to overtake does not mean the same as to take over.
8 As early as 1917, Deutschbein argued for a way of analysis that is in this respect fully compatible with the present approach and does not bear the weaknesses of many linguistic accounts to be discussed below: owing to their being couched in terms of necessary and sufficient conditions they were quite inflexible and remote from 'natural' data and their interpretation with respect to natural explanatory parameters of language (cf. Givon 1979: 3-5):
  Dabei habe ich, wie üblich, die Spracherscheinungen zu formulieren gesucht, doch möchte ich ausdrücklich hervorheben, daß es sich niemals um Gesetze im strengsten Sinne handelt, sondern eben nur um Kräfte, die nach einer bestimmten Richtung wirken. (Deutschbein 1917: vii)
  [In doing so I have, as is customary, sought to formulate the linguistic phenomena, but I would like to emphasize explicitly that these are never laws in the strictest sense, but merely forces that act in a particular direction.]
9 For instance, given the theoretical assumptions of this study, it would be pointless to review and discuss in detail the question of whether particle placement is better analysed by incorporation due to the Case Adjacency Principle (Stowell 1981: 113) or the Small Clause Analysis by Kayne (1985: 129).
10 A more detailed discussion of my objectives is postponed until Chapter 3 as most of them follow from my characterization of linguistic and methodological drawbacks of previous analyses.
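As a rough illustration of the 0/1 coding described in the notes above, the following Python sketch labels a clause by whether anything intervenes between the verb and the particle. The function name construction_code and the whitespace tokenization are assumptions made for this illustration only; they are not the coding procedure actually used in the study.

```python
def construction_code(sentence: str, verb: str, particle: str) -> int:
    """Return 0 if the particle directly follows the verb, else 1.

    Hypothetical helper: the direct object is treated as one constituent,
    so any material between verb and particle is coded as 1.
    """
    tokens = sentence.rstrip(".").split()
    verb_pos = tokens.index(verb)
    # Look for the particle only to the right of the verb.
    particle_pos = tokens.index(particle, verb_pos + 1)
    return 0 if particle_pos == verb_pos + 1 else 1


examples = [
    ("Fred picked up the book.", "picked", "up"),
    ("Fred picked the book up.", "picked", "up"),
]
codes = [construction_code(s, v, p) for s, v, p in examples]
print(codes)
```

The returned value mirrors the mnemonic given in the notes: it is the number of constituents standing between the verb and the particle.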
2
Review of literature
Particle placement is a linguistic phenomenon that has been noted in many traditional grammars and a variety of linguistic frameworks for a very long time. One of the very first mentions of this alternation can be traced back to an English Grammar from 1712 (Mattaire 1972). However, the first linguistic investigations that go beyond simply noting the different possible word orders in a prescriptive fashion date back to 1892 (Sweet 1892). In other words, particle placement has now been investigated for about 100 years. The following discussion of the previous findings will focus on the variables that have in these analyses been argued to influence the constructional alternation. For each sub-branch of linguistics, I will name the respective variables and illustrate which of their values/levels are purported to influence particle placement in which way.1 As was mentioned in section 1.4, it is important to note that I will not discuss the theoretical underpinning of the previous proposals in more detail than is necessary for the complete understanding of the variables' impact on the constructional alternation. This also implies that not all of the approaches that have argued for a particular variable's influence can or should be named here: it does not make sense to name repeatedly dozens of references to analyses postulating perhaps a single variable. To name but one example, no exhaustive theoretical comparison of the claims of the various schools that emerged from the Transformational-Generative paradigm will be provided, since nearly all of the dozens of these approaches centre on the same two variables.2 After the discussion of variables in sections 2.1 to 2.5, section 2.6 summarizes and evaluates the previous analyses and points out several recurrent shortcomings.
2.1 Phonological variables
The most important phonological variable that has been argued to influence particle placement is concerned with the stress pattern of the verb phrase.3 As was already noticed in even the earliest works, if the direct object is stressed, then construction₀ is much more likely to occur (cf., e.g., Van Dongen 1919: 352; Kruisinga and Erades 1953: 78; Bolinger 1971: 50-5). Consider (11), where the stress on the direct object serves a contrastive function.
(11) a. Fred picked up the BROWN BOOK, not the blue one.
     b. Fred picked the BROWN BOOK up, not the blue one.
This variable is, of course, intimately related to both semantic and discourse-functional aspects of the utterance, which is why we will return to it below.4 Apart from the stress pattern of the verb phrase, there is a second phonological variable that has been put forward in one analysis, namely the phonetic shape of the verb. According to Fraser (1974: 571), verbs not bearing initial stress prefer construction₁; this means that, for instance, the TPV to divide up should be less acceptable in (12a) than in (12b).
(12) a. Fred divided up the cake.
     b. Fred divided the cake up.
2.2 Morphosyntactic variables
Many of the variables that have been proposed are morphosyntactic in nature. A variable that has been mentioned in every single analysis (e.g. Van Dongen 1919: 351-2; Kennedy 1920: 30; Chomsky 1957: 75-6; Rohrbacher 1994: 194-5) is concerned with the NP type of the direct object of the TPV: if the direct object is pronominal as in (13) rather than lexical, then construction₁ is, at least in general, obligatory.
(13) a. *Fred picked up it.
     b. Fred picked it up.
The only exception to this rule is the class of contrastively stressed pronominal direct objects (cf. Fraser 1976: 20), a combination that is very rare and is very likely to occur only with further modification of the utterance, as is exemplified in (14).
(14) a. Fred picked up HIM, not her.
     b. ?Fred picked HIM up, not her.
In some approaches, the distinction drawn is not only one between pronominal and lexical direct objects: according to, among others, Van Dongen (1919: 352), Kruisinga and Erades (1953: 77-8) and Quirk et al. (1985: 1370), there is a somewhat intermediate class of direct objects, namely referentially vague semi-pronominal nouns such as matters or things, which also have a strong preference for construction₁.
(15) a. ?They talked over matters.
     b. They talked matters over.
For lexical objects, it has been claimed that the determiner of the direct object also plays a role in the choice of one construction over the other (Chen 1986: 84, on the basis of Givon 1983; Gries 1999): indefinite determiners tend to occur with construction₀ and definite determiners tend to occur in construction₁.5 However, this claim cannot be exemplified simply on the basis of acceptability judgements since minimal pairs of sentences with TPVs differing only in the determiner of the direct object do not differ markedly in acceptability:
(16) a. Fred picked up a book.
     b. Fred picked a book up.
     c. Fred picked up the book.
     d. Fred picked the book up.
(16a) and (16d) are argued to be more acceptable or likely to occur than (16b) and (16c).
Another influential variable is concerned with the length of the direct object: in nearly all studies, it is argued that long direct objects lead to a preference for construction₀. Consider (17), where the long direct object not only prefers, but seems to require, construction₀.
(17) a. Fred picked up the book John had bought him while he was in Europe.
     b. *Fred picked the book John had bought him while he was in Europe up.
This variable has come in several guises: in most studies, it is not even mentioned how length is measured. Some studies measure the direct object's length using the number of syllables (e.g. Chen 1986; Gries 1999);6 some other studies rely on the number of words as an indicator of length (e.g. Hawkins 1991, 1994).7 Several other scholars have suggested that it is not so much the length of the direct object that is relevant for particle placement but rather the complexity of the direct object (cf., e.g., Fraser 1966: 46, 1976: 20; Ross 1986 [1967]: 32-3; Yeagle 1983: 11-12; Kayne 1985: 106). In general, one might argue that the length of the direct object and its complexity (however that is to be quantified) will be highly correlated (recall Hawkins's way to operationalize complexity by number of words) - however, Chomsky (1961: n. 18) and Fraser (1966: 9 n. 3) have argued convincingly that there are cases where the influence of length is different from that of complexity. Consider (18) (Fraser's example and acceptability judgements).
(18) a. The student worked more than seven of the difficult examples out.
     b. *The student worked the example which he recognized out.
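The word-based length measure can be illustrated on the two direct objects in (18). The sketch below is a minimal illustration, assuming plain whitespace tokenization; the names direct_objects and length_in_words are hypothetical, and syllable counts are left out because they would require a pronunciation dictionary or a heuristic that the text does not describe.

```python
# Direct objects of (18a) and (18b), copied from the examples above.
direct_objects = {
    "18a": "more than seven of the difficult examples",
    "18b": "the example which he recognized",
}


def length_in_words(noun_phrase: str) -> int:
    """Length of a noun phrase as its number of whitespace-separated words."""
    return len(noun_phrase.split())


lengths = {label: length_in_words(phrase)
           for label, phrase in direct_objects.items()}
print(lengths)
```

On this measure the object in (18a) comes out longer (7 words) than the one in (18b) (5 words), although only the latter contains an embedded clause.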
While in (18a) the direct object is longer (7 words, 12 syllables) than that in (18b) (5 words, 9 syllables), (18a) is still more acceptable than (18b) (at least if we follow Fraser's intuitive assessment):8 in (18a), the direct object noun phrase is very long, but not as complex as the one in (18b), where the direct object noun phrase contains an embedded clause. Similar evidence was gathered by Hunter (1981), who concluded that the direct object's complexity is in fact more important than its length, lending further support to Fraser's claim. Be that as it may, even in view of Hunter's evidence and Fraser's intuitions, it is not obvious how complexity can be measured reliably (Fraser 1976: 20); cf., however, Wasow (1997b), who demonstrates that many measures of grammatical weight yield similar results.
2.3 Semantic variables
The first semantic variable to be discussed is concerned with idiomatic meanings of the verb phrases in which particle placement occurs: following, say, Fraser (1974: 573) and Chen (1986: 82), if the meaning of the verb phrase is idiomatic,9 then construction₀ is preferred.
(19) a. Fred has tried to eke out a living.
     b. *Fred has tried to eke a living out.
(20) a. Fred brought down the plane.
     b. Fred brought the plane down.
(21) a. I will turn over a new leaf. (Potter's 1965: 287 example)
     b. I will turn a new leaf over.
(22) a. He threw up his dinner because he got food poisoning. (Fraser 1974: 573)
     b. He threw his dinner up because he wanted to stain the ceiling.
In (19a), the meaning of the verb phrase is idiomatic and construction₀ is acceptable; in (19b), however, the idiomatic verb phrase does not go together with construction₁ and (19b) is ungrammatical. In (20a), the verb phrase is ambiguous in that it can either mean that Fred had a (toy) plane and brought it to some lower location (the literal meaning) or that Fred is a soldier who has shot down a fighter plane (the idiomatic meaning). On the other hand, (20b) licenses the literal interpretation only. According to Potter, (21a) is ambiguous between I will turn over a new page of the book and I will begin a new life, but this ambiguity is not present for (21b), where again only the more literal interpretation is possible. Finally, for (22a), Fraser has claimed that, given the idiomatic meaning of the TPV (note the subordinate clause), the choice of construction₀ is most natural; for (22b), on the other hand, the literal meaning of the TPV results in a preference for construction₁: if each main clause were to be joined with the subordinate clause of the other main clause, awkward sentences would result. One has to bear in mind, however, that this constraint is far from being absolute for two reasons. First, the idiomaticity of the verb phrase need not have the above-mentioned effect at all (cf. Cowie and Mackin 1993: ix): although she made up her face is clearly idiomatic (as opposed to, say, Bill carried away the rubbish), both sentences allow for a change to construction₁: she made her face up and Bill carried the rubbish away are fully acceptable (especially given specific discourse contexts); moreover, she made it up even requires
construction₁. Second, the meaning of a verb phrase cannot always be categorized as being either fully idiomatic or totally literal (cf. also Gibbs 1994: chs. 5, 6). Rather, there are many cases where the meaning is somewhere between these two extremes. For instance, in it has taken many years to bring the town up to the standard the meaning is definitely not literal since the town has not been moved to a spatially higher position, but equally the meaning is definitely not fully idiomatic as it can be easily computed on the basis of what Lakoff and Johnson (1980: 14) have called orientational metaphors (here, GOOD is UP); that is, there seems to be an intermediate level of meaning between idiomatic and literal, namely metaphorical.10
Another semantic variable concerned with the meaning of the verb phrase that has been brought up by Fraser (1976: 20-1) is the habitual meaning of the verb phrase: if the verb phrase denotes the habitual meaning, then construction₀ is preferred; (23) is his example.
(23) a. The police are tracking down criminals.
     b. The police are tracking criminals down.
     c. The police track down criminals.
     d. The police track criminals down.
According to Fraser, (23a) and (23c) are more natural than (23b) and (23d) respectively. An additional variable Fraser (1974: 573) has argued for is concerned with semantic modification of the particle: 'Several particles, notably up, can take a perfective marker such as all, all the way or completely or an aspectual marker such as right. When such a marker is present, the particle must follow the object.'
(24) a. *I will clean all/right up the room.
     b. I will clean the room all/right up.
(25) a. *Harry said he would bring all/right over the vegetables.
     b. Harry said he would bring the vegetables all/right over.
As an additional variable, I have suggested in a former analysis (Gries 1999) that the degree of cognitive entrenchment (or cognitive familiarity) of the referent of the direct object contributes to particle placement. The degree of entrenchment has, following analyses by Deane (1987, 1992: 199-236), been measured by the position of the direct object's referent on the entrenchment hierarchy in Table 2:1 (based on an adaptation of the Silverstein Hierarchy). On the basis of a corpus analysis and a survey of acceptability judgements of native speakers of British English, I have shown that for frequency counts and acceptability judgements highly entrenched referents of direct objects correlate significantly with construction₁ while barely entrenched referents of direct objects correlate significantly with construction₀. The last semantic variable to be mentioned is concerned with the semantic focus of the verb phrase. Both Bolinger (1971) and Yeagle
Table 2:1 Entrenchment hierarchy of Gries (1999: 122-4), based on Deane (1987, 1992)
Least entrenched
 1  Abstract entities
 2  Sensual entities
 3  Locations
 4  Containers
 5  Concrete objects
 6  Animate beings (other than humans)
 7  Kin terms
 8  Proper names
 9  3rd person singular pronoun
10  2nd person singular pronoun
11  1st person singular pronoun
Most entrenched
(1983) have argued that the two possible constructions denote the same objective situation (i.e. the two constructions are, apart from cases of idiomatic meanings discussed above, truth-conditionally equivalent), but the position of the particle tends to highlight one out of two possible readings: when the particle is postposed, it tends to modify the noun; when it stands next to the verb, it behaves more like a verbal affix (cf. Bolinger 1971: 82). More specifically, Bolinger has argued that speakers use the coupling of accent and the position of the particle to highlight what is important about the utterance: accenting an item and/or its sentence-final position serve to indicate that the referent of the linguistic expression corresponds to the semantic peak of the utterance whereas de-accenting an item and/or its postverbal position marks it as being redundant (Bolinger 1971: 58-9).11
Yeagle's (1983) analysis has concentrated solely on semantico-cognitive motivations for the two word orders. As a starting point, Yeagle (1983: 124-5) has, following Lindner's (1981) classic analysis of out and up, advocated an analysis within the framework of Cognitive Grammar where she convincingly shows that the above-mentioned distinction of whether the particle is an adverb or a preposition is fundamentally flawed in that it forces us to conclude that up belongs to two different parts of speech: we can classify up as an adverb in some cases (cf. (26)) and as a preposition in others (cf. (27)).
(26) a. Fred ran up [NP the flag].
     b. Fred ran [NP the flag] up.
(27) a. Fred ran [PP up [NP the hill]].
     b. *Fred ran [NP the hill] up.
Moreover, she also observes a more general semantico-cognitive regularity that neatly accounts for this distributional pattern: particles must not follow their landmark. In (26), the flag designates the trajector of up and the landmark of up is not encoded (up profiles the relationship of some primary/focal participant, namely the flag, to some, in this case unexpressed, landmark, probably something like a pole): both word orders are possible without violating Yeagle's rule. In (27), however, Fred denotes the trajector of up while the hill denotes the landmark of up and (27b) is ruled out since the particle follows its landmark.12 Yeagle's analysis of particle placement is derived from her analysis of the word class of particles. According to her, construction₀ is used in order to highlight the unified view of the action and its end as continuous; construction₁ is used in order to highlight the resultant state (designated by the particle) of the referent of the direct object. Unfortunately, no independent evidence is offered to support these claims.
2.4 Discourse-functional variables
So far, our main focus has been on variables determining the final phonological and morphosyntactic shape of VPCs and variables determining their sense. However, in the preceding section, it was already shown that, following Bolinger (1971), the context of an utterance can influence the choice of a construction by a native speaker as well. Among the very first accounts in which the context of the utterance in question has been considered are works by Kruisinga and Erades (1953) and Erades (1961), who proposed that the news value of the direct object motivates the choice of construction. Consider (28).
(28) a. ?We'll make up a parcel for them . . . On the morning of Christmas Eve together we made up the parcel.
     b. We'll make up a parcel for them . . . On the morning of Christmas Eve together we made the parcel up.
In (28), where the referent of the direct object is introduced in the first sentence, the parcel is not newsworthy in the second sentence, where the TPV is used. Thus, it is construction₁ that is preferred (i.e. (28b)) while construction₀ in (28a) sounds rather odd. One empirical investigation of the role of news value was carried out by Bock (1977). In a question-answering task on particle placement and nine other cases of syntactic variation, Bock found that construction₁ was preferred by speakers when the referent of the direct object was mentioned in the question whereas construction₀ was more often used when the direct object's referent had not been mentioned previously. A second study on the role of news value or givenness of the direct object's referent is Peters (1999, 2001), which used a picture-description sentence completion task. Subjects were asked to complete sentences following a stimulus sentence and a schematic
black line drawing; the independent variable was whether the referent of the direct object to be produced by the subjects (given the drawing) was mentioned in the stimulus sentence (given) or not (new); the dependent variable was the choice of construction. Two of Peters's findings are relevant to our analysis: first, there is a highly significant overall preference for construction₀ (both in the control condition [χ²(1) = 6.942; p = .008] and, taken together, in the two experimental conditions [χ²(1) = 23.7; p<.00]). Second, according to Peters,
  the results of this experiment demonstrate that one key factor to the ordering of the particle and noun phrase [. . . ] is the presence of a noun phrase that is either given information or new information to the discourse situation. If the noun phrase has been previously mentioned (given), it tends to surface much more frequently than expected in the non-canonical position between the verb and the particle (1999: n. pag.).
It thus seems as if the effect of news value has been supported experimentally (we will, however, return to Peters's study below in section 2.6.1), so we will include the variable last mention of the referent of the direct object (as an operationalization of news value) in our analysis. Apart from these cases (where there is a preceding context introducing the referent of the direct object), the variable of news value also accounts for sentences in which an existential phrasal verb introduces something new to the discourse setting:
(29) a. It opens up unlimited possibilities.
     b. ?It opens unlimited possibilities up.
     c. It lets in a certain doubt.
     d. ?It lets a certain doubt in.
     e. It breaks up a nice combination.
     f. ?It breaks a nice combination up.
In (29), the existential verbs introduce the referent of the direct object as a consequence of something mentioned earlier. Since the referents of the direct object (the consequences) have not been referred to earlier, (29b), (29d) and (29f) are worse than their constructional counterparts in (29a), (29c) and (29e). This variable appears to be a very useful one in that it alone can account for several hitherto unrelated observations: it explains why pronouns and semi-pronominal (i.e. referentially vague) nouns normally require construction₁ (they are generally used for discourse-familiar referents that need not be identified by further modifiers) whereas heavily modified nouns most frequently occur in construction₀: in the latter case, the head nominal is modified by many other constituents whereby the noun phrase is enriched informationally. The notion of news value even accounts for more subtle instances (Bolinger 1971: 56):
(30) a. ?Where's Joe? - He's sailing in his boat.
     b. Where's Joe? - He's sailing his boat in.
     c. Where's Joe? - He's hauling in his boat.
     d. Where's Joe? - He's hauling his boat in.
The verb in (30a) and (30b) already allows its direct object to be inferred or, put differently, already presupposes or activates the referent of the direct object: the activity of sailing normally requires a boat. Thus, the direct object has little news value, yielding a preference for construction₁ (cf. (30b)). In (30c) and (30d), the action denoted by the verb can apply to a much less constrained variety of entities so that both sentences are acceptable. Likewise, the same explanation can be extended to explain the preference for construction₁ in (31), where the mention of the time in this particular context already foreshadows the nightie:
(31) a. It's almost ten o'clock. Put your nightie on, and run up to bed.
We will return to some similar cases in later sections.
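The p-values attached to the chi-square statistics reported for Peters's experiment earlier in this section can be checked with a few lines of Python. This is only a sketch under the assumption that both tests have one degree of freedom, as the notation (1) suggests; for df = 1 the upper-tail probability of a chi-square value x equals erfc(sqrt(x / 2)), so the standard library suffices. The function name chi2_p_value_df1 is hypothetical.

```python
import math


def chi2_p_value_df1(statistic: float) -> float:
    """Upper-tail p-value of a chi-square statistic with df = 1.

    For one degree of freedom, P(X >= x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))


p_control = chi2_p_value_df1(6.942)      # control condition
p_experimental = chi2_p_value_df1(23.7)  # two experimental conditions combined
print(f"{p_control:.3f}", p_experimental < 0.001)
```

Under this assumption the first value comes out at roughly .008, in line with the reported figure, and the second falls well below .001.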
In some later analyses within functionalist and/or psycholinguistic schools of thought (e.g. Chen 1986 and Gries 1999), the role that the context of the utterance in question can play for particle placement has been investigated in more detail. Another variable that can also be seen as an operationalization of the news value of the direct object is the distance to the last mention of the referent of the direct object. On the basis of a corpus analysis, Chen has shown that the smaller the distance between the last mention of the referent of the direct object in the preceding discourse and the utterance where the speaker has to choose one construction, the more likely the speaker is to choose construction1. If, for instance, the referent of the direct object has been mentioned in the sentence immediately preceding the speaker's decision for a construction (i.e. the distance is 0), then construction1 is much more probable than it would be if the distance were 9. Furthermore, the choice of construction is influenced not only by the distance to the last mention of the direct object's referent; the times of preceding mention of the referent of the direct object is also relevant: the more often the referent of the direct object has been mentioned in the discourse preceding the decision for one of the constructions, the more probable it is that construction1 will be used.13
So far we have been considering factors concerned with either the parts of the TPV construction or with the preceding discourse. One additional factor investigated and empirically supported by Chen (1986) is connected to the discourse following the utterance in question. Chen (1986: 85) has called this factor relevance, and, following the work by Givón (1983) that Chen himself has referred to explicitly, the variables concerning the subsequent discourse have to do with the (thematic) importance of the referent of the direct object in the particular discourse. According to Givón (e.g.
1992b: 16-17), referents that were mentioned most frequently in a given stretch of text/discourse are most important in that text/discourse. Therefore, importance is measured in terms of the times of subsequent mention of the referent of the direct object: the more often the referent of the direct object is mentioned in the discourse following the VPC, the higher the probability of construction0; conversely, the less often the referent is mentioned in the subsequent discourse, the higher the probability of construction1. The last discourse-functional factor to be considered is also connected to the discourse following the utterance in question. It is concerned with the number of clauses until the referent of the direct object in the sentence with the TPV is mentioned again in the subsequent discourse (i.e. distance to next mention of the referent of the direct object; Chen 1986: 85): the earlier the referent of the direct object is mentioned in the discourse following the VPC, the higher the probability of construction0; likewise, the later the referent is mentioned subsequently, the higher the probability of construction1.14
2.5 Other variables
Up to now, we have discussed variables that can be easily and unambiguously classified as belonging to one of several widely agreed-on subbranches of linguistics. The following variables are those that cannot so easily be categorized into one of these classic fields of research. According to Fraser (1974: 571-2) and Chen (1986: 82-3), the presence of a directional adverbial (e.g. a prepositional phrase or a directional adverb such as there) immediately following the VPC results in a preference for construction1; thus, (32b) should be more natural for native speakers than (32a), although the influence of this variable is probably again too subtle to yield noticeable differences in acceptability in isolated sentences.
(32) a. Fred picked up the book from the table.
     b. Fred picked the book up from the table.
It should be noted in passing that this variable is the only one that seems to be relevant both synchronically (as was just mentioned) and diachronically: Meroney (1943: 8, 10, 42) has observed that, in OE, following prepositional phrases forced the particle out of its preverbal position and thereby reinforced the First Particle Shift from preverbal to postverbal position. Another analysis of particle placement has related the constructional alternation to the previous variable and has argued that in the case of directional prepositional phrases following the VPC, construction1 is preferred. If, however, the particle is identical to the preposition of the following prepositional phrase, then construction0 is more likely to occur; i.e. (33a) is more likely to be used than (33b).
(33) a. Fred packed in a lot more things in his day.
     b. ?Fred packed a lot more things in in his day.
Finally, Arnold and Wasow (1996) have postulated that production and planning effects also influence particle placement: the more difficult the noun phrase is to produce,15 the more frequently construction0 is chosen; the less difficult the noun phrase is to produce, the more frequently construction1 is chosen. This hypothesis was empirically supported in a production experiment, and a logistic regression showed that both the length of the direct object noun phrase and the number of disfluencies have a significant effect on the choice of construction. Obviously, the length of the noun phrase will contribute to production difficulties (which Arnold and Wasow explicitly mention). However, it is slightly unfortunate that Arnold and Wasow do not discuss other reasons contributing to the production difficulties; cf. Levin et al. (1967) as well as Goldman-Eisler (1968) for experiments indicating that content difficulty can also result in production disfluencies.
2.6 Interim summary and critical evaluation
The preceding sections have dealt with many variables that were argued to influence particle placement. These variables and the effects they purportedly have are summarized in Table 2:2. This section is devoted to the critical evaluation of these variables. First, I want to address some (though not all) problems that arise with respect to specific variables that were postulated in the literature (section 2.6.1). Second, I will voice criticism on some more general aspects that concern the general research methodology employed in the past 100 years (section 2.6.2).
2.6.1 Comments on specific variables
The first specific point of criticism is concerned with Fraser's variable phonetic shape of the verb. As was briefly discussed above, Fraser has argued that verbs not bearing initial stress prefer construction1. However, it has to be noted that the evidence for Fraser's analysis is extraordinarily weak.
In more than 100 analyses, this variable was proposed only in Fraser's above-mentioned account (not even in his other works on the same topic) so that, even after its publication, the postulation of this variable was not supported by a single analysis. One reason for this lack of support is probably that, already about 50 years prior to Fraser's work, Kennedy observed that
[i]t is very noticeable that the verbs that enter into these combinations are largely monosyllables. Out of 826 combinations specially examined only ninety-seven have disyllabic verbs. And of these only three bear the accent on the last syllable, namely, collect (up), connect (up), divide (up). (1920: 29)16
Table 2:2 Variables that are argued to contribute to particle placement
Variable                                                     Value/level for construction0  Value/level for construction1
stress pattern of the verb phrase                            stressed direct object
phonetic shape of the verb                                                                  verb has no initial stress
NP type of the direct object                                                                (semi-)pronominal
determiner of the direct object                              definite/none                  indefinite
length of the direct object                                  long
complexity of the direct object                              complex
meaning of the verb phrase                                   idiomatic
meaning of the verb phrase                                   habitual
semantic modification of the particle                                                       yes
cognitive entrenchment of the direct object's referent       low                            high
focus of the verb phrase                                     direct object                  particle
news value of the direct object's referent                   high                           low
distance to last mention of the direct object's referent     long                           short
times of preceding mention of the direct object's referent   low                            high
distance to next mention of the direct object's referent     short                          long
times of subsequent mention of the direct object's referent  high                           low
following directional PP/adverb                                                             yes
particle = preposition of foll. PP                           yes
production difficulty of/disfluencies in the utterance       high                           low
Additionally, even Fraser's own three examples suffer from the fact that he fails to control the values/levels of other variables influencing particle placement (partly even those he himself discusses). This drawback will be discussed more thoroughly in the following section; suffice it to say here that the purported preference for construction1 could in principle result from several other variables' influences as well. Given these weaknesses, it is hard to imagine any variable that is less supported both argumentatively and empirically. Therefore, I assume this
variable not to be relevant to the constructional alternation to be investigated and will not deal with it in what follows.
A second of Fraser's variables deserving further discussion is the habitual sense of the verb. Fraser hypothesized that the habitual sense of the verb correlates positively with construction0. Once again, however, the basis for this claim is somewhat dubious: Fraser himself admits he has no explanation for this observation (Fraser 1976: 21) and, as is the case with many of his claims, the data on which the 'observation' is based are only intuitive and subjective. What is more, Fraser's tentative claim that the habitual sense of the verb phrase is related to stress patterns seems sketchy and far-fetched.17 Finally, no study has ever found any support for this claim. Therefore, in what follows, this variable will also not be considered any further.
Bolinger's account (or, at least, the way it is supported argumentatively) also stands in need of further qualification. As has already been criticized by Vestergaard (1974: 308), Bolinger's treatment lacks a principled distinction 'between cognitive and informational aspects of language': on the one hand, Bolinger has repeatedly relied on the notion of semantic focus (which is why his analysis was discussed in this section); on the other hand, his argumentation and exemplification frequently make reference to (the inferrability of referents due to) the preceding context and take on a pragmatic line of reasoning, so perhaps Bolinger's variable should rather be counted as a discourse-functional variable. However, let us first discuss Yeagle's analysis before returning to this variable again below.
Yeagle's analysis of particle placement in semantic terms has argued for a patterning of the two possible constructions with different subjective semantic construals of the objective scene.
However, while her second assumption (that a sentence-final position emphasizes the resultant state of the referent of the direct object) corresponds to observations already discussed in section 2.1, the first of these two assumptions (that construction0 highlights the unified view of the action as continuous) seems a little sketchy, and it is difficult to see how this assumption could be empirically tested or even falsified. If some kind of semantic focus plays a role here at all, the following explanation is probably much more reasonable (and still in line with her general approach to the problem). We have seen above that sentence-final and stressed direct objects prefer construction0; analogously, stressed particles as in (34) prefer construction1 (i.e. sentence-final position), too.
(34) a. ??Fred gave AWAY the brown book, not back.
     b. Fred gave the brown book AWAY, not back.
Since in both cases (examples (11) and (34)) the item to be emphasized is positioned sentence-finally (possibly serving a contrastive function), the rule that accounts for both of the observations of the stress pattern and 'semantic focus' will definitely be related to the notion of end-focus or end-weight
in English (cf. Quirk et al. 1985: ch. 18): if the referent of a linguistic expression is to be emphasized, then frequently a construction is chosen that allows the speaker to place this expression sentence-finally. In other words, we only need to extend Yeagle's justification of a speaker's choice of construction1 (namely that the semantic focus of the utterance is on the particle) to construction0 (i.e. the constructional counterpart, where the semantic focus is on the referent of the direct object). This line of reasoning can be beautifully illustrated by the well-known constructional preferences in question-answer sequences in (35) and (36).
(35) a. What did he pick up? - He picked up the book.
     b. ?What did he pick up? - He picked the book up.
(36) a. ?Where did he bring the book? - He brought back the book.
     b. Where did he bring the book? - He brought the book back.
Further support for this pragmatic interpretation of Bolinger's and Yeagle's analyses comes from Halliday (1985), who, just like Bolinger, bases his account on the interplay between stress and end-focus. If a speaker were to tell a hearer about a meeting having been cancelled, then he would have the choice between a non-phrasal or a phrasal verb to formulate the utterance. If he wanted to concentrate on the fact that it was a meeting that was being cancelled (rather than a concert), the choice between the two kinds of verbs would not result in any difference: in both (37a) and (37b) the stressed direct object the meeting (the stress signalling its importance for the hearer) is in the unmarked position for the information focus.
(37) a. They cancelled THE MEETING.
     b. They called off THE MEETING.
If, however, the speaker wanted to concentrate on the fact that a meeting was cancelled (rather than summoned), the verb would have to be stressed, which would result in (38).
(38) a. They CANCELLED the meeting.
But in (38) the information focus is now non-final, the word order is marked and, thus, carries with it 'additional overtones of contrast, contradiction or unexpectedness' (Halliday 1985: 185). Still, these overtones might not be what the speaker wanted to convey, but the only way to avoid them would be to put the stressed item in its unmarked place at the end. With a non-phrasal verb this would be ungrammatical in English.
(39) a. They the meeting CANCELLED.
With a phrasal verb, however, the speaker has the option of stressing and positioning at least a part of it in the unmarked position of the information focus.
(40) b. They called the meeting OFF.
In sum, the two different construals of both constructions are intimately related to end-focus, which seems to be much more elegant than Yeagle's (1983) ad hoc assumption.
Some of the discourse-functional variables considered by Chen (1986), Gries (1999) and Peters (1999, 2001) merit further attention, too. Some of these variables (such as distance to next mention, times of subsequent mention) presuppose that there is a next mention of the direct object's referent in the subsequent discourse. However, this, of course, need not be the case; one should rather start by checking whether there is a next mention at all: it might be the case that particle placement is solely influenced by the presence of a next mention without its exact distance having any particular effect; cf. the above discussion of Peters (1999, 2001). Therefore, we will also test this effect, including the variable next mention of the referent of the direct object in the analysis for the sake of completeness. In fact, it is difficult to see how exactly these variables should be related to the choice of construction: at the point in time when the speaker has to decide on either of the two possible constructions, he will definitely have an idea about how important the referent of the direct object has been in the preceding discourse (and is now); however, the speaker cannot foresee precisely the development of the discourse in the future and, thus, simply cannot reliably estimate or even know how often both he and the hearer(s) or other speakers (in the case of oral discourse/dialogues) will refer to the direct object's referent again. Thus, it is highly unlikely that the discourse following the VPC will exert a measurable and/or significant influence (i.e. one that goes beyond random noise) on the utterance itself.18 But what is more, the argument is deficient in another respect as well: once its logical conclusions are pursued, it is contradictory.
It was argued that times of subsequent mention of the referent of the direct object and distance to next mention of the referent of the direct object are ways of operationalizing the importance of this particular referent since interlocutors consider frequently mentioned items to be more important (cf. Givón 1988: 248). If this is indeed the case (and, at present, I do not see any reason to doubt this intuitively appealing and empirically validated claim), then the question arises as to why it should only be the discourse following the VPC that should serve as data for quantifying the importance of the direct object's referent. Put differently, why not also count the occurrences of the direct object's referent in the preceding discourse in order to measure its importance? Since there is no argument for why we should restrict our attention to the discourse following the VPC or for why the frequency of the referent of the direct object in the preceding discourse is not also indicative of its importance, the preceding discourse should in principle also be used to estimate the importance of the referent of the direct object in the whole of the discourse. But now we have arrived at a paradox: on the one hand, it is claimed that a high value of times of preceding mention yields a preference for construction1 because then the
news value of the referent of the direct object in the VPC will be low; on the other hand, it follows from the above-mentioned argument that a high value of times of preceding mention also yields a preference for construction0 because the referent of the direct object will be highly important. Thus, for both of the just mentioned reasons, I do not think (anymore) that the discourse following the decision for one of the two constructions will contribute to particle placement significantly.19 However, in order to find out whether overall frequency (i.e. preceding and following context) plays a role for the constructional choice, we will include the variable overall mention of the referent of the direct object in the empirical analysis. We will have a look at Chen's results concerning these variables again when we discuss the empirical results of the present study.
A final point of criticism concerning Chen (1986) and Gries (1999) is of a more general nature and is concerned with the way of operationalizing the news value or the thematic importance of the referent of the direct object. In several papers, Bolkestein (1985) and Bolkestein and Risselada (1987) have criticized Givón for claiming that
in many languages there is a correlation between the selection of a syntactic construction [...] and the so-called 'degree of topicality' of the constituents involved [...]. Roughly defined, the degree of topicality of an item is simply computed on the basis of the distribution of coreferent elements in the surrounding discourse, specifically its frequency of occurrence and the distance between the various occurrences. (Bolkestein 1985: 2)
This is argued to be problematic since it presupposes that only explicit/direct mention of an item or co-referential linguistic expressions contribute to the degree of topicality/thematic importance of referents of linguistic expressions.
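The count-based notion of topicality under discussion can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the annotation format (one set of referent labels per clause), the function name mention_measures and the toy discourse are all invented here and are not taken from Chen (1986), Givón or Gries (1999). Distances are counted in clauses, with 0 standing for the immediately adjacent clause, in line with the example given earlier for distance to last mention.

```python
from typing import Dict, List, Optional, Set


def mention_measures(
    clauses: List[Set[str]], target: int, referent: str
) -> Dict[str, Optional[int]]:
    """Count-based measures for `referent` around the clause at index `target`.

    `clauses` is a hypothetical annotation format: one set of referent labels
    per clause, in discourse order. Distances are counted in clauses, 0 being
    the immediately adjacent clause; None means no mention was found.
    """
    preceding = clauses[:target]
    following = clauses[target + 1:]

    # Walk backwards from the target clause to find the last mention.
    distance_last = next(
        (d for d, c in enumerate(reversed(preceding)) if referent in c), None
    )
    # Walk forwards from the target clause to find the next mention.
    distance_next = next(
        (d for d, c in enumerate(following) if referent in c), None
    )
    return {
        "distance_to_last_mention": distance_last,
        "times_of_preceding_mention": sum(referent in c for c in preceding),
        "distance_to_next_mention": distance_next,
        "times_of_subsequent_mention": sum(referent in c for c in following),
    }


# Invented toy discourse; clause 3 is taken to contain the verb-particle construction.
discourse = [{"joe", "boat"}, {"joe"}, {"boat"}, {"joe", "boat"}, {"harbour"}, {"boat"}]
measures = mention_measures(discourse, target=3, referent="boat")
print(measures)
```

Note that the sketch counts only identical labels, i.e. explicit coreference. A referent that is merely inferrable from, or semantically related to, an earlier item contributes nothing to any of the four measures, which is precisely the limitation that the criticism is directed at.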
Thus, they suggest refining these pragmatic notions and their operationalization by also considering other linguistic indicators of these psychologically grounded phenomena. Therefore, they introduce the notion of cohesiveness of the referent of a linguistic expression to the preceding/subsequent discourse, which is best defined in the authors' own words:
A constituent x is cohesive if x is coreferent to another item y in the sentence itself or the larger discourse; or if it is semantically related to another item y in the sentence or discourse, for example by sharing certain semantic features; or by being antonymous to y; or by standing in a part-whole relation to y; or by being a co-member of y in some superclass; or by being itself a subclass or superclass of y; or if it is pragmatically related to y, for example by being in contrast with it; or by being 'evoked' by it or 'inferrable' from it; etc. (Bolkestein and Risselada 1987: 503)20
This definition of cohesiveness bears resemblance to the notions of bridging (cf. Clark 1977) and inferrability (cf., e.g., Prince 1981) and is fully compatible with accounts based on activation where it could be argued that, e.g.,
mentioning the superordinate term y of some category x activates x by means of spreading activation (cf., e.g., Deane 1992: 34-5; for a more detailed introduction, cf. Anderson 1996: 177-85). This account has been experimentally tested by Moss et al. (1997), who have interpreted their results as supporting Bolkestein and Risselada's analysis. Consequently, apart from the above-mentioned discourse-functional variables considered by Chen (1986) and Gries (1999), in the present study the cohesiveness of the referent of the direct object to the preceding/subsequent discourse will also be considered as variables possibly contributing to particle placement; the exact way of operationalization will be explained in section 5.2.
Let us now briefly consider the most important results of Peters (1999). It was already mentioned above that Peters claims to have shown that the distribution of constructions is at least to some extent influenced by the givenness/newness of the referent of the direct object (DO). Consider Table 2:3 for an overview of the results of Peters (1999), focusing only on the two experimental conditions (as these are our only concern) and leaving aside a variety of results specific to the phrasal verbs that were used. This distribution of constructions is significant (χ²(1) = 9.82; p = .0017): as was argued by Peters, we indeed find that construction0 is the generally preferred word order variant. This, however, has the consequence that the status of the DO's referent does not help us to predict which construction the speaker will choose. Suppose we were to predict the constructional choice of a speaker without knowing the givenness status of the DO's referent. In this case, we would always guess that the speaker would use construction0 simply because we see from the row totals that construction0 is in general used more often, namely in 62.22 per cent of all cases; our error rate would correspondingly be 37.78 per cent.
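The figures just cited can be re-derived from the frequencies in Table 2:3. The following Python sketch is offered only as a check under stated assumptions: it uses a Pearson chi-square without continuity correction (the source does not say which variant was used), and the variable names are invented for the illustration. It also computes how many prediction errors remain when the most frequent construction is guessed separately for given and for new referents; the proportional reduction in error obtained this way is the measure usually known as Goodman and Kruskal's asymmetric lambda.

```python
import math

# Observed frequencies from Table 2:3: rows = construction, columns = givenness.
observed = {
    "construction0": {"given": 100, "new": 147},
    "construction1": {"given": 85, "new": 65},
}

rows = list(observed)
cols = list(observed[rows[0]])
n = sum(observed[r][c] for r in rows for c in cols)
row_totals = {r: sum(observed[r].values()) for r in rows}
col_totals = {c: sum(observed[r][c] for r in rows) for c in cols}

# Pearson chi-square without continuity correction (an assumption made here).
chi2 = 0.0
for r in rows:
    for c in cols:
        expected = row_totals[r] * col_totals[c] / n
        chi2 += (observed[r][c] - expected) ** 2 / expected
# With one degree of freedom the p-value has a closed form.
p_value = math.erfc(math.sqrt(chi2 / 2))

# Errors made when always guessing the overall most frequent construction.
errors_without = n - max(row_totals.values())
# Errors made when guessing the most frequent construction per givenness status.
errors_with = sum(col_totals[c] - max(observed[r][c] for r in rows) for c in cols)
lambda_asym = (errors_without - errors_with) / errors_without

print(f"chi2(1) = {chi2:.2f}, p = {p_value:.4f}")
print(f"baseline error rate = {errors_without / n:.2%}")
print(f"errors without / with givenness: {errors_without} / {errors_with}")
print(f"lambda = {lambda_asym:.2f}")
```

The point of computing both quantities side by side is that they answer different questions: the chi-square statistic indicates that the two columns differ in their proportions, whereas lambda indicates whether knowing the column changes the best guess.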
Suppose that we, again, were to predict some speaker's choice of construction, but now know the givenness status of the DO's referent. The crucial point now is that, whatever the givenness status of the referent of the DO, we would still guess construction0, which is, in Peters's data at least, used more often regardless of the givenness status (54.05 per cent and 69.34 per cent of the cases); as a result, we would still have an error rate of 37.78 per cent. There is a coefficient of correlation quantifying exactly this sort of relation: asymmetric lambda (λ) assesses the percentage of error reduction in predicting the dependent variable (here: choice of construction) once we know the value of the independent variable (here: givenness status), and in our case it is obviously 0. Thus, on the basis of the data in Table 2:3, one cannot conclude that givenness has a mentionable effect on the choice of construction.21

Table 2:3 Partial results of Peters (1999): Absolute frequencies and column percentages
                 Referent of DO = given   Referent of DO = new   Row totals
Construction0    100 (54.05%)             147 (69.34%)           247 (62.22%)
Construction1     85 (45.95%)              65 (30.66%)           150 (37.78%)
Column totals    185 (100%)               212 (100%)             397 (100%)

Finally, the treatment of cognitive entrenchment in terms of the entrenchment hierarchy proposed by Deane (1992) and applied to particle placement by Gries (1999) also calls for critical comment. Although it could be shown that the position of the referent of the direct object on the entrenchment hierarchy correlated significantly with the occurrence and the acceptability of the two constructions, I now consider the use of this hierarchy as an entrenchment hierarchy questionable and misleading for the following three reasons. First, as I have already mentioned in my previous study (cf. Gries 1999: 141 n. 15), the entrenchment hierarchy given above is not without problems since:
• the levels postulated differ conceptually in that their categories are either semantic/encyclopaedic (e.g. container) or grammatical (pronouns);22
• the levels postulated differ hierarchically: some categories are superordinate (e.g. container or concrete object) and some are quite specific (e.g. 1st person singular personal pronoun);
• the levels postulated do not distinguish between referents that should be fully entrenched for a speaker due to their position on the hierarchy but are not by virtue of never having been acquired by this very speaker (cf. already Deane 1992: 196).
Second, in order to at least partially overcome these shortcomings and test for the relevance of the entrenchment hierarchy more thoroughly, I partly revised the hierarchy by (i) subsuming the personal pronouns I, you, he, she under a single class and (ii) subsuming the category container under the category concrete object.
Then I tested the corpus data of Gries (1999) for whether the theoretically revised entrenchment hierarchy (EH) still correlates significantly with the choice of construction; the results showed that this is not the case since both effects clearly fail to reach significance (r_EH-construction0 = -.31; p_one-tailed = .23 and r_EH-construction1 = .48; p_one-tailed = .12).
Third, I would like to draw attention to a most fundamental flaw that underlies many of the previously discussed approaches and that will be illustrated in the following sections in much more detail. This flaw is related to the criticism levelled at some of Fraser's arguments in the beginning of this section and derives from the still current prevalence of monocausal approaches to syntactic variation. If a variable is investigated empirically in isolation, it might very well be the case that a significant effect can indeed be detected, but the problem, especially with more complex phenomena, is that one needs to test whether the impact found is due to the variable as a whole or due to some conceptual subpart(s) of the variable. In the case at hand, the entrenchment hierarchy is a rather complex variable, consisting, as it were, of a combination of several more primitive variables, each of which has already proven to be relevant for a variety of linguistic phenomena;23 this is represented in Table 2:4.

Table 2:4 The entrenchment hierarchy and its constitutive subparts
Least entrenched
 1  Abstract entities
 2  Sensual entities
 3  Locations
 4  Containers
 5  Concrete objects
 6  Animate beings (other than humans)
 7  Kin terms
 8  Proper names
 9  3rd person singular pronoun
10  2nd person singular pronoun
11  1st person singular pronoun
Most entrenched
Feature labels marked alongside the levels: inanimate / animate / human; abstract / concrete; lexical / proper name / personal pronouns

If the above reasoning is applied to the case at hand, one has to test whether the explanatory power of the entrenchment hierarchy is superior to that of its constitutive subparts in combination: in Gries (1999), a highly significant Spearman rank correlation was found between the position of the referent of the direct object on the entrenchment hierarchy and the choice of construction (r_s = .68; p_two-tailed = .001). However, if, for the same data set, a multifactorial analysis is calculated for the relation between all the constitutive subparts of the entrenchment hierarchy (and their interactions) and the choice of construction, it becomes obvious that the explanatory power of the individual subparts of the entrenchment hierarchy and their interactions is much greater than that of the entrenchment hierarchy alone: R = .89; F(10, 136) = 50.912; p < .001. In other words, taken together, the individual variables account for particle placement much better than their joint combination into the entrenchment hierarchy; in fact, we lose explanatory power if we use the entrenchment hierarchy rather than its subparts.24 This result is further supported by comparing the partial correlations25 for these variables in a multifactorial analysis: the partial correlation of the entrenchment hierarchy with the choice of construction is the second smallest of all the above variables entering into the equation. In sum, it seems as if the role of entrenchment (at least when measured using Deane's (1992) or Gries's (1999) entrenchment hierarchy) simply does not exist;26 it is merely a statistical artefact that has arisen from a monocausal analysis without considering the simultaneous interrelationships between several variables, a problem which has important methodological
consequences both in general and for the particular approach to be pursued here. However, the analysis has nevertheless been useful to the extent that it has shown that there are some semantic variables (namely those constituting the effect of the entrenchment hierarchy as a whole; cf. Table 2:4) that merit further attention in the analysis. Therefore, these variables, namely animacy of the referent of the direct object and concreteness of the referent of the direct object, will be added to the list of variables to be empirically tested; moreover, proper names will be included as a level of the morphosyntactic variable NP type of the direct object.
2.6.2 Comments on more general methodological aspects
Apart from the preceding comments on some specific variables, the discussion of the notion of entrenchment has already addressed a major methodological problem; this section will illustrate some further problems. As a starting point, I would like to draw attention to something quite obvious by now: previous research has documented that particle placement, which may at a superficial glance seem simple to describe and purely syntactic in nature, is, on closer inspection, much more intricate. This has important ramifications for (cognitive-)linguistic analyses of syntactic variation that have, at least on the whole, been neglected so far. First of all, it is highly unfortunate that the majority of analyses are based on introspective analyses of made-up sentences and intuitive acceptability judgements on the part of the investigating linguist. While some frameworks (most notably Transformational-Generative grammar) in principle allow for this avenue of research, from my point of view this methodology does not do justice (i) to generally accepted standards of scientific research to be found in many other disciplines27 and (ii) to the complexity of the phenomenon to be investigated.
The number of independent variables an introspective linguist would have to consider simultaneously in his analysis of particle placement is so large as to render this method subjective and error-prone to the point of being vacuous (cf. Fraser's comments on the difficulty of obtaining unequivocal judgements supporting his own analysis mentioned above). In this respect, a further point of criticism can be levelled against many purely structurally oriented approaches (at least from a cognitive-functional and corpus-based point of view): many of the variables investigated so far would not even constitute data in those approaches since, by definition, they would be relegated to the irrelevant realm of performance data. Legitimate as a restriction to core grammar may be in such theories, from the point of view advocated here, this a priori limitation on what constitutes data is highly questionable (cf. Lakoff 1991 for a more elaborate discussion). In sum, the small number of empirical analyses of naturally occurring data is a severe drawback for the research on particle placement.28
Apart from the fact that most previous analyses are based on introspective data, a second major problem of most preceding analyses can be identified. Scientific research is generally held to pursue one or more out of the
following three major objectives: description of the problem at hand, its explanation and the prediction of future states of affairs concerning the particular problem. Let us briefly discuss these objectives and their degree of attainment for particle placement. In spite of the large amount of research devoted to particle placement, it should have become obvious by now that we still lack even a descriptively adequate account of particle placement for two different reasons. First, for many of the above-mentioned variables only the consequences of one of their possible values/levels have been identified. For instance, researchers have unanimously agreed that the level complex of the variable complexity of the direct object favours construction₀. However, up to now, nothing is known about which tendency is to be found if the direct object is not complex or somewhere between simple and complex (since complexity is a matter of degree). Consider Table 2:5, where part of a fictitious result of an empirical analysis of the variable complexity (for expository reasons only as a binary variable) is represented. This table would be a legitimate basis for concluding that complex objects correlate with construction₀ in that it represents the generally accepted finding just mentioned, but nothing is said about the distribution of simple direct objects. Analogously, Table 2:6 to Table 2:8 would equally support the claim that complex direct objects tend to occur in construction₀. This matter is further complicated by the fact that testing, e.g., Table 2:7 and Table 2:8 for significance using a Chi-square test would yield significant results, although the conceptual implications of the two tables differ drastically, of course.
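The point about significance testing can be checked with a few lines of code. The following is a minimal sketch, not part of the original analysis: it takes the fictitious counts of Tables 2:6 to 2:8 and computes a Pearson Chi-square statistic for each. The function names are hypothetical, and the absence of a continuity correction is an assumption, since the text does not say which variant of the test is meant.

```python
import math

# Fictitious counts from Tables 2:6-2:8. Row 0 is construction0, row 1 is
# construction1; column 0 is simple, column 1 is complex direct objects.
TABLES = {
    "Table 2:6": [[90, 90], [10, 10]],
    "Table 2:7": [[10, 90], [90, 10]],
    "Table 2:8": [[50, 90], [50, 10]],
}


def chi_square_2x2(table):
    """Pearson Chi-square statistic for a 2x2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi2):
    """Upper-tail probability of a Chi-square value with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def share_construction0(table, column):
    """Proportion of construction0 among the direct objects in one column."""
    return table[0][column] / (table[0][column] + table[1][column])


for name, table in TABLES.items():
    chi2 = chi_square_2x2(table)
    print(
        f"{name}: chi2(1) = {chi2:.2f}, p = {p_value_df1(chi2):.3g}, "
        f"construction0 share: simple = {share_construction0(table, 0):.2f}, "
        f"complex = {share_construction0(table, 1):.2f}"
    )
```

Under these assumptions the statistic is 128 for Table 2:7 and about 38.10 for Table 2:8, both far above the critical value of 3.84 for one degree of freedom at the 5 per cent level, while Table 2:6 yields 0. The share of construction₀ among complex objects is .90 in all three tables; what differs is the behaviour of simple objects (.10 in Table 2:7 as opposed to .50 in Table 2:8), which the bare claim about complex objects leaves undescribed.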
In other words, the ubiquitous claim that complex objects favour construction₀ is interesting in itself, but it does not even provide an adequate or complete description of the single variable complexity as only part of the set of possible values/levels of this variable is referred to explicitly;²⁹ unfortunately, we have seen that similar remarks apply to many variables.

Table 2:5 Fictitious analysis of the complexity of the direct object 1

                Simple direct object   Complex direct object   Row totals
Construction₀                                             90
Construction₁                                             10
Column totals                    100                     100          200

Table 2:6 Fictitious analysis of the complexity of the direct object 2

                Simple direct object   Complex direct object   Row totals
Construction₀                     90                      90          180
Construction₁                     10                      10           20
Column totals                    100                     100          200

Table 2:7 Fictitious analysis of the complexity of the direct object 3

                Simple direct object   Complex direct object   Row totals
Construction₀                     10                      90          100
Construction₁                     90                      10          100
Column totals                    100                     100          200

Table 2:8 Fictitious analysis of the complexity of the direct object 4

                Simple direct object   Complex direct object   Row totals
Construction₀                     50                      90          140
Construction₁                     50                      10           60
Column totals                    100                     100          200

The second point that is relevant in this connection is that the interrelationships and interactions of these variables are, up to now, barely known. More generally speaking, previous analyses have, on the whole, been monocausal in nature, which is why they suffer from several important drawbacks. In order to illustrate these more clearly, I would like briefly to come back to the critique of Gries's (1999) entrenchment analysis in the previous section. It was shown that monocausal analyses can result in producing mere (statistical) artefacts. Note that this danger is not restricted to empirical quantitative studies alone; in fact, one might argue that this danger is even more apparent in many introspective analyses that prevail in the literature. Consider Fraser's variable habitual sense of the verb: while he superficially managed to back up his argument with some examples, his argument suffers from the fact (already mentioned above for the variable phonetic shape of the verb phrase) that he does not consider the other variables' values/levels at the same time. If one has to assume that there is a multitude of variables influencing particle placement, it is utterly senseless to corroborate one's claims with respect to a single variable with examples where the remaining variables are neither controlled nor taken into consideration by the investigator; given the complexity of the alternation, one can never be sure whether Fraser's judgements really result from the habitual sense of the verb, or the following directional prepositional phrase, or one (or a combination) of many other variables. This line of reasoning extends to most of the existing approaches to syntactic variation in general. In order to exemplify this and at the same time anticipate a possible objection to this quite abstract argumentation, it is essential to note that this
methodological drawback particularly pertains to a classic research method in linguistics: the contrastive comparison of minimal pairs. While it is true that minimal pair tests definitely have proven their worth for a variety of linguistic problems, their power should not be overestimated for two reasons. On the one hand, the argument just discussed is in fact of vital importance for minimal pairs although the choice of minimal pairs superficially suggests that all variables held constant in the minimal pair are controlled for. The problem lies again in the variables not manipulated or, more specifically, in the fact that it could be the variables held constant that are responsible for the observed facts rather than the ones manipulated in the minimal pair. Consider once again a variable (namely phonetic shape of the verb) and an example from Fraser (1974). In order to substantiate his claim that verbs without initial stress prefer construction₁, he provides the following minimal pairs and acceptability judgements:

(41) a. ?I will insult back the man.
     b. I will insult the man back.
(42) a. ?We converted over the heating to steam.
     b. We converted the heating over to steam.
(43) a. ?They attached up the tag on the wall.
     b. They attached the tag up on the wall.
Following linguistic convention, Fraser manipulates only the constructional choice, holds constant all other variables and assigns acceptability values. The question now is, however, how does Fraser know that it is the phonetic form of the verb that is responsible for the purported preference for the (b) sentences rather than other variables he also discusses in the same paper? In (41), the direct object is simple, very short and definite, and in (42) and (43), the direct object is simple, very short and definite and the verb phrase is followed by a directional adverbial, all properties that also result in a preference for the (b) sentences. The answer is simple: he cannot know this for certain. The data he cites do not warrant this conclusion as they fail to account for these additional variables. This instance shows that, in complex cases like particle placement, minimal pairs can distort the picture more than they are helpful in spite of (i) their time-honoured place in linguistics and (ii) the fact that, at a first superficial glance, they account for the variables that are manipulated by the investigating linguist.

On the other hand, it has long been shown in more statistics-based fields of study that, in complex designs such as the present one, independent variables often interact with one another with respect to the impact they have on the dependent variable. That is, the effect one independent variable (such as, e.g., the following directional prepositional phrase) has on particle placement can depend on another independent variable (e.g. the idiomaticity of the verb phrase). Thus, when one relies on a minimal pair of the two constructions one has to bear in mind that, at least in principle,
the constellation of variables not investigated may be responsible for interactions distorting the picture. More specifically, it is possible that following directional prepositional phrases only lead to the purported preference for construction₁ if there is also an indefinite determiner of the direct object. Phenomena like these are hard to identify using classical minimal pair tests, which contributes further to the lack of an adequate description of particle placement. In sum, analysing variables in isolation can be fruitful for a variety of reasons (it will probably be a general starting point of most analyses), but it also has severe limitations. In a way, it is exactly the cognitive-functional branch of linguistics (advocating cognitively real explanations of linguistic data) that should attempt to account for all the complexity of the data and could learn much from corpus linguistics, where such aspects of quantitative research are quite elementary. Consider the following quotation from a classical textbook (!) on corpus linguistics, on the basis of which most analyses of particle placement should be severely criticized, if not abandoned.

[. . .] straightforward significance or association tests, although important, cannot always handle the full complexity of the data. The multivariate approaches [. . .] offer a way of looking at large numbers of interrelated variables and discovering or confirming broader patterns within those variables. (McEnery and Wilson 1997: 82)
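The interaction scenario described above (a following directional prepositional phrase favouring construction₁ only when the direct object has an indefinite determiner) can be made concrete with a small sketch. The counts below are invented purely for illustration and are not corpus data; the names `COUNTS`, `share_construction1` and `by_pp` are likewise hypothetical.

```python
# Invented counts, for illustration only (not corpus data):
# (directional PP follows?, determiner) -> (n construction0, n construction1)
COUNTS = {
    (True, "indefinite"): (10, 40),
    (True, "definite"): (25, 25),
    (False, "indefinite"): (25, 25),
    (False, "definite"): (25, 25),
}


def share_construction1(cells):
    """Proportion of construction1 over a list of (n0, n1) count pairs."""
    n0 = sum(cell[0] for cell in cells)
    n1 = sum(cell[1] for cell in cells)
    return n1 / (n0 + n1)


def by_pp(pp, determiner=None):
    """Select the cells for one PP value, optionally for one determiner only."""
    return [
        counts
        for (has_pp, det), counts in COUNTS.items()
        if has_pp == pp and (determiner is None or det == determiner)
    ]


# Monofactorial view: the determiner is ignored.
print("PP, all objects:   ", share_construction1(by_pp(True)))
print("no PP, all objects:", share_construction1(by_pp(False)))

# Multifactorial view: the same comparison within each determiner level.
for det in ("indefinite", "definite"):
    print(f"PP, {det}:", share_construction1(by_pp(True, det)))
    print(f"no PP, {det}:", share_construction1(by_pp(False, det)))
```

With these invented counts, the pooled comparison suggests a general effect of the prepositional phrase (.65 as opposed to .50 for construction₁), whereas the stratified comparison shows that the effect is confined to indefinite objects (.80 as opposed to .50) and absent for definite ones (.50 in both cases). A minimal pair that holds the determiner constant at one level would see only one of these two pictures.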
Additionally, monocausal analyses contradict any attempt to come up with a psychologically/cognitively real account of how linguistic production functions under real-time constraints:³⁰ as was argued above, the language processor simultaneously receives mixed input from various sources, so it is illusory to assume that monocausal approaches can help us to model actual online processes of linguistic production, let alone understand them, since, in the mind of the speaker engaged in a communicative situation, there are neither variables with values/levels in isolation (even if, polemically speaking, monocausal approaches argue as if this were the case) nor will the decision for a construction take place as a purely sequential comparison strategy. If cognitive and functional linguists want to account for syntactic variation in a cognitively realistic way, they need to investigate the complex speech situation as speakers face it, namely by looking at all relevant parameters simultaneously.³¹

As has been demonstrated, for two reasons we lack a descriptively adequate account of particle placement, but unfortunately we also do not yet have an account with a satisfactory explanation³² for particle placement (with the notable exceptions of the attempts by Erades 1961: 57; Chen 1986; Hawkins 1991, 1994; and Gries 1999). In this respect, we have not made much progress since Van Dongen (1919), who single-handedly managed to find nearly all the variables discussed so far.³³ What is, with very few exceptions, also lacking is an approach integrating all (or, at least, the majority) of the variables discussed up to now in order to simply answer the question why particle placement looks the way it does.
Let us now turn to prediction. Here, the methodological weaknesses of prior analyses become most apparent: monocausal analyses that do not even account for all values/levels of even a single variable fail to describe and explain, let alone predict, speakers' decisions in situations where several variables conflict with one another (as in Fred turned down an offer, where complexity and length favour construction₁ and the indefinite determiner and idiomaticity favour construction₀). Given the multitude of variables that were discovered, situations with conflicting variables will be the rule rather than the exception, but so far no account addresses, let alone solves, these problems. Put differently, in many cases we cannot even predict speakers' decisions for a construction in the artificially delimited contexts of minimal pairs, which is why we are at present far from being able to predict native speakers' choices in real life.

Lastly, a minor point of criticism. Frequently, the register in which language is put to use has a considerable influence on how it is used: spoken and written language often display profound differences with respect to the frequency or distribution of linguistic items (be it words, word senses or grammatical elements such as tenses, aspectual markers, constructions). While it is true that no clear-cut distinction between spoken and written language can be seriously upheld (cf. Biber 1988, 1995), both other cases of syntactic variation³⁴ and large-scale corpus research by Biber et al. (1999) show that there are indeed substantial differences across different registers. Hence, it should also be investigated (i) whether such differences also exist for particle placement and/or (ii) whether the impact of variables may differ depending on the register. The following chapter will present the main objectives of this study, most of which derive naturally from the critical comments discussed in the preceding two sections.
Notes

1 In spite of the argument to follow in section 2.6.2, this chapter will frequently make use of minimal pairs of sentences in order to exemplify the effect variables have on particle placement. Therefore, the present treatment relies on many judgements (taken over from the literature) as to the naturalness of these sentences. I do not draw any distinction here between (syntactically motivated) grammaticality and (pragmatically motivated) acceptability since it can, in practice, often be quite difficult to differentiate between these two effects on a principled basis.
2 However, the discussion below is organized such that the most widely cited studies within the Transformational-Generative paradigm will be referred to.
3 The first mention of the variables in each section is made in bold type so as to attract attention. In later parts of this study, references to variables will be made either in italics (in order to facilitate the reading process) or by means of abbreviations in capital letters that will be introduced in section 5.2 - references to values/levels of particular variables will also be in italics.
4 This relation of stress and notions such as discourse-function does not follow automatically. Given the cognitive-functional assumptions underlying this study, this relation is only natural - in the Standard Theory of transformational grammar, however, Fraser (1976: 21) 'explains' particle placement with purely formally motivated 'output constraints' on 'the final phonetic shape of the verb phrase'.
5 According to Fraser (1974: 572), the absence of determiners correlates with construction₀ as well. However, no other analyst has ever supported this claim and the category noun phrases without determiners is extraordinarily heterogeneous since it comprises cases of plural forms, proper names, pronouns and semipronouns. It is, therefore, questionable whether using such a broadly defined category is useful. We will include the level no determiner only for the sake of completeness.
6 Strictly speaking, one might argue that the length of the direct object measured in syllables is a phonological variable since syllables are not really morphosyntactic but phonological units. While this is correct, we will nevertheless treat it as a morphosyntactic variable in the remainder of the analysis since length in syllables is contingent and highly correlated with morphosyntactic length measured in words.
7 This simple classification does not fully do justice to Hawkins's account since, strictly speaking, he does not merely measure the length of the direct object, but computes a so-called EIC ratio (Early-Immediate-Constituent ratio). This ratio relates the number of words a hearer has to process to the number of (phrasal) constituents made up by those words. In the case of long direct objects in construction₁, we get a low EIC ratio; i.e., the hearer has to process many words until all immediate constituents of the sentence can be identified. This places a heavy burden on the allocation of memory and attention resources and, therefore, construction₁ is ruled out. With short direct objects in construction₁, we get a high EIC ratio; i.e., the hearer needs to process few words in order to identify all immediate constituents of the sentence. This way, the hearer does not face a great cognitive effort so that construction₁ is possible. Nevertheless, I subsume Hawkins under the variable length rather than complexity because, in VPCs, where the particle is always one word long, the calculation of the EIC ratio for a particular construction boils down to considering the length of the direct object in most cases (exceptions are direct object noun phrases with PPs or subordinate clauses); cf., however, section 4.5 where 'particle phrases' are discussed that are longer than one word.
8 Actually, Fraser supports his claim about the relevance of the complexity of the direct object with his own intuitive judgement while, even in the immediately preceding sentence, he in fact undermines his own analysis by stating that 'it is almost impossible to get agreement on what is acceptable and what is not for many of the borderline cases' (Fraser 1966: 59 n. 3). Strangely enough, this acknowledgement has not motivated him to pursue a methodologically more reliable line of reasoning.
9 In this study, I follow Fraser's (1976: v) definition of idiomaticity: 'I shall regard an idiom to be a single constituent or series of constituents, whose semantic interpretation is independent of the formatives which compose it'. In other words, I take a sentence to be idiomatic to the extent that the meaning of the sentence as a whole is not compositional, i.e. a function of the meaning of its individual parts in their structural configuration.
10 Bolinger (1971) has discussed three levels of what he refers to as stereotyping: 'The first compositional layer is the simple association of a verb and a particle. The second layer is a differentiation within the phrasal verb, related to the varying positions of the particle and other factors.' (112) 'There is also what might be termed third-level stereotyping, in which the entire phrase is frozen. These are idioms.' (114) He later has gone on to observe that with idiomatic third-level stereotyping construction₀ prevails: '[t]he most frequent case is that of a particle captured next to the verb in a second-level stereotype that is only slightly specialised from the literal meaning. Both positions of the particle are still possible, but the stereotyped meaning is relatively more frequent when the particle is next to the verb and the nonstereotyped meaning when the two are separated: [. . .] They cut short the conversation. They cut the stick short.' (121-2) In spite of some terminological differences the parallel is obvious: first-level, second-level and third-level stereotyping equal literal, metaphorical and idiomatic VPCs respectively. Similarly, Fairclough (1965) has also proposed to classify TPVs as being either literal, metaphorical and figurative or non-idiomatic, semi-idiomatic and idiomatic respectively.
11 Accented items are usually positioned sentence-finally, which corresponds to the natural order of English where end-focus is reserved for expressions denoting the focal referent of the utterance.
12 Yeagle fails to provide any explanation as to why particles must not follow their landmark. A possible explanation, however, can be given once we interpret Yeagle's observation in terms of Behaghel's law (cf. Behaghel 1932: 4) or, more recently, the iconic principle CLOSENESS IS STRENGTH OF EFFECT (cf. Haiman 1992: 191). As was argued previously, the direct object the flag is the trajector of up in (26), i.e. the flag and up belong together semantically, which is reflected by their syntactic closeness to one another in both word orders. In (27), on the other hand, the sentential subject Fred is the trajector of up, but here the word order of (27b) is ruled out not by an arbitrary rule stating that particles must not follow their landmark. Rather, what rules out (27b) is that the distance between the two linguistic expressions (i.e. up and its trajector) is too large given that the referents belong together semantically/conceptually: (27b) violates CLOSENESS IS STRENGTH OF EFFECT, and thus the distributional pattern can be accounted for in terms of a much more general principle than Yeagle's initial rule, thereby also showing that sometimes semantic considerations can reveal interesting patterns that strictly formal and/or structural approaches fail to account for. One might of course object to this analysis by (correctly) pointing out that, in English, prepositions are heads preceding the NP, but this would only 'explain' why (27a) is grammatical while (27b) is ungrammatical - the question why both word orders of (26) are possible remains unanswered and we still have the problem that up is both a preposition and an adverb. A further objection could be that it is implausible to assume that, in (26), the flag and up belong together semantically since the NP and the particle do not form a classical constituent that can, e.g., be identified by movement transformations/constituent tests. Again, this is correct, but this objection still misses the point slightly. As has been argued by Langacker (1997), Cognitive
Grammar does not rely on classical constituents in the same way other grammatical frameworks do. Langacker distinguishes conceptual constituents and phonological constituents (paralleling his distinction between the semantic and phonological pole of every linguistic sign) and claims that classical syntactic constituents only come into existence when a conceptual constituent happens to be realized by a phonological constituent. In the case of (26), he would probably argue that up and the flag make up a conceptual constituent (since the flag is the trajector of up), but not a phonological one (since the linear order of the two constituents is not predetermined). Thus, up and the flag do not make up a classical constituent that can be identified with the familiar battery of tests, but given the Cognitive Grammar view of conceptual constituents up and the flag still belong together semantically.
13 Distance to last mention and times of preceding mention are probabilistically related: the more often a referent is mentioned in the, say, last ten clauses before a VPC, the more likely it is that it has occurred close to this VPC.
14 Note that Chen's diagram, which is claimed to relate the variable times of subsequent mention to Chen's notion of relevance, rather illustrates the otherwise unmentioned criterion distance to next mention of the direct object (cf. Chen 1986: 85). Presumably, this is merely a typographical error because Chen would certainly not want to equate the variables times of subsequent mention and distance to next mention since they are not necessarily identical. They will certainly correlate positively in most cases: (i) they are probabilistically related since the line of reasoning already put forward in note 13 for the context preceding the VPC also applies to the following discourse, and (ii) both of them can be derived from Givon's (1983) persistence. However, one can easily imagine a situation where these variables could yield contradictory results. Consider an utterance containing a direct object that is, in the following ten clauses, mentioned again only once, namely in the immediately following one. This would result in both a high and a low value of relevance, showing that these variables are not identical. However, for the sake of completeness distance to next mention will be included in the analysis. By the way, this is by no means the only typographical error in Chen's study: there are also several errors, some of which are even concerned with his numerical results: on p. 87, Chen has discussed nine examples of proper names in his study, namely two instances of what is here called construction₁ and six of construction₀ (adding up to 8 rather than 9 instances). Additionally, on p. 88, 124 sentences with a distance to last mention value of larger than twenty have been argued to consist of 42 instances of construction₁ and 132 instances of construction₀ (a value of 174 rather than 124 would have been needed to yield the overall result of 239 instances investigated). Such mistakes and others are not just a nuisance as such, but they also seriously undermine the trustworthiness of the results as a whole.
15 The difficulty of producing a noun phrase was measured in terms of the number of production disfluencies, i.e. the number of 'uh, um, repeats, repairs, etc.' (Arnold and Wasow 1996: 1).
16 This finding is corroborated by my own examination of 1,357 TPVs in Appendix 10.3 where, out of the 717 different verbs entering into VPCs, 583 (81.31 per cent) are monosyllabic; 127 (17.71 per cent) are disyllabic and 7 (0.98 per cent) have more than two syllables (average, deliver, invalid, partition, separate, summarize, telegraph). (Similar findings are reported by Martin (1990): 84 per cent
monosyllabic verbs, 15 per cent disyllabic verbs and 1 per cent trisyllabic verbs, namely examine and deliver.) Of the 134 non-monosyllabic verbs found in my corpus, only 10 (7.46 per cent) are not stressed on their first syllable.
17 More precisely, Fraser's (1976: 21) argument was that 'There appears to be some indication, however, that the stress on a verb denoting the habitual sense [. . .] is heavy compared to the stress on the particle'. A more plausible explanation would be that both the progressive morpheme and construction₁ make it possible to separate two otherwise potentially stressed adjacent items (i.e. the verb and the particle), thereby satisfying the principle of rhythmic alternation, which might also be responsible for the variable phonetic shape.
18 One might argue that the situation is different in written discourse or texts that are written to be spoken (e.g. planned speeches) since here the writer has more time to organize the discourse in an appropriate way and could, in principle, know how often he will refer to the referent of the direct object again in the, say, following ten clauses. While this might be true, it would have to be shown first that one can indeed find such an effect. More importantly, however, spoken language is generally considered more relevant to the study of language, given its structural, functional, biological and historical priority over written means of linguistic communication (cf. Lyons 1981: section 1.4). Consequently, the argument concerning the planned production of written discourse does not necessarily extend to oral language or the linguistic system as a whole.
19 It should be noted, however, that Givon is aware of the fact that thematic importance and referential predictability can yield conflicting preferences (while, unfortunately, Chen is apparently not). Givon (1988: 273-5) discusses empirical data for two such cases (definite NPs in Papago and direct object promotion in Nez Perce) and mentions indirect objects in English, where thematic importance is more powerful in determining word-order facts; still, he points out that 'whether this is a universal tendency remains to be investigated' (Givon 1988: 275). Cf., however, Bock's (1982: 36-8) insightful discussion of the possibly conflicting preferences of thematic importance and referential predictability. She distinguishes not only these two determinants, but also the different roles they have for speakers and hearers, proposing several possibilities to resolve the ordering paradox (cf. also Chapter 8).
20 In some respect at least, this definition of cohesiveness is much too vague in that nearly everything may share semantic features with everything else without really contributing to cohesiveness. For instance, it is difficult to see how the mention of the word ball in a sentence increases the cohesiveness of ball to paper clip: if the only thing the two referents of these words have in common is that they both share the semantic feature [+ CONCRETE], it is highly unlikely that these two referents are cohesive in any way worth mentioning.
21 Admittedly, Peters (1999: n. pag.) concluded that 'If the noun phrase has been previously mentioned (given), it tends to surface much more frequently than expected in the non-canonical position between the verb and the particle', and Table 2:3 does in fact show that the relative frequency of construction₁ in constructions with given referents (45.95 per cent) is higher than the average frequency of construction₁ (37.78 per cent). Still, the frequency of construction₁ with given referents (85, that is) does not differ significantly from the one expected from the marginal totals (namely 70 [= 37.78 per cent of 185]: χ²(1) = 3.262; p = .071). That is to say, the effect Peters claims to have found is not significant and too weak to enable us to guess which construction a speaker will
choose. However, cf. section 6.1.4 for more supportive results of the effects of last mention of the direct object.
22 At first, this fits nicely with many cognitive linguists' belief that linguistic and natural object categories are organized similarly, but it still has an undesirable side-effect: in the case of pronouns (e.g. it) one has to decide for assigning entrenchment values either on the basis of the referent or on the basis of the word class. If, e.g., the direct object of a VPC is it, referring to love, then one can, in principle at least, either assign the value of 9 (3rd person singular pronoun) or the value of 1 (abstract entity) to this direct object. Of course, one would probably choose the value on the basis of the referent (i.e. 1) because entrenchment is intended to measure the referent's entrenchment in semantic/encyclopaedic memory - if one chose the value reflecting the pronoun, one would not measure the cognitive entrenchment of the referent in semantic/encyclopaedic memory, but rather its news value since pronouns are generally chosen to denote discourse-familiar referents.
23 Cf., e.g., some of the hierarchies discussed by Siewierska (1988) or the scales suggested by Givon (1988: 249) such as the code-quantity scale and other quantitatively validated topic-scales.
24 Although the correction for shrinkage is not necessary here, given the sample size (N = 150) and the number of predictor variables (P = 11; cf. Werner 1997: 91-2), multiple R after the correction for shrinkage by Olkin and Pratt is about equally high, namely R = .88.
25 A partial correlation is 'a correlation between two variables that remains after controlling for (i.e. partialling out) one or more other variables' (StatSoft 1999: s.v. 'Partial Correlation').
26 While entrenchment does not seem to influence particle placement if the above-mentioned entrenchment hierarchy is used as an operationalization of entrenchment, one might of course suspect that entrenchment still plays a role, if measured differently. Since entrenchment has been defined as being a function of the 'frequency of successful use of a concept' (Deane 1992: 34), a plausible assumption is that the entrenchment of the referent of the direct object could be approximated by its overall frequency in the language. Thus, in the corpus data to be discussed below, I assigned to each non-pronominal direct object a value from 5 (very frequent) to 0 (very infrequent); the frequency values were taken from the Collins Cobuild R-Dict on CD-ROM 1999, based on 200m words from the Collins Cobuild Corpus (Bank of English); in cases where the direct object was complex, the frequency of the head noun was used; in the very few cases of co-ordinated nouns in the object noun phrase, the average of the frequency values was used. A correlational analysis shows, however, that there is no significant correlation between the frequency value of the head noun and the choice of construction: γ = -.02; z = .344; p = .731. Thus, either entrenchment really plays no role at all or we still lack a proper way of operationalizing it.
27 The standards I refer to are objectivity, reliability and validity, all of which I consider to be violated by this kind of acceptability-based analysis. It has empirically been shown quite impressively (e.g. Nisbett and Wilson (1977) as well as Nisbett and Ross (1980) for supportive examples from psychological research) that introspection is one of the least objective and reliable ways of gathering adequate data on subjects of interest other than introspection bias itself. In linguistics, acceptability judgements both of a single person at different times
and of different people at the same time vary greatly, as was demonstrated by Labov's (1975) discussion of many studies showing the unreliability of acceptability judgements for non-clear cases beyond any reasonable doubt; cf. Schutze (1996) and Cowart (1997) for book-length treatments of shortcomings of naively gathered acceptability judgements. This opinion is shared by other researchers, too: 'most studies of VPCs are notable for the overabundance of asterisks or noncommittal question marks prefacing their sentence examples' (O'Dowd 1998: 42). As to validity, from a more data-oriented perspective one might question whether the artificial-sounding sentences introspective linguists often use to back up their analyses bear any relation to naturally occurring data and to the cognitive processes underlying these data. Polemically speaking, nothing is gained by the analysis of sentences about which native-speaker informants regularly say things such as 'I could say that - but I never would.' More seriously, it is far from clear that the ability to provide acceptability/grammaticality/naturalness judgements tells us something about one's knowledge of one's native language: 'no one has ever presented even a hint of evidence that any part of the human's linguistic competence is the ability to evaluate sentences produced artificially, out of context' (Prince 1991: 80).
28 This is not to say that empirical (in the sense of corpus-based or experimental) approaches are a guarantee for noteworthy and/or reliable results. Apart from the above points of criticism raised against Chen (1986), it is also quite surprising to see that Chen has made extensive use of cross-tabulation in order to support his claims with 'solid data', but none of the results has ever been subjected to tests of significance. This is, of course, not characteristic of Chen's (1986) analysis alone.
Rather, it is just one instance of an attitude towards statistics that has been neatly summarized by Wardaugh (1986: 153):
Many sociolinguists have tended not to be very rigorous in their statistical treatments, but this has not stopped them from very strong conclusions, which seem 'obvious' and 'interesting' to them; whether these conclusions are 'significant' in the sense of having met a good statistical test of a well-stated hypothesis is hardly ever a concern.
While the quote is aimed at sociolinguistic studies, the drawbacks of Chen's analysis show that such problems are far from being restricted to sociolinguistic studies alone. Givon (1992a: 308, 317) has raised similar issues also supporting this point; cf. below. I also do not mean to imply that significance is all that counts, but again following Wardaugh (1986: 184), checking for significance 'is the only proper test once you have decided to go the statistical route in investigating the distribution of phenomena'.
29 Of course, it is linguists especially who would probably argue that, if some scholar got a result similar to the one in Table 2:6 and simply said that complex objects tend to occur in construction₀, then he would violate Grice's maxim of quantity in such an extreme way as not to be co-operative anymore so that this 'result' would probably not be published. Nevertheless, I hope the much more general point has become clear.
30 This is, of course, not to imply that every single study has aimed at finding such an account for particle placement: many structuralist analyses do not pursue such a goal - rather, they focus on finding the 'correct' or most elegant structural description of the two constructions.
31 The preceding paragraphs are also not meant to imply that other researchers have not realized that variables can interact in various ways since, e.g., even the heavily criticized approach by Fraser has briefly commented on this issue:
The range of linguistic factors which influence particle position are not only greater than a casual examination might indicate, but, predictably, interact with one another to create a truly complicated array of data. (1974: 571)
My point, however, is that this complexity has not been dealt with in any satisfactory way: multifactorial approaches to particle placement do not exist. This is all the more astounding since even previous analyses should have stumbled over the few interactions that have been identified, e.g., stress × pronoun (cf. Fraser 1976) or pronoun × idiomaticity (cf. above section 2.3). For many other well-known cases of syntactic variation, the situation is similarly problematic (with laudable exceptions, namely Browman (1986) and Siewierska (1993), as well as the studies by Leech et al. (1994), Arnold and Wasow (1996) and Arnold et al. (2000)).
32 The notion of explanation will be explained in Chapter 3; for the moment, the everyday meaning of explanation will suffice.
33 In this respect, it should also be noticed that, although the majority of variables influencing particle placement was already known in early traditional grammarians' accounts, many of these accounts such as Western (1906), Van Dongen (1919), Fijn van Draat (1921) and Kennedy (1920) are unfortunately not recognized in later analyses so that the wheel has obviously been repeatedly invented anew.
34 One of these instances is Preposition Stranding, i.e. the (often stylistically marked) alternation between Which newspapers do we maintain strict editorial control over? and Over which newspapers do we maintain strict editorial control?
Although, again, both word orders are equivalent with respect to their satisfaction conditions, the former word order is much more common in spoken language (and hardly ever to be found in written language) whereas the latter prevails in written language (cf. also below).
3
Objectives of this study
The present study pursues two kinds of objectives: one is linguistic, the other is methodological in nature. Given the criticism formulated previously, the linguistic objectives are the description, explanation and prediction of particle placement. With the descriptive part of this work (cf. sections 6.1 to 6.3), I intend to answer the question 'what happens?' More specifically, the present study will be the first one to provide an exhaustive characterization of the way (i.e. with which degree of strength and in which direction) the variables discussed previously contribute to the native speaker's decision for one of the two constructions. In this respect, this study is (after 100 years of research) the first one to investigate empirically and quantitatively:
• the impact of each variable in isolation (in order to review the descriptive adequacy of previous accounts);
• the way variables that are relevant for particle placement interact with one another in cases where values/levels conflict with respect to the construction they prefer or even require;
• the degree to which particle placement can be satisfactorily described at all, given the inventory of variables that has been proposed so far; i.e. the status of linguistic research on particle placement will be quantified in order to show 'how far we have come'.
As regards explanation, I intend to answer the questions 'why does particle placement happen the way it does?' and 'what pattern lies behind the subconscious decisions of native speakers?' The answers to these questions (to be dealt with in Chapters 4 and 8) will be found by an analysis of particle placement that integrates those variables relevant for particle placement into a single account and, following Givon (1979: 3-4,
8), relates the phenomenon under investigation to 'something more profound', namely to 'explanatory parameters of language'; the most important of these are the propositional content and the discourse pragmatics of the VPC and the cognitive processes in interlocutors. Lastly, prediction (cf. section 6.3) is concerned with the following. Generally, particle placement has been analysed in such a way that either some variables were postulated and shown to be relevant on the basis of some
example sentences, or the results of an empirical analysis of data enabled the researcher to identify distributional patterns or frequency differences of the variables (that were argued to derive from structural requirements or communicative functions of the two constructions). However, no attempt has ever been made to test the hypotheses by testing the actual predictive power of the analyses. Inventing sentences is one (from my point of view, inadequate) way of supporting one's analysis; an already much better way is to arrive at (hopefully significant) post hoc generalizations on the basis of previous accounts or empirical data, but the most rigorous and fruitful alternative is to find out how well one's own analysis can predict what speakers will actually do 'in real life'. This is not to say that predictive power in this sense is the sine qua non for linguistic research - it only means that the value of an already argumentatively reasonable account of a linguistic phenomenon would be enhanced much more if it also turned out to produce an output sufficiently similar to what the native speaker does (whose linguistic production is being investigated). Figure 3:1 gives one piece of discourse (from the British National Corpus) and will help to clarify this idea. Here we see how a piece of discourse stretches over time and develops over several clauses up to a clause where the speaker must choose one of the two possible constructions. The short arrow shows the latest possible point where the speaker, who, so to speak, knows the words he will use as well as the information status of the words' referents, decides on a particular word order. Native speakers effortlessly and subconsciously seem to weigh all
Clauses 10 to 1 before the clause in which particle placement will occur:
That does not mean plummeting house prices the old adage about most people simply refusing to move rather than . . . remains as true as ever. Rather, it looks as if we will have to wait until the end of 1991 before we get any dramatic improvement. 'The market is so bad now that it won't get any worse, at least in the South,' said Richard Roberts, an economist with Barclays Bank. 'It will go on being like it is for longer.'
Many estate agents were whistling to keep . . . [up their confidence] or [their confidence up]
↑ (latest possible point of decision)
Figure 3:1 A piece of discourse up to the point of decision for a construction
of these values/levels simultaneously and arrive at a natural-sounding construction, but note how bewilderingly complex this situation is from an analyst's point of view: every single one of about twenty variables dealt with so far has a specific value/level that is associated with some hitherto unknown degree of importance. In this case, the direct object is two words/four syllables long, not complex, lexical, abstract and inanimate and has a definite determiner; it was not mentioned in the preceding discourse (and will not be in the subsequent discourse); the verb phrase is slightly idiomatic (metaphorical) and there is no following directional prepositional phrase. Given this large number of variables and the intricacy of the situation, it seems quite difficult to foresee speakers' constructional choices in all cases and in a principled fashion. Consequently, the final linguistic objective of my study is to be able to predict with sufficient accuracy which construction the speaker will choose, given the communicative situation he is engaged in. In this way, the analysis is subjected to a much more demanding test than is traditionally the case (if a test is applied at all). Commonly, as was argued earlier, hypotheses in the area of syntactic variation have been 'proven' either with intuitive acceptability judgements or with some post hoc correlational measures or cross-tabulation.
Equally important, in the long run, is the methodological goal of the present study. As has been shown repeatedly in quite some detail by now, previous research on particle placement suffers from a variety of drawbacks, the most devastating of which is the monocausal perspective on an inherently multicausal problem. Thus, the methodological objective to be pursued is to show how particle placement or, more generally speaking, instances of syntactic variation should be investigated instead.
Put differently, on the basis of particle placement, I would like to establish new methods for analysing syntactic variation and show that many new insights can be gained by combining multifactorial statistical procedures and a cognitively oriented perspective. The question arises why such an approach has not been advanced much earlier: the reason for this lack of more advanced statistical procedures in cognitive-linguistic research is probably simply a lack of statistical knowledge and is thus a matter of convention. While data-driven statistical methods have a long tradition in many other fields of scientific research, modern linguistics has been predominantly theory-driven, and the most predominant linguistic school of thought, namely Chomskyan generative grammar, has rejected quantitative investigations of (corpus-based) performance data. The discipline is only slowly becoming more data-oriented, and corpus-based studies and quantitative analyses of linguistic data have become increasingly common, especially with the advent of computerized corpora. For instance, the statistical investigation of socially conditioned variation in sociolinguistic studies in the 1970s already utilized multifactorial statistical techniques. My own introduction of multifactorial analyses into the investigation of syntactic variation from a cognitive-functional perspective is inspired by:
• the now classic studies of Biber (1988, 1995), where multifactorial statistics, e.g. cluster analysis and factor analysis, have made it possible to gain extremely valuable insights into register variation that are impossible to reach otherwise;
• the studies by Browman (1986), Leech et al. (1994) and Arnold et al. (2000), where cases of syntactic variation (namely VPCs, the English genitive [the speech of the president vs. the president's speech] and the Dative Alternation as well as Heavy NP Shift) have been investigated with ANOVAs and logistic modelling/regression, though not from a cognitive-functional perspective.
Since there is nothing inherent in syntactic variation (or many other superficially syntactic phenomena) and a cognitive-functional orientation that rules out this methodology and similarly promising results, I hope to show how rewarding the results of such an analysis can be. The following two quotations perfectly illustrate the line of reasoning advocated in the present study:
Although linguists . . . typically do not use statistical techniques, the approach just illustrated fits conceptually with correlational models using multiple regression analyses . . . [i.e.,] with a more complex design we can obtain information that is not readily available by armchair analysis. (Bates and MacWhinney 1982: 181)
I take a linguistic theory of high explanatory value to be one in which [interdependent] forces are not only analyzed in isolation . . . (Lambrecht 1994: 12).
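The predictive test described in this chapter can be illustrated with a deliberately simple Python sketch: several variables are combined into one prediction per case, and the proportion of correct predictions is compared with the accuracy reached by always guessing the more frequent construction. The weights, the variable names, the cases and the observed choices are all invented for illustration; the weighted sum is merely a stand-in and is not claimed to be the statistical procedure used in this study.

```python
from collections import Counter

# Hypothetical weights, invented purely for illustration: positive values favour
# the order verb - object - particle (coded 1), negative values favour the order
# verb - particle - object (coded 0).
WEIGHTS = {
    "object_is_pronoun": 2.0,
    "object_previously_mentioned": 1.0,
    "object_length_in_words": -0.5,
    "vp_is_idiomatic": -1.0,
}


def predict_construction(case):
    """Combine all variable values of one case into a single prediction (0 or 1)."""
    score = sum(WEIGHTS[name] * case[name] for name in WEIGHTS)
    return 1 if score > 0 else 0


def accuracy(predicted, observed):
    """Proportion of cases in which prediction and observed choice coincide."""
    hits = sum(1 for p, o in zip(predicted, observed) if p == o)
    return hits / len(observed)


def baseline_accuracy(observed):
    """Accuracy obtained by always guessing the more frequent construction."""
    most_common_count = Counter(observed).most_common(1)[0][1]
    return most_common_count / len(observed)


def make_case(pronoun, mentioned, length, idiomatic):
    """Helper that builds one invented case from four variable values."""
    return {
        "object_is_pronoun": pronoun,
        "object_previously_mentioned": mentioned,
        "object_length_in_words": length,
        "vp_is_idiomatic": idiomatic,
    }


# Invented cases; the observed choices are NOT corpus data.
cases = [
    make_case(1, 1, 1, 0),
    make_case(0, 0, 4, 1),
    make_case(0, 1, 2, 0),
    make_case(0, 1, 1, 0),
    make_case(0, 0, 3, 0),
]
observed = [1, 0, 1, 1, 0]
predicted = [predict_construction(case) for case in cases]

print(predicted)
print(accuracy(predicted, observed), baseline_accuracy(observed))
```

The comparison with the baseline matters because a high raw accuracy is uninformative if one construction is much more frequent than the other.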
4
Key notions and hypotheses
This chapter will introduce a hypothesis as an explanation of the distributional pattern of the two constructions. This hypothesis is related to that of Gries (1999) and is based on cognitive-psychological concepts such as processing cost as dependent on, among other things, attention allocation, the storage and retrieval/activation of concepts and structural complexity of linguistic expressions; my discussion is based on Givon (1992b) and Lambrecht (1994); the cognitive-psychological background is discussed in, e.g., Anderson (1996). Given the previous sections, it will come as no surprise that I do not consider particle placement to be a purely formally motivated phenomenon where two structurally similar constructions are transformationally related - rather, I am convinced that the alternation of VPCs is a manifestation of processing requirements on the speaker. More precisely, I put forward the following hypothesis as an explanation of particle placement.
(44) The Processing Hypothesis: by choosing one of the two constructions for an utterance U a speaker S adapts to the processing requirements of the two constructions in two respects, namely his own production of U and U's comprehension by the hearer H:
1 By choosing a construction, S indicates his assessment of the amount of the processing cost of U required for its comprehension by H and, thereby, simplifies H's comprehension process.
2 By choosing a construction, S subordinates to different processing requirements of both constructions in that he formulates U in such a way as to communicate the intended message with as little processing effort as possible.
This means that the choice of word order will serve to facilitate processing; more specifically, for most variables at least, this means that construction₀ will be preferred for VPCs with direct objects requiring a lot of processing effort - construction₁ will be preferred for VPCs with direct objects requiring little processing effort.1
From (44), several different but intimately related questions arise. First, what exactly are the determinants of the processing cost associated with VPCs
and why should the two different constructions be associated with different degrees of processing effort at all? Second, if the two constructions are indeed associated with processing requirements, why should the association exist in the way predicted rather than the other way round? Finally, since most variables discussed so far in the literature were not explicitly related to processing cost, how can they be integrated into the present account? The remainder of this chapter is devoted to the discussion of these issues, and I will start by investigating how the different kinds of variables reviewed in Chapter 2 are related to or indicative of the processing cost of an utterance; for expository reasons, the order in which the kinds of variables are discussed has to be reversed.
4.1 Discourse-functional variables
If two interlocutors are engaged in linguistic communication (I take this to be the primary purpose to which language is put), their interaction generally involves referring to entities and predicating something of these entities: the speaker verbally denotes some referents and expresses some states of these referents or relations of these referents to other entities. At the same time, the hearer tries to interpret the content of the speaker's original message. This interpretation of the utterance on the part of the hearer is an active process involving many inferential processes, which in turn require attentional resources for identifying and accessing the concepts evoked by the utterance in order to process the message (e.g. by access to long-term memory). The referents within a linguistic expression uttered by the speaker can, thus, differ with respect to two distinct criteria, namely their identifiability and their degree of activation (cf. specifically Lambrecht 1994: Chapter 3 where these concepts are introduced as information-structure categories).2 Consider Figure 4:1 for a classification of concepts along the two criteria.
Concepts evoked by the speaker are identifiable for the hearer if they are already stored in the hearer's memory and can be accessed (cf. @); concepts evoked by the speaker are unidentifiable for the hearer (cf. ©) if the hearer has no mental representation of them prior to the speaker's utterance (cf. Chafe 1994: 93-4; Lambrecht 1994: 78; cf. further Tomlin 1986: 133 on shared information (identifiability) and thematic information (activation)). Unidentifiable concepts require most processing effort on the part of the hearer because the hearer has to establish a new mental referential file for them first before they can be further processed. Identifiable concepts, however, come in two different kinds again, namely concepts not having been activated before (inactive concepts; cf. ©) and activated/active concepts (cf. ©). Concepts are inactive if they have not been previously evoked/activated in the preceding discourse; concepts are active if they have already been activated in the preceding discourse.3 The process of activating a concept can take on various forms, two of which are essential to the present study: concepts can be evoked/activated directly or indirectly.
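The classification just outlined can be restated as a small decision procedure. The Python sketch below is a hypothetical illustration only: the names are invented, and the numbers express nothing more than a rank order of activation cost for the hearer, with inactive concepts ranked above active ones in line with the discussion of Figure 4:1.

```python
from enum import IntEnum


class ActivationCost(IntEnum):
    """Ordinal ranking only; the numbers are not measured processing costs."""
    ACTIVE = 1          # identifiable and already activated in the discourse
    INACTIVE = 2        # identifiable but not yet activated
    UNIDENTIFIABLE = 3  # no mental representation in the hearer's memory yet


def activation_cost(identifiable, active):
    """Classify a concept by the two criteria identifiability and activation."""
    if not identifiable:
        if active:
            raise ValueError("an unidentifiable concept cannot already be active")
        return ActivationCost.UNIDENTIFIABLE
    return ActivationCost.ACTIVE if active else ActivationCost.INACTIVE


print(activation_cost(identifiable=True, active=True).name)
print(activation_cost(identifiable=True, active=False).name)
print(activation_cost(identifiable=False, active=False).name)
```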
Figure 4:1 Concepts and their activation cost for the hearer in a communication situation
Activation through direct reference designates the evocation of a concept by naming it explicitly (e.g. the concept of your friend Fred can be evoked by saying Fred or, at a later stage of the discourse when Fred is already known to all interlocutors, he). In this case, the speaker requires the hearer to focus his attention directly on the intended referent (cf. ®). What I have called indirect reference designates the activation of a concept by (i) what Deane (1992: 34-5) has called spreading activation, e.g. by naming synonyms, antonyms or superordinate terms,4 or by (ii) producing an utterance in which the referent can be readily inferred otherwise (what Prince (1981: 236-7) has dubbed inferrability and Lambrecht (1994: 100) has called inferential accessibility of referents); cf. ®. Both ways of activating identifiable but inactive concepts require less processing effort than unidentifiable concepts because it is not necessary to establish a new referential file. However, their activation (by focusing [direct reference] or spreading activation [indirect reference]) is still costly as it involves a search in long-term memory. Finally, identifiable and already active concepts (cf. ©) do not require particular processing effort for their activation since their current activation level can be upheld (cf. Givon 1992b: 25-6). On the basis of this classification of concepts, we can easily relate many of the discourse-functional variables mentioned in Chapter 2 to processing
cost as determined by identifiability and activation. Let us begin with the variable last mention of the referent of the direct object. If the referent of a direct object in a VPC has been mentioned before (e.g. in the ten clauses preceding the VPC), then the referent is very likely to be identifiable and probably even active. Thus, the hearer's processing of this referent proceeds by simply continuing the activation level, which does not require a lot of processing effort.5 If, on the other hand, the referent has not been mentioned before in the preceding discourse or perhaps even never before to the hearer, then the hearer must deliberately assign focus to this unused/discourse-new (or perhaps even brand-new/hearer-new) referent, which involves more processing effort than simply continuing previous activation.6 What we find, then, is that previously unmentioned referents, which are according to the literature more typical of construction₀, require a lot of processing effort whereas active referents, which are more typically found in construction₁, do not require much processing effort. This distribution of constructions was predicted in (44), so we have succeeded in relating this variable to matters of processing cost and activation. This line of reasoning naturally extends to the variable news value (postulated by Erades). Let us now turn to times of preceding mention of the referent of the direct object and distance to last mention of the referent of the direct object. The discussion of these two variables proceeds along very similar lines. The two variables are only important if there is a last mention of the referent of the direct object in the first place, which is why we are only concerned with the difference between inactive/unused and active referents. It is quite clear that the more often a referent has been mentioned in a given stretch of discourse, the more active the referent will be and the less processing effort it will, therefore, require.
Likewise, the shorter the distance to the last mention of the referent, the less likely it is that the referent will already be unused again and the more likely previous activation simply needs to be upheld. In other words, direct objects mentioned frequently and/or recently (which were shown to prefer construction₁) require little processing effort whereas referents mentioned rarely and/or long ago (which were shown to prefer construction₀) require a lot of processing effort. Again, we have related two of the discourse-functional variables to processing cost and arrive at the distribution predicted in (44). Next, the variable cohesiveness of the referent of the direct object to the preceding discourse needs to be dealt with. As was mentioned in section 2.6.1, it has been argued that a concept can be topical without having been explicitly mentioned, but rather by its having been evoked by, say, its hyperonyms. For example, if in the ten clauses preceding a VPC with the flowers a speaker mentions roses very often, then it is plausible to assume that the concept of flowers will already have been activated by the subordinate concept of roses. The flowers is thus a little more active/given (what Bolkestein and Risselada have called cohesive) and requires less processing cost so that the correlation between cohesive referents requiring less effort than new referents and construction₁ exemplified in (30) matches the prediction in (44).
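How the three variables last mention, times of preceding mention and distance to last mention might be derived from annotated discourse can be sketched as follows. The Python code is a minimal illustration under several assumptions: each preceding clause is represented as a set of referent labels, only the ten clauses before the VPC are considered, times of preceding mention is counted as the number of clauses mentioning the referent, and the function name and toy discourse are invented.

```python
def discourse_variables(preceding_clauses, referent, window=10):
    """Derive three discourse-functional variables for one direct-object referent.

    preceding_clauses: list of sets of referent labels, one set per clause,
    oldest clause first (a hypothetical annotation format).
    Only the last `window` clauses before the VPC are taken into account.
    """
    recent = preceding_clauses[-window:]
    # Number of clauses within the window that mention the referent.
    times = sum(1 for clause in recent if referent in clause)
    # Distance in clauses to the most recent mention (1 = immediately preceding clause).
    distance = None
    for clauses_back, clause in enumerate(reversed(recent), start=1):
        if referent in clause:
            distance = clauses_back
            break
    return {
        "last_mention": distance is not None,
        "times_of_preceding_mention": times,
        "distance_to_last_mention": distance,
    }


# Invented mini-discourse: five clauses before the clause containing the VPC.
clauses = [{"roses"}, {"flowers"}, set(), {"flowers", "vase"}, set()]

print(discourse_variables(clauses, "flowers"))
print(discourse_variables(clauses, "book"))
```

Cohesiveness is deliberately left out of this sketch: recognizing that roses can activate the concept of flowers requires lexical-semantic knowledge that a simple match of referent labels does not provide.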
As to the discourse-functional variables concerned with the discourse context following the VPC, I would expect that these variables do not contribute to the processing effort of the VPC on the basis of both my arguments given in section 2.6.1 and the fact that discourse at the time t_x cannot influence the activation of referents at time t_{x-1}; still, in order to test previous analyses, these variables will be empirically investigated as well. In sum, it has been shown how discourse-functional variables are naturally related to the manner of, and the effort associated with, identifying and activating referents in discourse in the direction formulated in the Processing Hypothesis; in the next section, we will be concerned with semantic variables and their relation to processing cost.
4.2 Semantic variables
The semantic variables to be discussed in this section are intricately related to one another and also to the organization of the discourse in which the VPC occurs. As is commonly known, in English expressions denoting new and/or important referents are frequently positioned clause-finally; this phenomenon has been called end-focus. It was shown above that Bolinger's analysis (in terms of semantic focus) and Yeagle's analysis (in terms of different construals of the same objective scenes) can be subsumed under the notion of end-focus. End-focus is, in turn, quite obviously related to processing:
• I have argued above that focused elements require more processing effort (cf. n. 50);
• whichever of the two constituents (the direct object or the particle) is to be emphasized, it will be positioned sentence-finally so that the hearer has already received all the information that is necessary (or at least helpful) to process the information yet to come;
• it was shown that construction₀ results in stronger activation and longer persistence of the referent of the direct object in working memory (Clifford 1990, quoted in Heliel 1994: 143).
Thus, the distribution to be expected from Bolinger's and Yeagle's analyses matches the one predicted by the Processing Hypothesis. Let us now turn to the degree of idiomaticity of the verb phrase. We know that VPCs come in two different word orders, and we also know that the meanings of VPCs range from literal7 to idiomatic. Even if we did not already know from the literature which of the two word orders is more common or acceptable with which degree of idiomaticity of the verb phrase, we could already make an educated guess: in construction₁, the particle is positioned in the canonical position for local elements, i.e. clause-finally, so that the particle is processed more intensively than the direct object. Thus, construction₁ naturally underscores the spatial contribution the particle makes to the meaning of the utterance and would, therefore, be
the natural choice for a speaker who intends to communicate a state of affairs where the spatial meaning is prominent. The structural configuration of construction₁ very closely parallels the definition of the caused-motion construction in Construction Grammar (cf. Goldberg 1995: ch. 3, 8), which is defined as follows (for expository reasons slightly changed): [SUBJ_causer V OBJ_theme OBL_directional] such as Frank sneezed the tissue off the table or They sprayed the paint onto the wall. This structural configuration is then argued to have the following basic meaning: 'the causer argument directly causes the theme argument to move along a path designated by the directional phrase; that is, X CAUSES Y TO MOVE Z' (Goldberg 1995: 152). Even at a superficial glance the very close structural and semantic parallels between the caused-motion construction and construction₁ become apparent: the structural definition and the examples are similar to construction₁ and in, say, Fred brought the book back, the causer (Fred) directly causes the theme argument (the book) to move along a path or up to a point designated by the directional phrase (back).8 Thus, it makes sense to consider construction₁ to instantiate a special subtype of the caused-motion construction. From this parallel, it also follows that concrete referents of direct objects will preferably occur in construction₁ while abstract objects will probably not, since concrete referents can undergo the movement process commonly denoted by construction₁ whereas abstract referents cannot. Moreover, Bock (1982) discusses experimental evidence showing that variables governing lexical accessibility also influence constituent ordering:
the retrieval of words during sentence formulation influences sentence form, partially independent of the sentence's intended substance. Thus, factors that facilitate lexical retrieval are also associated with the early placement of words and constituents in sentences.
(39) Among the variables she mentions we find concreteness: concrete, imageable referents are easier to retrieve than abstract, less imageable referents. Since the degree of retrievability of a concept adds to the overall processing cost associated with the concept, we would expect concrete referents in construction₁ and abstract referents in construction₀. In sum, the notion of end-focus is responsible for the literal/spatial interpretation of construction₁, which is in turn correlated with a preference for concrete referents. However, we still need to account for the preference of idiomatic sentences to occur in construction₀. If the meaning of the VP is idiomatic, the particle does not just add some spatial information to an already straightforwardly interpretable literal sense of the verb (as with literal VPCs such as, e.g., John brought the book back). More precisely, the meaning of the VPC is not compositional: its meaning cannot be computed solely on the meaning of its individual parts and their interrelations in the syntactic structure. In other words, with idiomatic TPVs, the meaning of the expression as a whole is much more dependent (in an intuitive sense) on the co-occurrence of the verb and the particle, which are,
thus, semantically much more dependent on each other. But how do we characterize this dependence more objectively and how do we relate it to processing cost? In an analysis of the order of post-verbal PPs in English, Hawkins (2000) has suggested the following in order to distinguish between dependencies of different strengths between verbs and following PPs. For instance, in He [V accounted [PP for the fact]] the interdependency between accounted and for the fact is stronger than the relation between slept and in the garden in He [V slept [PP in the garden]]. This claim can be supported by two entailment tests:
• the verb is semantically dependent on the following PP if a sentence [S NPSubj V PP] does not semantically entail [S NPSubj V]. While He slept in the garden semantically entails He slept, He accounted for the fact does not semantically entail He accounted;
• the prepositional phrase is semantically dependent on the preceding verb if a sentence [S NPSubj V PP] does not semantically entail [S NPSubj pro-V PP], where pro-V refers to several kinds of dummy verbs such as to do something. While He slept in the garden semantically entails He did something in the garden, He accounted for the fact does not entail He did something for the fact.
Returning to TPVs, the difference between literal and idiomatic VPCs can be dealt with in a similar vein: with literal VPCs, the semantic dependency between the verb and the particle is much weaker since neither the verb nor the particle requires the other component to be assigned an interpretation online (cf. above for John brought the book back). With idiomatic VPCs, the correct interpretation can only be arrived at after both parts of the TPV have been processed: John eked out a living neither entails that John eked nor that John did something to a living. Only both components together yield the most probably intended interpretation; the interdependence is quite strong.
How does this difference between literal and idiomatic VPCs relate to processing and the choice of construction? Again following Hawkins (2000), there is a tendency to minimize what he refers to as lexical dependency domains (cf. Hawkins 2000: 244), i.e. (slightly simplified) the distance between expressions that mutually depend on one another for their interpretation.9 Since in the case of idiomatic VPCs, verb and particle are strongly dependent on one another (in the sense that both are required for a felicitous interpretation of the utterance), they can be processed in a more economical way if they are positioned close to one another in the actual utterance (i.e. construction₀); put differently, it would be uneconomical to process the opaque meaning of a TPV but produce the parts that trigger this opaque meaning in possibly widely disparate positions of the sentence.10 With literal phrasal verbs, by contrast, no preference for a construction is to be expected on grounds of semantic dependency because the low degree of interdependence does not require a particularly small distance between the component parts and, thus, licenses both word orders.11 Thus, the postulated preference of literal VPCs to occur as construction₁ results only from
the above argument based on end-focus and the resulting spatial caused-motion interpretation. With this account of literal and idiomatic VPCs, we can also explain a phenomenon that has so far only been observed by some scholars but that has never been explained, namely the fact that, for sentences which could in principle be interpreted both literally and idiomatically, the literal interpretation is preferred for construction₁ and the idiomatic interpretation is preferred for construction₀ as in (22), here repeated for ease of reference as (45).
(45) a. He threw up his dinner because he got food poisoning. (Fraser 1974: 573)
b. He threw his dinner up because he wanted to stain the ceiling.
In (45a), an idiomatic meaning can be processed easily since the verb and the particle are adjacent; the literal meaning is in principle also possible, but is ruled out in this example by the subsequent subordinate clause. In (45b), by contrast, the main clause focuses on the spatial (and resultative) meaning of the sentence-final particle (along the lines argued above) so the literal meaning is the most natural interpretation; an idiomatic meaning is, however, not licensed as the two components of the TPV are too distant from each other, given their strong semantic interdependence. In this example, the literal meaning is further supported by the message of the subsequent subordinate clause that is concerned with upward movement. The next semantic variable to be related to the Processing Hypothesis is semantic modification of the particle. According to Fraser, only construction₁ allows for perfective or aspectual modification. This observation can be related to the fact (observed by Dirven and Radden 1977: 184-6) that this perfective or aspectual modification is only possible for the literal sense of particles (which, for reasons just discussed at length, in turn prefer construction₁), whereas the perfective or aspectual modification is not possible for idiomatic verb phrases requiring construction₀. Thus, Fraser's observation results from a more general principle, which, in turn, is related to Dirven and Radden's observation. In other words, this variable does not seem to have a direct causal influence on particle placement; rather, it is correlated with another variable (namely the idiomaticity of the verb phrase or, more precisely, the lack of it) that has a direct influence. Be that as it may, the prediction following from the variable conforms to the Processing Hypothesis. Lastly, we are left with the animacy of the referent of the direct object.
This variable was included in the remainder of the analysis because it is one among others that has empirically been found to be relevant for replacing the entrenchment hierarchy as discussed above (cf. section 2.6). Bock's (1982: 17-23) discussion of variables concerned with lexical retrievability referred to above also mentions that animacy of referents contributes to constituent ordering. However, on the basis of the Processing Hypothesis, no significant effect of animacy of the referent of the direct object on particle placement is to be expected: there is no reason to assume that animate
referents yield context-dependent processing requirements substantially different from inanimate referents, and there is also no reason to assume that animate referents are more likely to undergo a literal change of location (due to caused motion) or a change of state; in fact, animate referents are more likely to be found as subjects of the VPCs, i.e. as causers or agents, rather than as the (typically inanimate) objects of the actions denoted by TPVs. Therefore, the variable will be empirically investigated, but is hypothesized not to contribute significantly to particle placement (cf. also Browman 1986 and especially McDonald et al. 1993).
To summarize, we have seen that apart from several discourse-functional determinants of processing effort, a variety of interrelated semantic variables argued to influence particle placement is also strongly connected to processing effort: the notion of end-focus, which ultimately serves the purpose of facilitating the processing (of the most important aspects) of messages, is responsible for slightly different semantic interpretations of the utterance (depending on which element is positioned in the focal position of the clause). Finally, the degree of idiomaticity of the VPC and other variables related to it are also influenced by processing cost. The following section will deal with morphosyntactic characteristics of the two constructions and their relation to processing.

4.3 Morphosyntactic variables
In order to describe the relationship between the morphosyntactic characteristics of VPCs and processing effort, we need to look at two different morphosyntactic elements figuring in these constructions: the complex verb and the direct object noun phrase. Let us first turn to the complex verb. It has been argued elsewhere (Hawkins 1994; Rohdenburg 1996: 150; Gries 1999) that construction₀ simplifies structural processing (i) for the speaker in that the particle immediately follows the verb and, thus, does not have to be borne in mind until the direct object has been uttered and (ii) for the hearer in that he processes the verb and the particle before the direct object is also processed. In construction₁, by contrast, the speaker, while producing the utterance, has to bear in mind that there is still a particular particle to be inserted after the direct object has been completely uttered, and the hearer has to wait longer for assigning the correct parse to the incoming expression, namely until some yet unknown particle completes (and sometimes even disambiguates) the verb.12 But what about the direct objects of TPVs? In order to facilitate communication, language has several means of guiding the hearer's interpretation of the incoming message. These means have a variety of explicit linguistic manifestations with respect to the encoding of the entities referred to by the speaker. In section 4.1, we have seen how referents of linguistic expressions can differ with regard to their degrees of identifiability and activation. In English, the identifiability status of a concept used in an utterance is connected to issues of definiteness,
pronominalization and the syntax of reference - the degree of activation, on the other hand, is connected to issues of voice, word order and sentence prosody (cf. Tomlin 1986: 39-40). Given these morphosyntactic correlates (one might even say, manifestations) of identifiability and activation (and, thus, processing), it comes as no surprise that the morphosyntactic variables influencing particle placement can be related straightforwardly to the processing cost of the utterance. Let us start with the variable NP type of the direct object. This variable is in fact quite closely connected to utterance processing: personal pronouns and semi-pronominal referentially vague nouns are only used when their referents are identifiable and active whereas lexical noun phrases are much more likely to be used with unused referents and are probably always used for brand-new referents. Again, the distribution is the same as predicted above in (44): the active referents of personal pronouns do not require much processing effort and correlate with construction₁ - referents of lexical noun phrase objects are, on the whole, more likely to require more attention and occur preferably in construction₀. Likewise, the determiner of the direct object noun phrase is also concerned with processing aspects: speakers do not decide randomly in favour of definite or indefinite determiners - instead, one can find a fairly clear pattern as summarized in the following quote:
Linguists traditionally deal with the binary distinction between definite and indefinite, with the former marking topics which the speaker assumes the hearer can identify uniquely, is familiar with, are within his file (or register) and thus available for quick retrieval. On the other hand, indefinites are presumably topics introduced by the speaker for the first time, with which the hearer is not familiar, which therefore are not available to the hearer readily in his file.
(Givon 1983: 9-10)
Comment is hardly called for: definite determiners (used for active referents barely requiring extra attention and processing) are said to prevail in construction₁ and indefinite determiners (used for unused or even brand-new referents requiring conscious activation) are said to prevail in construction₀ so that the pattern found matches the expected distribution. Length of the direct object and complexity of the direct object (irrespective of how these are measured; cf. section 5.2 for matters of operationalization) can be dealt with simultaneously. Self-evidently, longer and more complex noun phrases require more processing effort just because of their heaviness and likely complexity while shorter (and commonly simpler) noun phrases can be more easily processed. But apart from this purely structurally motivated approach, there is also a functional principle at work: 'the new information often needs to be stated more fully than the given (that is, with a longer, "heavier" structure)' (Quirk et al. 1985: 1361). Thus, if the newness of a referent, on average, renders direct objects long and complex (in order to provide the information necessary for the hearer), the larger amount of linguistic material requires more processing effort than that needed for given information less heavily encoded. Ultimately, both the structural
variables and their functional motivation go hand in hand and much processing effort is again linked to construction₀ and little processing effort to construction₁.

4.4 Phonological variables

The variable stress pattern of the verb phrase can be straightforwardly related to processing requirements: in functional analyses of information structure within English sentences, it is useful to distinguish two kinds of information, namely given and new information.13 It is common ground that stress on a linguistic expression typically serves to indicate the newness or importance of the referent of this linguistic expression. For the present purpose, it is not necessary to differentiate categorically between stress signalling newness and stress signalling importance - the crucial aspect here is that, by assigning stress to a linguistic expression, the speaker directs the attention of the hearer to the respective referent, thereby increasing the processing cost associated with that referent (cf. again n. 6). Since stress on the direct object yields a preference for construction₀ while stress on the particle yields a preference for construction₁, the assignment of stress to linguistic expressions correlates with the principle of end-focus. The elements that are intended to be processed more thoroughly are positioned in such a way that they can be accordingly processed, resulting in the distribution as predicted in (44): the expressions whose referents are hardest to process occur clause-finally.

4.5 Remaining variables
The next variable is concerned with the presence of a directional adverbial following either the verb or the particle. If a directional adverbial follows the VPC, then it typically serves to elaborate either the path along which the referent of the direct object is being moved (cf. (46a)) or the resultant location of the referent of the direct object (cf. (46b)).
(46) a. So Tom took [NP Peter] along [PP past the new Pump House].
b. Fred put [NP the book] down [PP on the table].
For construction₁, where the spatial meaning is, according to the line of reasoning in section 4.2, foregrounded, it is therefore quite natural to expect additional material (in the form of a directional adverbial) providing additional information on the direction or the endpoint of the movement process; for construction₀, however, the opposite would be expected as construction₀ does not in general denote a movement process that can be further elaborated with information concerning directionality (cf. above). In other words, the directional adverbial as such does not causally influence particle placement, but it is correlated with another variable that does, namely the degree of idiomaticity of the construction. This is what one
would expect since it is implausible to assume that directional adverbials influence particle placement against the unidirectional flow of time; still, at least argumentatively, the distribution predicted in the Processing Hypothesis seems to be fully justified. However, there is a second way of relating this variable to the Processing Hypothesis. Up to now the variable directional adverbial was taken to contribute to particle placement due to its semantic contribution to the meaning of the utterance. Equally possible would be the following effect: compare an analysis of sentences such as those in (46) along the lines of Hawkins's EIC principle or the principle of end-weight to that of the sentences in (47).
(47) a. So Tom took Peter along. Length of NP=1, Length of Particle=1
b. Fred put the book down. Length of NP=2, Length of Particle=1
The EIC principle predicts that longer constituents will be positioned sentence-finally. In (47), the direct objects are short (one and two words respectively) and the particles are short (one word) so EIC or the principle of end-weight do not make strong predictions. In (46) (repeated with additional information as (48)), by contrast, the particles are followed by spatial prepositional phrases. In view of the just-mentioned semantic function of these spatial prepositional phrases, one could assume that the particle and the spatial PPs belong together closely in terms of constituency (just like prepositional phrases as in Fred took the glass [PP out of the cupboard]), yielding a semantic constituent we might informally call 'particle phrase' or PartP for short (for lack of a better term).14
(48) a. So Tom took Peter [PartP along [PP past the new Pump House]]. NP=1, PartP=5
b. Fred put the book [PartP down [PP on the table]]. NP=2, PartP=4
Thus, the PartPs are now much longer than just one word and much longer than the direct objects in the two sample sentences. Therefore, it is only natural that the word order of construction₁ (where the heavy PartPs are positioned sentence-finally) is preferred since it yields a (in terms of processing) more economic word order.15 If, on the other hand, the direct object splits up the PartP (i.e. is inserted between the particle and the following directional adverbial as in (49)), yielding construction₀, then the sentences sound awkward since the expressions that belong together conceptually do not stand together contiguously, violating Behaghel's Law (cf. Behaghel 1930, 1932) or derivations thereof (although the principle of end-weight is satisfied).
(49) a. So Tom took along Peter [PP past the new Pump House].
b. Fred put down the book [PP on the table].
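The length comparison behind (47) and (48) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: constituent weight is approximated by a naive whitespace word count, and the function names are hypothetical. The operationalization of length actually used in the study is a matter for section 5.2 and is not reproduced here, so a whitespace count need not match the figures given in the examples (it yields 6 rather than 5 for the PartP in (48a)).

```python
# Illustrative sketch only: constituent weight is approximated by a naive
# whitespace word count, which is an assumption made for this example.

def length_in_words(constituent: str) -> int:
    """Return the number of whitespace-separated words in a constituent."""
    return len(constituent.split())


def weight_difference(direct_object: str, particle_material: str) -> int:
    """Length of the particle material minus length of the direct object.

    A clearly positive value means the particle material is the heavier
    element, so end-weight would favour placing it last (V NP Part ...).
    A value near zero means end-weight makes no strong prediction.
    """
    return length_in_words(particle_material) - length_in_words(direct_object)


# Direct objects and particle material taken from examples (47) and (48).
examples = {
    "(47a)": ("Peter", "along"),
    "(47b)": ("the book", "down"),
    "(48a)": ("Peter", "along past the new Pump House"),
    "(48b)": ("the book", "down on the table"),
}

for label, (noun_phrase, particle_material) in examples.items():
    print(label, weight_difference(noun_phrase, particle_material))
```

On this rough measure the differences in (47) stay close to zero, whereas in (48) the particle material clearly outweighs the direct object, which is the contrast the argument relies on.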
Finally, construction₀ with a pre-posed PP is also not a good alternative by
means of which one can leave together structurally what belongs together conceptually since then the word order severely violates the principles of EIC and end-weight: the heavy elements are not positioned sentence-finally; cf. (50).
(50) a. *So Tom took [PartP along [PP past the new Pump House]] Peter. PartP=5, NP=1
b. *Fred put [PartP down [PP on the table]] the book. PartP=4, NP=2
By now, it should have become clear that presence of a directional adverbial can in principle be related to both semantic and morphosyntactic determinants of processing. However, this does not pose a problem to the present analysis since, to whichever hypothesis we relate presence of a directional adverbial, the predictions deriving from that variable are identical: following directional adverbials yield a preference for construction₁, be it on the grounds of length and processing of the particle and the directional adverbial or on the grounds of the semantics of the two constructions. Next, it was argued that if the particle is identical to the preposition of the following directional prepositional phrase, then construction₀ is preferred. This variable does not as readily relate to the Processing Hypothesis proposed above. Still, it was shown above how the two constructions differ in their processing requirements irrespective of the (specific properties of the) direct object; more specifically, construction₁ was claimed to place a burden on both the speaker's and the hearer's working memory as both have to wait for the production and the comprehension of the particle respectively. Now imagine a situation where the speaker produces an utterance using construction₁ so that the particle is positioned sentence-finally. In this case, the hearer will wait for the particle to complete the verb phrase. If, however, the particle of the TPV is uttered twice (from the speaker's perspective, as the first word of the following directional PP - from the hearer's perspective, for a split second, as a senseless repetition of an item already given) it is plausible to assume that the hearer will be slightly confused. In other words, construction₁ would in such circumstances be even harder to process for the hearer than it is already on its own. Therefore, it is plausible to assume that speakers will tend to avoid this constellation in order to facilitate communication.
That is, this variable is also related (though more remotely) to processing aspects. Admittedly, this is up to now only a speculation that needs to be supported by psycholinguistic experimentation, where reaction times and EEGs could be used to measure whether subjects are surprised by the two occurrences of the particle.16 Finally, consider Arnold and Wasow's (1996) evidence on how production and planning constraints influence particle placement. Their observation is intimately related to the length of the direct object and ties in very nicely with the Processing Hypothesis. It is natural to assume that (long) direct objects that are difficult to plan and produce require more processing effort while their exact formulation has to be figured out; on the other hand, direct
objects that are easy to plan and produce require less processing effort. Since Arnold and Wasow found that difficult and simple direct objects tend to occur in construction₀ and construction₁ respectively, these predictions fit those of the Processing Hypothesis perfectly.

4.6 Interim summary
So far we have seen that many of the variables discussed in Chapter 2 have been integrated into the hypothesis put forward in this study while some have been argued to be irrelevant to the processing cost of the VPC. However, the attentive reader may have noticed that not all of the variables discussed above have been incorporated so far, namely the phonetic shape of the verb, the habitual sense of the verb and, finally, the degree of cognitive entrenchment/familiarity. This is owing to their lack of empirical support (or even their empirical falsification) and their inherent difficulties, and they will not be analysed any further. As to the variables to be included in the remainder of the analysis, we have seen that most of them bear a close relation to aspects of processing an utterance within a given stretch of discourse. Thus, prima facie evidence for the Processing Hypothesis, which postulated a causal relationship between the choice of a construction and its processing effort, has been gathered. The processing effort of an utterance was shown to derive from several interrelated factors so this chapter can be neatly summarized in Figure 4:2.

[Figure 4:2 Determinants of processing effort and particle placement. The processing (cost) of the utterance ranges from high (construction₀) to low (construction₁) and is determined by: phonological aspects, e.g. stress of the direct object; morphosyntactic aspects, e.g. length and complexity of the direct object, early/late completion of the phrasal verb; semantic aspects, e.g. idiomaticity of the VP, concreteness of the referent of the direct object; information-structural aspects, e.g. last mention of the direct object's referent (identifiability), times of preceding mention of the direct object's referent (degree of inactivation).]

However, so far the relation between the variables and the Processing Hypothesis has only been established on an argumentative basis. If the Processing Hypothesis was correct we would also expect to find empirical support for it that does not suffer from the drawbacks of other accounts outlined in section 2.6. Following established statistical convention, the statistical hypotheses following from the line of reasoning so far are:
(51) H0: There is no statistically significant relation or difference (be this relation or difference measured by using means, correlation coefficients or absolute/relative frequencies) between the monofactorial statistical values of the variables calculated for construction₀ and those calculated for construction₁.
(52) H1: There is a statistically significant monofactorial relation (again, irrespective of the statistical technique used) such that direct objects with a high amount of processing cost (as indicated by the variables investigated in the ways argued for in the previous sections in this chapter) correlate with construction₀ whereas direct objects with only little processing cost correlate with construction₁.
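To make the logic of (51) and (52) concrete for a variable measured in absolute frequencies, the following Python sketch computes a Pearson chi-square statistic for a 2x2 table. It is a minimal illustration and not the procedure of this study: the hypotheses above are deliberately neutral as to the statistical technique, the function name is hypothetical, and the cell counts are invented purely for demonstration and are not corpus results.

```python
# Illustrative sketch only: one possible monofactorial test for a binary
# variable. The counts below are INVENTED for demonstration purposes.

def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for the table

                     construction0   construction1
        level 1            a               b
        level 2            c               d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts: pronominal objects (level 1) vs lexical objects (level 2).
statistic = chi_square_2x2(10, 40, 35, 15)

# 3.841 is the critical chi-square value for df = 1 at the 5% level.
CRITICAL_VALUE = 3.841
if statistic > CRITICAL_VALUE:
    print("reject H0 for this (invented) table:", round(statistic, 2))
else:
    print("H0 cannot be rejected for this (invented) table:", round(statistic, 2))
```

A statistic above the critical value would count against H0 for that one variable only; it says nothing about the direction of the effect, which H1 additionally requires, nor about interactions between variables.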
The following chapter introduces the nature and the treatment of the data, while Chapter 6 looks at a multitude of empirical results in order to see whether the Processing Hypothesis is supported or not.

Notes

1 As the formulation of the Processing Hypothesis suggests, there are two different perspectives on these two criteria: the first depends only on S's assessment of the state of the concepts in H's mind (cf. (1)), while the second depends on the state of the concepts in the mind of S (cf. (2)) - however, we may not need to choose categorically between them. Typically, S can assume that the processes in H's mind are in harmony with those in his own mind. For the moment, we will concentrate on the processes in S's mind since it must be S's assessment of his own and H's mental processing that takes priority if the language is to perform its communicative function satisfactorily: '[l]anguage works best when the expression of activation cost is listener-oriented' (Chafe 1994: 75; cf. also Siewierska 1988: 84-5). In sections 7.2 and 7.3, we will return to the question of which of the two perspectives on mental processing (S's or H's) is more important for VPCs; moreover, Chapter 8 is devoted to another analysis, which is conceptually different from but compatible with the Processing Hypothesis.
2 Note that the notion of activation is used here not in the psycholinguistic sense; cf. Ch. 8 for a more formal treatment of activation within contemporary interactive activation models.
3 Note that two kinds of newness of a concept are involved: a concept can be discourse-new and/or hearer-new (cf. Birner and Ward 1998: 14-15; Lambrecht 1994: 105-9; Prince 1981: 235-7). Following Prince, unidentifiable (i.e. hearer-new and, thus by necessity, at the same time discourse-new) referents are called brand-new, identifiable but inactive (i.e. hearer-familiar but discourse-new) concepts are called unused.
4 Cf. the obvious parallelism to the above-mentioned line of reasoning by Bolkestein and Risselada (1987) concerning cohesiveness.
5 If I say that active referents do not require much processing effort, then this is meant to refer only to the processing effort resulting from the givenness of the referent, leaving aside other determinants of processing effort. It is possible (though probably not very likely) that a given referent is encoded with a lot of morphosyntactic material. In such a case, the discourse-functional processing effort of the direct object's referent would be low whereas the morphosyntactically motivated processing effort would be high. Even such a simple example supports my above reasoning, namely that interactions between different variables or variable groups need to be accounted for. In further sections of Chapter 4 where I deal with individual variable groups, I am only concerned with the respective variables' impact on processing, in the same way as I was just concerned with discourse-functional determinants of processing cost only.
6 The question may arise why the focus on some linguistic expression should increase its processing cost. Apart from the intuitive appeal of this idea, it has been shown (cf. Roth 1997: 228-31) that the allocation of attention by the brain is to a large degree determined by the newness and/or the importance of the entities to which attention is allocated; naturally attention is allocated to those experiential aspects that are new and/or important so that such a concept will be processed more thoroughly, thereby increasing its processing cost.
7 The literal meaning of VPCs is spatial due to the prepositional nature (i.e. the locative/spatial meaning) of the particles.
8 One might object to the structural parallel given above by correctly pointing out that the two sentences exemplifying the caused-motion construction have full prepositional phrases as directional phrases rather than just a particle. However, it has in different (one could even say, opposing) linguistic schools been argued that the particle in TPVs instantiates a prepositional phrase without an overt NP or, put differently, that particles are intransitive prepositions (cf. Transformational-Generative treatments by, e.g., Emonds 1972: 547-8 or similar discussions by Aarts 1989: 283, 1992: 81; Den Dikken 1995: 270; Radford 1988: 90-100). Analogously, it has been postulated that 'particles are not distinct from the class of prepositions: they are simply prepositions employed in grammatical constructions where the landmark happens not to be elaborated, as it otherwise normally is' (Langacker 1987: 243). Thus, there is clearly a close structural similarity between the caused-motion construction and construction₁ even if we find no structural identity. This similarity is even more obvious if we consider that Talmy (1985: 102-10) has shown that motion events can be decomposed such that semantic features like PATH can be separated from the verbal nucleus and lexicalized as independent satellites in order to foreground the respective semantic feature, here PATH (observe the parallel to the historical development of VPCs); not surprisingly, the first example Talmy mentions is VPCs. One might also object to equating the directional prepositional phrases of the caused-motion construction with the particles of VPCs which sometimes have a resultative meaning rather than a spatial one. However, in terms of metaphorical relations along the lines of Lakoff and Johnson (1980), achieving a result is reaching the endpoint of a path. Thus, again we have a semantic (metaphorically motivated) similarity between both construction types.
9 This tendency can be related to the fact that syntactic processing is immediate (i.e. the processing of an element within an utterance takes place immediately after the element has been encountered by the listener; cf., e.g., Tyler and Marslen-Wilson (1977), Carlson and Tanenhaus (1988), Clifton et al. (1991), to name but a few).
10 This strong interdependence of verb and panicle in idiomatic VPCs might be responsible, for a variety of scholars suggesting that the verb and the particle belong together so strongly that the particle forms a part of the verb; recall Bolingcr's (1971) suggestion that in construction,, the particle behaves like a verbal affix. Especially, but not exclusively, in the Transformational-Generative paradigm, a variety of different proposals have been made. Stowell (1981) proposes a rule of NP Incorporation and a rule of Particle Incorporation, where an NP and a particle respectively are incorporated into a verb. For construction,,, Particle Incorporation yields a new complex unit (a verb with an incorporated particle) that subcategorizes for an NP (for the sake of completeness, for construction,, NP Incorporation followed by Particle Incorporation yields a new complex unit, namely a verb with an incorporated direct object). For a similar approach also using particle incorporation cf. Baker (1988); other approaches to particle placement based on a complex verb analysis of the VPG are Johnson (1991, 1992); Neelernan (1994); Radford (1988). Collins and Thrainsson (1996) also rely on incorporation (of the particle into a covert be in V,) to derive construction,, from construction,. Their paper is worth quoting for what they claim to be a justification of the movement processes they postulate: 'Perhaps the particle is optionally analyzed as affixal in some sense, which we will write as Prt [affix].' (Collins and Thrainsson 1996: 431; my italics). Given this level of abstractness and optionality, this 'justification' is singularly unhelpful, and Collins and Thrainsson seem to be unaware of Bolingcr's somewhat less vague proposal. 
Furthermore, Lindner's (1981: 189) analysis also supports the assumption of 'alternative constituencies': construction₀ and construction₁ are argued to differ in terms of constituency such that in construction₁ V + NP constitutes a composite scene to be modified by the following particle whereas in construction₀ V + Part constitutes a composite scene; cf. above n. 22 for some comments on the constituency of VPCs and Langacker (1997) for a more detailed account of constituency in Cognitive Grammar.
11 This claim is supported by the independent observation that idiomatic expressions are in general much less susceptible to syntactic rearrangements than literal expressions.
12 Consider, for instance, the verb beat. Beat can be used transitively (as in, say, He had a gilt-edged opportunity to beat Stephen Hendry, so far the man of the season with four first prizes) or it can be used with a particle (namely up) as a TPV (as in, e.g., The last time I cut my hair my father beat me up very badly). Thus, in the case of TPVs, it frequently is the particle which allows disambiguation. To my knowledge, there are no experimental data on this in English, but experiments on the basis of event-related potentials (ERP) for German separable complex verbs have revealed that, for sentences such as Er lächelte den Bauherrn an, one can show that the particle an (which changes the intransitive verb lächeln into the transitive verb anlächeln) is accompanied by the so-called P300 component, which is generally taken to reflect the resolution of an uncertainty (cf. Urban and Friederici 1999a, 1999b). Thus, the evidence from German at least supports the line of reasoning for the English data although investigation of English data is still necessary. The question arises as to why there are two constructions at all if one of them is inherently simpler to process than the other one.
This has already been quite problematic for Hawkins (1994) who has argued that construction₁ is basic rather than construction₀. The reason for this is that if construction₀ was basic, then there would be no reason why construction₁ should have come into existence at all since construction₀ is optimal as far as the processing of the immediate constituents is concerned. Cf. section 7.1 and Chapter 8 below for further discussion of this issue.
13 These two terms are naturally related to the slightly more elaborated differentiation of concepts in Figure 4:1. A variety of other dichotomous terms has been introduced as being related to or compatible with the given-new distinction such as topic-focus; topic-comment; theme-rheme. However, these terms are not uniformly used in functional schools of linguistics and can, thus, not be used interchangeably. I use given and new here since (i) these simple terms suffice to communicate what I intend to say without being as theory-laden as the other proposals and (ii) the remaining discussion will employ the terminology used in Figure 4:1 anyway. For an attempt to bring order into the terminological 'chaos', cf. Östman and Virtanen (1997); for an approach defining topic and focus in relational rather than absolute terms, cf. Lambrecht (1994).
14 Recall that the account of constituency used here allows for semantic, phonological and classical constituents; cf. n. 12. My use of the term PartP is simply a convenient label for this semantic constituent without implying that PartP is a classical constituent such as PP, which could be identified using classical movement tests. In this respect, the motivation for arguing that the particle and the PP make up a semantic constituent is straightforward: the particle and the PP belong together semantically in that together they provide all the spatial information about the movement process the referent of the direct object undergoes. However, there is also a second line of reasoning supporting the preliminary analysis of the particle and the following PP as a so-called PartP. This argumentation is concerned with the way the hearer comprehends (or, more precisely, parses) the incoming string.
There are some well-known parsing principles such as Right Association (Kimball 1973), Late Closure (Frazier 1979, 1985) or Hawkins's EIC, and what these principles have in common is that they predict a preference for low attachment of incoming phrases to the already existing phrase marker. According to these principles, attaching the PP to the particle (yielding what I have called PartP) would be the natural way to parse sentences with construction₁ followed by a directional PP.
15 As early as 1919, Van Dongen observed that long particles such as together also prefer end-position, which ties in nicely with our observation concerning 'particle phrases'.
16 There is some prima facie evidence for my assumption that hearers might be surprised to encounter the same particle twice: on the basis of all oral data in the BNC (about 10 million words), we can determine how likely it is that (i) any two adverbial particles (BNC-tag: PRP) follow one another (e.g. down in), and (ii) the same adverbial particle occurs twice by calculating a measure of collocational strength, namely Mutual Information (MI; cf. Church and Hanks 1990, Oakes 1998); this statistic takes on positive values when two items (e.g. words or tags) co-occur more often than expected and negative values when items occur in complementary distribution. Consider the table below for the results. As is obvious, in the vast majority of cases where one adverbial particle follows another one, MI is negative irrespective of whether the particles are identical or not. This is not conclusive evidence for my claim, but it still supports the above assumption that hearers will rarely encounter two adjacent particles, let alone the same particle twice; the rare cases where this happens are mostly
Collocational strengths of two subsequent adverbial particles in the spoken part of the BNC
                 PRP₁ = PRP₂    PRP₁ ≠ PRP₂    Row totals
MI > 0                 2             10             12
MI < 0                41            331            372
Column totals         43            341            384
disfluencies and (floor-holding) repetitions. In a cognitive framework, this structural pattern will thus be barely entrenched and may indeed result in surprise.
17 Figure 4:2 is not to be seen as a causal model where morphosyntactic, semantic and information-structural variables are independent from (since not graphically related to) one another (cf., e.g., the obvious inverse correlation between length and givenness discussed in section 4.3); it is just an informal sketch of which factors are related to processing effort. For a more advanced discussion of these matters, cf. Chapter 8.
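The Mutual Information statistic used in the note on adjacent particles can be illustrated with a minimal sketch. It assumes the common pointwise formulation, MI = log2(observed / expected), with the expected co-occurrence frequency derived from the two items' individual frequencies and the corpus size. The function name and the toy frequencies are hypothetical and are not BNC figures.

```python
import math

def mutual_information(pair_freq, freq_x, freq_y, corpus_size):
    """Pointwise MI: log2 of observed over expected co-occurrence frequency.

    Assumed formulation: expected = freq_x * freq_y / corpus_size.
    """
    if pair_freq == 0:
        return float("-inf")  # the two items never co-occur
    expected = freq_x * freq_y / corpus_size
    return math.log2(pair_freq / expected)

# Hypothetical toy counts (not taken from the BNC)
mi_attract = mutual_information(pair_freq=40, freq_x=1000, freq_y=2000,
                                corpus_size=1_000_000)
mi_repel = mutual_information(pair_freq=1, freq_x=1000, freq_y=2000,
                              corpus_size=1_000_000)
print(mi_attract > 0, mi_repel < 0)
```

With these toy counts the expected frequency is 2, so a pair observed 40 times receives a positive value and a pair observed once receives a negative one, which is the sign behaviour the note relies on.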
5
The data
This chapter will be concerned with the data investigated in the present analysis. Section 5.1 explains how the data were gathered in order to obtain a representative sample. Then, by means of detailed exemplification, section 5.2 discusses how the values/levels of the variables to be investigated were assigned to each of the example sentences.
5.1 Origin of the corpus data
As already repeatedly stated, the present work advocates an empirical approach where psychologically real factors (or, more precisely, their linguistic correlates) are used for the analysis of naturally occurring data. Consequently, the perspective taken here is rigorously corpus-based in order to find out what is actually done by native speakers. This study relies on 403 instances of the VPC taken from the British National Corpus and I will start by explaining how these 403 sentences were selected.
While I have argued against introspective analyses of introspective data several times, the use of corpora in linguistics is by no means a guarantee for attaining objective, reliable and valid analyses: corpora can be biased since, at least for so-called general corpora (as opposed to specialized corpora; cf. Kennedy 1998: 20), it is difficult to reach an agreement on how the key goal of representativity can be safely achieved. What is more, even if a reasonable degree of representativity of a corpus has been reached, it is still possible that the sample drawn out of such a corpus is skewed by not containing the examples typical of what is being examined. Several precautions were taken in this study to avoid such deficiencies.
In order to find the most frequently used instances of TPVs, I checked the Cambridge International Dictionary of Phrasal Verbs (1997) for all verbs that were listed as principally allowing both construction₀ and construction₁.
To those I added TPVs discussed in the literature and entries from other dictionaries and works such as the BBI Dictionary of English Word Combinations (1997) and the Oxford Dictionary of Phrasal Verbs (1993) (cf. the end of Chapter 11 for a list of verb sources). The result was a list of 1,357 different English TPVs; this list (the largest of TPVs so far) is given in Appendix 10.3. To make sure that one can generalize from the sample to less typical/frequent instances of
TPVs, I did not base my analysis on an only marginally typical sample. Thus, I mainly concentrated on the ten most frequent particles (namely up, out, off, down, in, away, back, over, on and around, in descending order of frequency) and the ten most frequent verbs (namely put, bring, take, turn, throw, pull, call, get, keep, and kick, again, in descending order of frequency) from this list.¹ I used a concordancing program² to find all co-occurrences of one form of the ten verbs and one of the particles where the particle was located up to sixteen words to the right of the verb form.³ This was done once on corpus files containing only written text and once on corpus files containing only spoken language, yielding two separate concordance files. Prior to the closer inspection of both output files, the concordance lines were sorted according to the file names where the data originated from; this is the most economic way of randomizing the data so that no text type is preferred since, according to the manual of the British National Corpus, 'the three-character identifiers [i.e. the file names] used (and hence the directory structure) are entirely arbitrary and do not convey any information about the type of text contained' (Burnard 1995: 139). The listed concordance was then manually scanned for instances of VPCs since literally hundreds of examples had to be weeded out, namely cases of passives of TPVs (cf. (53)), instances of phrasal-prepositional verbs (cf. Quirk et al. 1985: 1160-1; cf. (54)) and cases where the verb and the particle did not occur in the same sentence (cf. (55)).
(53) . . . when the Nissan Serena was brought out.
(54) We need to put up with these problems.
(55) So that's why I didn't take it. Well it wasn't working in the field that I want to work in.⁴
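The search step described above (a verb form followed by a particle within sixteen words to its right) can be sketched roughly as follows. This is not the concordancing software used for the study. The tokenizer is a naive assumption, the verb-form list is deliberately abbreviated, and all names are hypothetical.

```python
import re

# Abbreviated, hypothetical form list; a full search would cover every
# inflected form of all ten verbs.
VERB_FORMS = {"put", "puts", "putting",
              "bring", "brings", "bringing", "brought",
              "take", "takes", "taking", "took", "taken"}
PARTICLES = {"up", "out", "off", "down", "in",
             "away", "back", "over", "on", "around"}
WINDOW = 16  # maximum distance in words from verb form to particle

def find_candidates(text, window=WINDOW):
    """Return (verb_index, particle_index, verb, particle) per co-occurrence."""
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = []
    for i, tok in enumerate(tokens):
        if tok in VERB_FORMS:
            # look at up to `window` tokens to the right of the verb form
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                if tokens[j] in PARTICLES:
                    hits.append((i, j, tok, tokens[j]))
    return hits

hits_active = find_candidates("The lawyers took Valenzuela off to record his testimony.")
hits_passive = find_candidates("when the Nissan Serena was brought out")
print(hits_active)
print(hits_passive)
```

The second call returns a hit for a passive, which is exactly the kind of candidate that still has to be weeded out by hand, as the manual scanning step describes.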
In this way, 200 VPCs in spoken language and 203 VPCs in written language were extracted for further investigation and stored in separate files as a sample of 403 cases. This sample is controlled for with respect to the (for all practical purposes equivalent) numbers of cases per register and mainly consists of the most frequently used TPVs in order to exclude too many marginal and barely representative instances of VPCs.⁵ Table 5:1 shows the distribution of the 403 clauses making up the sample depending on the register.
Table 5:1 Distribution of the 403 sample sentences
                 Spoken    Written    Row totals
Construction₀        67        127           194
Construction₁       133         76           209
Column totals       200        203           403
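As a quick consistency check on Table 5:1, the marginal totals and the share of construction₁ per register can be recomputed from the four cell counts. The sketch below only restates the table; the dictionary layout and the function name are illustrative assumptions.

```python
# Cell counts from Table 5:1; keys are (construction, register)
counts = {
    ("construction0", "spoken"): 67,
    ("construction0", "written"): 127,
    ("construction1", "spoken"): 133,
    ("construction1", "written"): 76,
}

def marginal(table, position, label):
    """Sum all cells whose key has `label` at index `position`."""
    return sum(n for key, n in table.items() if key[position] == label)

spoken_total = marginal(counts, 1, "spoken")
written_total = marginal(counts, 1, "written")
grand_total = sum(counts.values())

# Share of construction1 within each register
share_c1_spoken = counts[("construction1", "spoken")] / spoken_total
share_c1_written = counts[("construction1", "written")] / written_total
print(spoken_total, written_total, grand_total)
print(f"{share_c1_spoken:.3f} {share_c1_written:.3f}")
```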
Since many of the variables are not just concerned with the particular sentence in which particle placement occurs, the next step was to establish the requisite context for each sample sentence. For each of the cases, the preceding and subsequent ten clauses were added so that, taken together, the analysis is based on several thousand individual clauses.⁶ In establishing the context, expressions were counted as clauses only if:
• they contained a noun phrase or a clause as grammatical subject as well as a finite verb; or
• they were participial or gerundival clauses (such as, e.g., the non-italicized part in The new rules forbid more than one to put up a sign, a rule usually ignored); or
• a new conversational turn started.
However, in order not to exclude too much data from consideration, the following cases were not counted as full clauses on their own even if they met any of the above-mentioned criteria: question tags; discourse markers such as you know, as it were, I mean; cleft sentences and false starts. Then, for each of the cases, numbers were assigned representing the values/levels of the variables for the respective case, which will be explained in some detail in the following section.
5.2 Treatment of the corpus data
The basis for the statistical analyses to be discussed in Chapter 6 was a huge table containing each sentence and the figures that describe its characteristics with respect to the independent variables mentioned above. In this section, I will illustrate (mainly in tabular form) how the utterance characteristics were encoded, which also involves (i) comments on the measurement scales of the variables involved and (ii) examples from the corpus data in order to clarify the coding process. First, each sentence was assigned a value representing the register of the sample sentence, i.e., the nominal variable register (REGISTER) has two levels and, thus, can take two values, namely one for spoken data (0) and one for written data (1).
Second, each sentence was assigned a value representing the levels of the choice of construction, i.e., the dependent nominal variable in this study, construction, can also take on two values, namely 0 for construction₀ and 1 for construction₁ (cf. Chapter 1, n. 5). The variable NP type of the direct object (TYPE) is, in this study, a nominal variable. It comprises the levels pronominal, semi-pronominal, lexical and proper name. Table 5:2 provides some guidelines on how the sentences from the corpus data were encoded. Determiner of the direct object (DET) is a nominal variable, too. The set of possible levels was no determiner, indefinite determiner and definite determiner; cf. Table 5:3. The variable complexity of the direct object (COMPLEX) is measured on an ordinal scale; three different values of increasing complexity are
Table 5:2 Encoding of NP Type of the Direct Object (TYPE)
Level            Possible sub-categories      Examples from the corpus data
pronominal       personal pronoun             Put it back later
                 possessive pronoun           (no examples such as, e.g., mine found)
                 demonstrative pronoun        Make sure you take these down please
                 reflexive pronoun            It won't be able to warm itself up
semi-pronominal  indefinite pronouns          I'd have to put something else in
                 referentially vague nouns    we put this matter down till later in the day
lexical                                       it brought back memories
                                              I put my head down low enough
                                              we can actually take out the files
proper name                                   The lawyers took Valenzuela off to record his testimony
                                              . . . that bring France down
Table 5:3 Encoding of Determiner of the Direct Object (DET)
Level                  Possible sub-categories    Examples from the corpus data
no determiner          plural forms               it brought back memories
                       personal pronouns          In that case put it back to later in the day
indefinite determiner  typical indef. determiner  let's put a word in for you
                       general determiner         the lawyers have patched up some sort of agreement
                                                  if you ever put any lettering on
definite determiner    typical def. determiner    I could have put the headings on
                       possessive pronouns        Could you just take your coats off?
                       demonstrative pronouns     in the future they will carry on that practice
distinguished, namely simple direct object NPs, intermediate direct object NPs and complex direct object NPs (see Table 5:4). The final morphosyntactic variable is length of the direct object (measured in words, LENGTHW, and in syllables, LENGTHS).⁷ For this interval variable, the number of words or syllables of each direct object was simply counted and entered into the table. For instance, the direct object in (56a) is three words/four syllables long, in (56b), the direct object consists of five words/ten syllables and in (56c) it is one word/three syllables long.
(56) a. They will take off [NP their own clothing].
b. You know a lot of have sort of brought up [NP the alternative sort of er medicine].
c. But then it was the right decision for the longer term, to bring down [NP inflation].
Table 5:4 Encoding of Complexity of the Direct Object (COMPLEX)
Level (value)     Possible sub-categories                      Examples from the corpus data
simple (0)        pronominal NPs                               if you're going to put them out
                  NPs consisting of (Det +) N                  Put your hand up
intermediate (1)  NPs with adjectival modifiers                they brought in private buses to their local areas
                  NPs with a genitive                          we have to put descriptions of levels down
                  co-ordinated NPs                             that hopefully would take away any dirt and extra bits
complex (2)       NPs with embedded finite or non-finite       they could bring back trams which are much less in terms of pollution
                  (e.g. participial or gerundival) clauses     He also took over some national functions handled now by the Metropolitan Police
                  (cf., e.g., Ross 1986: 33)
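The word counts given for (56) can be reproduced with a small sketch. It assumes whitespace tokenization and skips the hesitation markers er and um, in line with the study's treatment of hesitations for this variable. Syllable counts (LENGTHS) are left out because they would need a pronunciation resource. The function name is hypothetical.

```python
HESITATIONS = {"er", "um"}  # hesitation markers named in the text

def length_in_words(direct_object):
    """Count the words of a direct object NP, skipping hesitation markers."""
    words = direct_object.lower().split()
    return sum(1 for w in words if w not in HESITATIONS)

len_a = length_in_words("their own clothing")
len_b = length_in_words("the alternative sort of er medicine")
len_c = length_in_words("inflation")
print(len_a, len_b, len_c)  # 3 5 1
```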
Note that hesitations such as er in (56b) or um were not counted as contributing to the length of the direct object, although they were included as production difficulty (cf. below). The first semantic variable is animacy of the referent of the direct object (ANIMACY). It is nominal and two levels are distinguished: animate (i.e. 1, for human beings as in (57a) and animals as in (57b) and (57c)) and inanimate (i.e. 0, as in (57d)).⁸
(57) a. I am going to bring down a lot of other people with me.
b. I put up a startled grouse that exploded into flight.
c. We'll put him [a bird] back into the marsh around.
d. I was coughing up blood from the stomach.
A second nominal semantic variable is concreteness of the referent of the direct object (CONCRETE). Again two levels were possible, namely abstract (i.e. 0, cf. (58)) and concrete (i.e. 1, cf. (59)), depending on whether the referent is visible and physically manipulable or not.
(58) a. In the future they will carry on that practice.
b. I just wanted to bring up the point about secrecy.
(59) a. Can you get your fingers out?
b. He tore it [an illegal barbed wire fence] down.
Closely related to the abstractness/concreteness of the direct object's referent is idiomatic meaning of the verb phrase (IDIOMATICITY). In this study, I followed the approaches of Bolinger (1971), Cowie and Mackin (1993: ix) and Fairclough (1965) and took this variable to be measured on an ordinal level, differentiating between literal, metaphorical and idiomatic sentences. A sentence was counted as literal (i.e. it was assigned the value 0) if the meaning of the whole expression was totally predictable from the meaning of the parts (which was generally equivalent to the referent of the direct object undergoing a change of location in a manner specified by the verb); cf. (60).
(60) a. You might at least put away your socks.
b. You can stick the pin in.
A sentence was classified as being metaphorical (getting the value 1) if its meaning was not fully predictable from the meaning of its parts because of, say, violations of selectional restrictions that could be accounted for with reference to simple metaphorical or metonymic mappings⁹ as introduced by Lakoff and Johnson (1980) or, more importantly, preference violations (cf. Fass 1991: 59-68 on the met* method); cf. (61).
(61) a. I put down comments.
b. When writers go abroad they bring back ideas.
c. But then it was the right decision for the longer term, to bring down inflation.¹⁰
Lastly, a sentence was categorized as idiomatic (getting the value 2) if the meaning of the sentence was not predictable on the basis of the parts alone and maximally two simple mappings (cf. (62)). In cases where this classification procedure seemed only slightly problematic, the degree of idiomaticity was checked using the Longman Dictionary of Phrasal Verbs (1983), where idiomatic expressions are accordingly marked.
(62) a. Cerda interviewed those [...], and then threw down the gauntlet to Pinochet.
b. Divers should take out decompression insurance.
The last semantic variable to be dealt with is semantic modification of the particle. However, while its encoding would have been quite easy (either there is an aspectual or perfective marker or not), not a single instance was found in the corpus data, so this phenomenon seems to be very rare, which might be the explanation for the fact that this variable was only considered by a small number of authors.¹¹ Let us now turn to the variety of discourse-functional variables. The clearest and yet most economic way of illustrating their encoding is by means of an example where the whole context of an utterance with a VPC is investigated simultaneously. Table 5:5 is an instance from the written part of the British National Corpus. The first column displays the position of
Table 5:5 Example sentence with context from the British National Corpus (File: A91)
-10  IT WAS a full year
 -9  before he made the break
 -8  but eventually, on August 21, 1984, he went over to one of the more independent local magazines, Cauce, and asked for its leading journalist Monica Gonzalez.
 -7  The resulting interview was heavy going for both of them.
 -6  Monica had been a close friend of Ricardo and Jose Weibel.
 -5  Now she was listening to their kidnapper.
 -4  What she remembers most
 -3  was his self-disgust and terror.
 -2  Monica rang lawyers at the Vicariate of Solidarity,
 -1  which had been set up by the Catholic Church in 1975 and was by now Chile's leading human rights organization.
  0  The lawyers took Valenzuela off to record his testimony.
  1  A few days later, Valenzuela went underground.
  2  After several months in hiding he crossed into Argentina and left for France.
  3  Monica's interview was published in December 1984
  4  and its impact was swift and savage.
  5  First to suffer were the relatives of those
  6  who had been kidnapped by the Joint Command.
  7  They now knew the terrible truth,
  8  even though without a body they still could not mourn.
  9  They gave Valenzuela little credit for speaking out.
 10  The impact on the Vicariate of Solidarity was even more violent.
clause in the right column relative to the VPC (which itself is accordingly at position 0). The direct object of the VPC under consideration is Valenzuela. Its coreferential expressions are printed in bold type for expository reasons. The variable news value of the referent of the direct object is not explicitly encoded for the simple reason that it is operationalized by means of several other inherently much more objective variables to be discussed shortly. The variables last mention of the referent of the direct object (LM) and next mention of the referent of the direct object (NM) are nominal: either the referent of the direct object has been mentioned before or after the VPC (value: 1) or not (value 0). Looking at the table, first, it becomes obvious that Valenzuela is referred to several times both before and after the VPC, so for this instance each of these variables is assigned the value 1. Second, the last mention of Valenzuela is two clauses before the VPC (namely, in clause -3) so the variable distance to last mention of the referent of the direct object (DTLM) gets the value 2; analogously, Valenzuela is mentioned in the same sentence again, which is why distance to next mention of the referent of the direct object (DTNM) gets the value 0 (representing no clausal
distance between the different occurrences).¹² Thirdly, in the preceding context, Valenzuela is referred to four times (he, he, their kidnapper, his; we will discuss them later) so that times of preceding mention of the referent of the direct object (TOPM) takes on the value 4; in the subsequent discourse, Valenzuela is referred to four times (his, Valenzuela, he, Valenzuela; we will discuss Joint Command further below) so that times of subsequent mention of the referent of the direct object (TOSM) takes on the value 4. From this, it follows that the value for overall mention (OM, as a potential measure of the overall importance of the referent; cf. section 2.6.1) is 9 (4 from the preceding discourse, 1 in the VPC itself and 4 from the subsequent discourse). The final two discourse-functional variables that remain measure the cohesiveness of the direct object's referent to the preceding and the subsequent discourse respectively (COHPC and COHSC). It was argued previously that simply counting co-referential expressions does not suffice as a means to determine the contextually dependent degree of activation of a referent; however, it was also argued that the notion of cohesiveness as defined by Bolkestein and Risselada (1987) is too powerful to be empirically useful (cf. Chapter 2, n. 20). Therefore, in this study I have restricted my attention to three different ways in which the cohesiveness of an entity to the context (i.e. its degree of activation due to contextual clues) can be increased. First, two different ways of increasing the cohesiveness of the referent X of a linguistic expression without explicitly naming a co-referential expression are considered here:
1 If a linguistic expression in the context (be it preceding or subsequent) names a superordinate term or a subordinate term of X, then the degree of cohesiveness of X to the respective context is increased by one. For example, in Have a look at those tulips. Anyway, I like flowers in general, flowers is cohesive to tulips as it continues the activation of the referent of those tulips in the preceding sentence.
2 If an expression in the (preceding or subsequent) context names a part of X or the whole to which X belongs, the degree of cohesiveness of X to the respective context is increased by one (e.g. in sentence 6 in Table 5:5, the referent of the NP the Joint Command, of which Valenzuela was a part, continues the activation of the concept Valenzuela).
Second, of course, co-referential items can also increase the degree of activation of the referent of a linguistic expression X; since using strictly co-referential expressions is the most effective way of continuing the activation of a referent, strictly co-referential items increase the degree of cohesiveness by two. If we accordingly sum up the values for the cohesiveness to the preceding discourse and the cohesiveness to the following discourse, we arrive at the values 9 (4 × 2 for 4 co-referential expressions and 1 for them) and 9 (4 × 2 for 4 co-referential expressions and 1 × 1 for the Joint Command) respectively. Finally, three variables remain. The nominal variable presence of a
directional adverbial (PP) comprises two levels and can take on either of two values, depending on whether there is a directional adverbial following the direct object or the particle (i.e. the value 1, cf. (63)) or not (i.e. the value 0, cf. (64)).¹³
(63) a. I would urge the panel to send out [NP their proposed leaflet] [PP to the ministers in various areas].
b. Why don't they put all [NP the leaders of all the countries] up [PP in the air]?
c. It has taken many years to bring [NP the town] up [PP to the standard].
(64) a. Fitt brought down the Labour Government.
b. The person cannot draw air in.
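The discourse-functional scores worked out for the Valenzuela example in Table 5:5 follow simple arithmetic, which the sketch below restates. The weights (2 per strictly co-referential item, 1 per superordinate/subordinate or part-whole link) and the counts are taken from the discussion of that example; the function names are hypothetical.

```python
def overall_mention(topm, tosm):
    """OM: preceding mentions + the mention in the VPC itself + subsequent ones."""
    return topm + 1 + tosm

def cohesiveness(n_coreferential, n_related):
    """Co-referential items count 2 each; related items (superordinate,
    subordinate or part-whole) count 1 each."""
    return 2 * n_coreferential + 1 * n_related

# Counts for the Valenzuela example (clause 0 in Table 5:5)
om = overall_mention(topm=4, tosm=4)
coh_pc = cohesiveness(n_coreferential=4, n_related=1)  # related item: them
coh_sc = cohesiveness(n_coreferential=4, n_related=1)  # related item: the Joint Command
print(om, coh_pc, coh_sc)  # 9 9 9
```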
The next variable is nominal, too, and it is concerned with whether the particle is identical to the preposition of the following directional adverbial (PART = PREP). It is hardly in need of detailed exemplification: (65) is an instance of a sentence (of only two in the whole of the corpus data) which was classified accordingly.
(65) It means you can pack in [NP a lot more things] [PP in your day].
Finally, let us turn to the interval variable production and planning effects (DISFLUENCY) of Arnold and Wasow (1996). For each clause with a VPC, I precisely followed their approach and counted the number of hesitations such as um and er, false starts and repeats and repairs. Consider (66).
(66) a. if you take out er a large cross section
b. it can, can er take the stuff [coal] in from the er from the ports
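A rough sketch of such a count, applied to the two clauses in (66), is given below. It is a simplification under stated assumptions: only the hesitation markers er and um and immediate repetitions of the same word are detected, whereas false starts and repairs would still need manual coding. The editorial gloss [coal] is left out of the input, and the function name is hypothetical.

```python
import re

HESITATIONS = {"er", "um"}

def count_disfluencies(clause):
    """Count hesitation markers plus immediate word repetitions."""
    tokens = re.findall(r"[a-z']+", clause.lower())
    hesitations = sum(1 for t in tokens if t in HESITATIONS)
    # an immediate repetition is the same word twice in a row, e.g. "can, can"
    repeats = sum(1 for prev, cur in zip(tokens, tokens[1:])
                  if prev == cur and cur not in HESITATIONS)
    return hesitations + repeats

count_a = count_disfluencies("if you take out er a large cross section")
count_b = count_disfluencies(
    "it can, can er take the stuff in from the er from the ports")
print(count_a, count_b)  # 1 3
```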
In (66a), there is just one disfluency (namely er) so this sentence was assigned the value 1; in (66b), there is one repetition (can, can) and two signs of hesitation (er), so this clause is assigned the value 3. It needs to be noticed, however, that disfluency was of course only found in the oral data and that the number of clauses with any disfluencies at all was quite low, namely 15, so statistically significant generalizations should not be expected. After this fairly detailed discussion and exemplification of the coding of the variables for the analyses,¹⁴ Chapter 6 will briefly comment on the statistical techniques to be used and illustrate the multitude of findings from the statistical investigation of the corpus data.
Notes
1 In the literature, sometimes slightly differing lists of particle frequencies are given; cf., for instance, Kennedy (1920); Fraser (1965); Nelson et al. (1982); O'Dowd (1994). In view of the large overlap of these lists it would be pointless to discuss these different figures at length. I will, therefore, only provide a summary of these proposals:
• the verbs most often found in these combinations are back, blow, break, bring, call, come, fall, get, give, go, hold, keep, lay, let, look, make, put, run, send, set, stand, take, turn, work;
• the most prolific particles are about, across, around (round), aside, away, at, back, by, down, for, forth, in, off, on, out, over, through, to, up, with.
The (to my knowledge at least) most up-to-date account providing frequency data is O'Dowd (1998). However, in her list (1998: 32) and some comparable ones from above, the frequency information is based on the frequencies of the particles in every grammatical form so that, for instance, prepositions of intransitive prepositional verbs were also counted. Since these constructions and their contribution to particle frequencies are not relevant to my study focusing only on TPVs, I decided to use only the particle frequencies of actual VPCs as defined above (cf. section 1.1). For our practical purposes the differences are only marginal: apart from two particles my list is identical to O'Dowd's even if the percentages differ markedly.
2 The software used was WordSmith Tools 3.0.
3 There is no special motivation underlying the choice of sixteen as the right-hand margin. However, the studies by Chen (1986), Gries (1999) and Hawkins (1991, 1994) have unanimously shown that the probability of construction₁ with a direct object longer than just four words or six to nine syllables approaches zero, so sixteen words is sufficient to assume that the searching procedure does not unintentionally eliminate surprisingly long direct objects from consideration.
4 The sample also contains some instances of VPCs that do not consist of the ten most frequent verbs and particles listed above. In order to arrive at a not too homogeneous sample, I also included all other VPCs I came across by accident. For example, the concordancer provided one instance where the verb get and the particle on were found together without constituting a TPV but the following sentence contained the TPV to loosen off. This was also included in the sample.
5 The question may arise why 'only' 403 examples were used for the analysis. First, although this may seem quite limited at a first superficial glance, it has to be observed that, with 403 cases, this is by far the largest quantitative analysis of particle placement ever undertaken (cf. Hawkins' (1994) analysis of a mere 179 cases or Chen's (1986) analysis of only 239 cases). In other words, the given study investigates nearly as many cases as the two most recent largest analyses together. Additionally, while it is frequently possible to calculate necessary sample sizes on the basis of an expected effect size (cf., e.g., Cohen 1983), this is not possible for the present design, which is to a large degree exploratory in nature. Thus, while one could of course always say 'more data would be better', it is quite impossible to a priori recommend a minimum number of items on a principled basis. Be that as it may, the results to follow show that the predicted effects are all quite strong and highly significant so that the number of cases is not too small at all. What is more, section 6.3.2 discusses some additional results that will corroborate my claim that the sample size is not too small beyond any doubt.
6 Again, there is no principled basis on which the choice of ten clauses as discourse context is based. Other studies have considered different numbers of clauses but have also admitted that the decision on a specific number is an arbitrary one (cf., e.g., Givón 1992: 49 n. 17). Moreover, even if some other study has included more than ten clauses from the context, the definition of clauses used differs at times. Note also that in some cases ten preceding or
subsequent clauses could not be included, e.g. when the clause with the TPV was the first or last of a text.
7 One might wonder whether this distinction is in fact warranted. First, I mentioned in section 2.2 that different authors used different measures of length, which is why the present analysis can test which measure is better suited to the purpose at hand. Second, I know of at least one case (Kintsch 1972, reported in Bock 1982) where the correlation between a dependent variable and LENGTHW was not significant whereas the correlation between the same dependent variable and LENGTHS was.
8 There were no instances of the intermediate class of plants as direct objects in the data.
9 A simple metaphorical mapping is loosely defined as a mapping making direct reference to a cognitively simple and real experience. For instance, in this sense SELF-INITIATED CHANGE OF STATE (ACTION) IS SELF-PROPELLED MOTION is quite a complex metaphor whereas CHANGE IS MOTION is the simpler and more fundamental one underlying the first (examples are taken from the Master Metaphor List at the Conceptual Metaphor Homepage).
10 In this instance, two simple metaphors are combined for the understanding of this sentence: INFLATION IS AN ENTITY (cf. Lakoff and Johnson 1980: 26) and LESS IS DOWN (cf. Lakoff and Johnson 1980: 15-16). For a study focusing on the role metaphor and metonymy play for the semantics of VPCs, and for details of how different metaphors can jointly license a non-literal (i.e. non-spatial) meaning of the construction, cf. Morgan (1997).
11 The variable semantic focus of the verb phrase is not encoded. On the one hand, we have seen previously that this variable is more pragmatic than semantic in nature. On the other hand, it is difficult to see how corpus data would enable one to falsify the claim that this variable is important: there is no possibility to infer from the corpus data which referent the speaker intended to focus on. Possibly this variable can most fruitfully be investigated experimentally or with a corpus very precisely annotated with different stress levels.
12 The variables distance to last mention of the referent of the direct object and distance to next mention of the direct object were recoded for the following reason: in many instances, there is no mention of the referent of the direct object in the prior or subsequent context so that a variable measuring distance could not be assigned any value at all. This is not a problem as such, but it can be a problem in many statistical analyses since, by default, these cases enter into the analyses as so-called missing data (MD). If several variables have missing data in different cases, then one might investigate hundreds of cases, but could be forced to consider only a few, namely those where not a single variable lacks data.
Therefore, DTLM (with its possible values from 0 to 9 and MD) was recoded with values from 10 (for a distance of 0) to 0 (for no prior mention at all), as a variable called activation from the preceding context (ActPC); for the above example, where DTLM = 2, that means that the ActPC value is 8. Likewise, DTNM (with its possible values from 0 to 9 and MD) was recoded the same way as a variable measuring the extent to which the direct object forms a cluster with the next mention (ClusSC). For this example, where DTNM = 0, ClusSC is 10. Thus, the statistical techniques are applied using these two new variables, which has no bearing on the conceptual basis of the analyses and the interpretation of the results.
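The recoding described in this note can be sketched in a few lines of Python. This is only an illustration: the function name recode_distance and the use of None to stand for missing data (MD) are assumptions made for the sketch, not part of the original coding scheme.

```python
from typing import Optional


def recode_distance(distance: Optional[int]) -> int:
    """Recode a distance (0-9, or None for missing data) onto a 0-10 scale.

    A distance of 0 becomes 10, a distance of 9 becomes 1, and a missing
    value (no mention within the context window) becomes 0, so that no
    case has to be dropped as missing data.
    """
    if distance is None:  # hypothetical stand-in for MD
        return 0
    if not 0 <= distance <= 9:
        raise ValueError("distance must lie between 0 and 9")
    return 10 - distance


act_pc = recode_distance(2)     # the example with DTLM = 2
clus_sc = recode_distance(0)    # the example with DTNM = 0
no_mention = recode_distance(None)
print(act_pc, clus_sc, no_mention)
```

The same function serves both variables because the note states that DTNM was recoded in the same way as DTLM.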
13 The variables habitual meaning of the verb phrase and phonetic shape of the verb have not been included in this discussion since they have, for the reasons given in section 2.6.1, not been taken to contribute to particle placement. The variable stress pattern of the verb phrase has also not been included here since no phonologically annotated corpus data were available. Nevertheless, on the basis of the results to be discussed further below, I will return to this variable again.
6
Results and discussion
The present study utilizes several statistical procedures ranging from very simple to quite advanced to achieve both its linguistic and methodological goals. In order to render the results of this study more accessible to statistical laymen, I will in each section briefly comment on the procedures that were used, although it is not desirable to cover all the techniques in detail. I will, thus, explain the most essential characteristics of the techniques, all the results and their contributions to linguistic as well as further-ranging methodological issues.
6.1 Monofactorial results
6.1.1 Introduction
This section will illustrate in detail the results of monofactorial techniques, i.e. the extent to which each variable contributes to particle placement in isolation. One might ask why such monofactorial results are still dealt with, although it was argued previously that they suffer from several drawbacks. The reason for the detailed illustration of monofactorial results to follow is threefold. First, I want to test empirically whether the previous (mostly nonempirical) analyses of particle placement are supported. Second, I want to provide a descriptively more adequate characterization of particle placement, which overcomes the numerous deficiencies of a very general nature mentioned previously. Finally, the monofactorial analysis yields indices representing the absolute strength of the relation of each variable to particle placement in isolation.
The most basic statistical method that will be used in this section is the calculation of frequencies to be represented in contingency tables for Chi-square analyses. For each table, the degree of significance (or the absence of it) of the observed distribution will be determined to identify typical characteristics of each construction.1 In order to summarize these results for each variable in a single statistic, one can calculate correlation coefficients indicating:
• the direction of the relationship between an independent variable and particle placement: a positive value represents a positive relationship,
i.e. high values of the variable correlate with a high value for the construction (namely construction1) and vice versa; a negative value represents a negative relationship, i.e. low values of the variable correlate with a high value for the construction (namely construction1) and vice versa;2
• the strength of the relationship between an independent variable and particle placement: the higher the absolute value of the correlation coefficient, the stronger the relationship (cf. n. 5 for nominal variables).3
Apart from only observing the correlations between variables as such, we will in some cases also compute partial correlations in order to identify spurious correlations between variables. The statistical investigation of such a huge mass of data yields an enormous amount of individual results, and it would be quite tiring to review all these results at the same level of specificity. Therefore, the discussion of the results is organized as follows. Each kind of statistical result will at first be discussed quite extensively on the basis of one or two examples in order to introduce the less statistically informed majority of readers to the principles of the type of analysis and its subsequent interpretation. At a later stage, the discussion of the results will be much less lengthy. Sections 6.1.2 to 6.1.4 will deal with the results concerning the (morphosyntactic, semantic and discourse-functional) variables of the Processing Hypothesis. Then, section 6.1.5 will discuss the results concerning the remaining variables and those that I claim to be irrelevant; finally, section 6.1.6 will summarize the monofactorial results.
6.1.2 Morphosyntactic variables of the Processing Hypothesis
The first variable to be dealt with here is complexity of the direct object. In the data, the distribution of the two constructions represented in Table 6:1 was obtained.
Table 6:1 Observed distribution of constructions relative to COMPLEX

                 Simple <                                    > Complex
                 Bare NPs (0)   Intermediate NP (1)   Complex NP (2)   Row totals
Construction0         76               102                  16             194
Construction1        186                22                   1             209
Column totals        262               124                  17             403

In order to find out whether this observed distribution is significant (so that COMPLEX can be argued to influence the choice of construction) or not (so that COMPLEX probably does not influence the choice of construction), one first needs to calculate the expected distribution of the two constructions, i.e. the distribution that would result from COMPLEX having no
influence on the choice of construction. In these cases (i.e. those where there are no previous distributional assumptions), the frequencies expected according to H0 are based on the so-called marginal frequencies, i.e. the row totals and the column totals, yielding the distribution in Table 6:2.

Table 6:2 Expected distribution of constructions relative to COMPLEX4

                 Simple <                                                    > Complex
                 Bare NP (0)             Intermediate NP (1)     Complex NP (2)         Row totals
Construction0    194 · 262 / 403 ≈ 126   194 · 124 / 403 ≈ 60    194 · 17 / 403 ≈ 8        194
Construction1    209 · 262 / 403 ≈ 136   209 · 124 / 403 ≈ 64    209 · 17 / 403 ≈ 9        209
Column totals    262                     124                     17                        403

From these two tables, the Chi-square value for the complete table can be calculated. In this particular case, Chi-square is highly significant, showing that the observed frequencies deviate from the expected frequencies in such a way that it is extremely unlikely to get such a distribution on the basis of pure chance (as predicted by H0): χ²(2) = 110.63; p<.001. In other words, we can conclude that the complexity of the direct object seems to have quite a strong influence on the choice of construction. However, as Givón (1992a: 308) has correctly cautioned us, we need to identify further which of the six cells in Table 6:1 contribute significantly to the overall highly significant Chi-square value. Thus, for each of the six individual cells, its contribution to Chi-square is calculated and shown in Table 6:3.

Table 6:3 Contributions to Chi-square for the distribution in Table 6:1

                 Simple <                                   > Complex
                 Bare NP (0)   Intermediate NP (1)   Complex NP (2)   Row totals
Construction0       19.92            29.99                7.47           57.37
Construction1       18.49            27.83                6.93           53.25
Column totals       38.41            57.82               14.4           110.63

The contributions to Chi-square should be further examined by a posteriori tests concerning which of the contributions to Chi-square is significant. However, it is important to note that multiple post hoc tests inflate the probability of error dramatically: while accepting H1 on the basis of a single significance test with an α-level of .05 yields an incorrect result with a probability of .05, the six post hoc tests possible for a 3 × 2 table result in a probability of 0.265 for accepting H1 erroneously at least once. Thus,
statisticians recommend the use of either Dayton's correction formula, the equally conservative Bonferroni correction, Holm's correction or the technique of configural frequency analysis (cf., e.g., Bortz et al. 1990: 51-2, 155-8). In the following tables, we will, therefore, only be concerned with contributions to Chi-square that were corrected according to Bonferroni for a posteriori significance. In this case, where we have six cells in the table, the critical value of a posteriori tests of contributions to Chi-square is 6.96. That is to say, nearly all the values contribute significantly to the overall highly significant result. But what do the results mean? The answer is straightforward, given the relation between the observed values and the expected values in Tables 6:1 and 6:2 respectively. In each of the cells significantly contributing to the overall Chi-square (in this case, all the cells), we simply compare the observed with the expected frequency:
• for bare NPs, we see that construction0 is much less frequent than one would have expected by pure chance, whereas construction1 is much more frequent than expected by pure chance; this can be summarized by a correlation coefficient for bare NPs and the choice of construction: λ = .49; p<.001;5
• for intermediate NPs, we find that construction0 is much more frequently found than expected, whereas construction1 is much less frequent than would be expected by pure chance: λ = .412; p<.001;
• finally, in the case of complex NPs, construction1 is found just once: λ = .077; p<.001.
In other words, on closer inspection we find that bare/simple object NPs yield a strong preference for construction1 on the part of the speaker; direct object NPs with even the slightest degree of complexity yield a similarly strong preference for construction0.
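The procedure described above (expected frequencies from the marginal totals, per-cell contributions to Chi-square, the inflated error probability of six post hoc tests and the Bonferroni-corrected critical value) can be retraced with a short Python sketch. The counts are those of Table 6:1; the function names are illustrative choices, and the critical value is obtained under the assumption that each post hoc test is a Chi-square test with one degree of freedom, which matches the reported 6.96 but is not spelled out in the text.

```python
from statistics import NormalDist

# Observed frequencies from Table 6:1 (rows: construction0, construction1;
# columns: bare, intermediate and complex direct objects).
observed = [[76, 102, 16],
            [186, 22, 1]]


def expected_frequencies(table):
    """Expected cell frequencies under H0, computed from the marginal totals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_contributions(table):
    """Per-cell contributions (O - E)^2 / E to the overall Chi-square."""
    expected = expected_frequencies(table)
    return [[(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
            for obs_row, exp_row in zip(table, expected)]


def familywise_error(alpha, tests):
    """Probability of at least one false rejection across independent tests."""
    return 1 - (1 - alpha) ** tests


def bonferroni_critical_value(alpha, cells):
    """Critical Chi-square (df = 1) at the Bonferroni-corrected alpha level.

    A Chi-square variable with one degree of freedom is a squared standard
    normal variable, so the two-tailed normal quantile can simply be squared.
    """
    z = NormalDist().inv_cdf(1 - (alpha / cells) / 2)
    return z ** 2


contributions = chi_square_contributions(observed)
chi_square = sum(sum(row) for row in contributions)

print(round(chi_square, 2))
print([[round(c, 2) for c in row] for row in contributions])
print(round(familywise_error(0.05, 6), 3))
print(round(bonferroni_critical_value(0.05, 6), 2))
```

Run on the counts of Table 6:1, the sketch should come out close to the overall Chi-square, the cell contributions, the error probability and the critical value reported in the text; small differences in the last digit can arise from rounding.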
The corresponding correlation coefficient for the variable COMPLEX as a whole, which can also be computed from the original data, is also highly significant: Somers' d (for COMPLEX as the independent variable) = -.524; p<.001. The negative value also indicates that the relationship is an inverse one, since the higher the degree of complexity, the lower the value for the construction that will be chosen and vice versa. Thus, the data support both preceding analyses (e.g. Fraser 1976 and Hawkins 1994) and the Processing Hypothesis proposed in the present study. A final comment must be made here with respect to the role of the register of the analysed sentences. In order to find out whether the register interacts6 with the influence that variables have, register-specific results will also be discussed. Consider Figure 6:1. Each dot in the diagram indicates the number of sentences of each construction with a direct object of the degree of complexity in the respective column; for instance, the © stands next to the dot indicating that there are 117 instances of construction1 with simple direct objects in the oral data. Figure 6:1 can be interpreted as
follows: in both registers, simple direct objects occur more frequently in construction1 and intermediate and complex objects occur more frequently in construction0. Both of these distributions indicated by the line plots are highly significant (observe the correlation coefficients for oral and written data given in the upper part of each diagram) so that, for COMPLEX, the register of the sentences does not have any particularly interesting influence: the overall observed significant effects of the complexity of the direct object hold for both the spoken and the written data, although COMPLEX is slightly more influential in oral data.

Figure 6:1 Interaction plot: construction × REGISTER × COMPLEX

Apart from the linguistic aspect of this variable, this rather lengthy discussion of the complexity of the direct object also served an introductory function. The following discussions of a variable's impact on particle placement will be considerably shortened in that, for nominal and ordinal variables, we will only represent the table with the observed frequencies and the resulting conclusions. Moreover, interaction plots will only be provided in Appendix 10.2. The next two variables to be discussed are length of the direct object in words (LENGTHW) and length of the direct object in syllables (LENGTHS). The average length of direct objects in construction0 is 4 words (SD: 3.1) and 7.2 syllables (SD: 5.6); the average length of direct objects in construction1 is 1.7 words (SD: 1) and 2.4 syllables (SD: 1.8). This apparent difference of average lengths is highly significant both for words (tWelch(228) = 9.78; p<.001) and syllables (tWelch(229) = 11.34; p<.001).7 The relation of each variable to particle
placement can again be summarized by a simple correlation coefficient: r(LENGTHW, construction) = -.45 (t(401) = 10.08; p<.001; r² = .202), and r(LENGTHS, construction) = -.5 (t(401) = 11.69; p<.001; r² = .254).8 These numerical data mean that the longer the direct object, the smaller the value for the constructional choice (i.e. the more frequently construction0 is used) and vice versa. Moreover, the present analysis shows that measuring the direct object's length in syllables results in a stronger relationship between length and the choice of construction, so that the explanatory power of LENGTHS is higher than that of LENGTHW, supporting the way of operationalization chosen by Chen (1986) rather than the one by Hawkins (1994). However, the data can still be investigated from another perspective, namely by crosstabulation. Table 6:4 shows that there is a clear cut-off point of LENGTHW for the choice of construction: for objects shorter than three words, construction1 is preferred, while for objects longer than two words construction0 is preferred; also, if the direct object is longer than seven words, construction1 is not used at all. For spoken language alone, the findings are nearly identical; for written language, the results are somewhat more extreme: construction1 is only preferred for direct objects that are one word long, and direct objects consisting of more than three words are, with one exception, only used in construction0 (cf. Figure 10:2 in Appendix 10.2). Similar observations can be made for Table 6:5, where the distribution of constructions depending on the direct object's length in syllables is portrayed. Objects shorter than four syllables prefer construction1, while objects longer than three syllables prefer construction0, and if the direct object is longer than ten syllables, construction1 is not used at all.

Table 6:4 Distribution of constructions relative to LENGTHW9

                 Short <                                        > Long
                   1     2     3     4     5     6     7     8+    Row totals
Construction0     26    51    35    25    15    10     8    24       194
Construction1    103    81    15     5     2     2     1     -       209
Column totals    129   132    50    30    17    12     9    24       403

Table 6:5 Distribution of constructions relative to LENGTHS

                 Short <                                                    > Long
                   1     2     3     4     5     6     7     8     9    10+   Row totals
Construction0      5    26    26    21    19    14    18     7     7    51       194
Construction1     82    51    35    18     8     8     1     2     4     -       209
Column totals     87    77    61    39    27    22    19     9    11    51       403
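The test statistics reported for the two length variables can be approximated from the summary figures alone. The following Python sketch is an illustration under two assumptions: the group sizes are taken to be the row totals of the tables (194 and 209 cases), and the means and standard deviations are used in the rounded form given in the text, so that the results can only approximate the published t values and degrees of freedom. The function names are illustrative.

```python
from math import sqrt


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t and its approximate degrees of freedom from summary data."""
    var_a = sd_a ** 2 / n_a
    var_b = sd_b ** 2 / n_b
    t = (mean_a - mean_b) / sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, df


def t_from_r_squared(r_squared, df):
    """Absolute t value for testing a correlation with the given r^2 against zero."""
    return sqrt(r_squared * df / (1 - r_squared))


# Length in words: construction0 (mean 4, SD 3.1) vs. construction1 (mean 1.7, SD 1)
t_words, df_words = welch_t(4.0, 3.1, 194, 1.7, 1.0, 209)
# Length in syllables: construction0 (7.2, SD 5.6) vs. construction1 (2.4, SD 1.8)
t_syll, df_syll = welch_t(7.2, 5.6, 194, 2.4, 1.8, 209)
print(round(t_words, 2), round(df_words), round(t_syll, 2), round(df_syll))

# t values implied by the reported squared correlations with 401 degrees of freedom
print(round(t_from_r_squared(0.202, 401), 2), round(t_from_r_squared(0.254, 401), 2))
```

The Welch statistics computed this way land near, but not exactly on, the reported values, which is what one would expect when working from rounded means and standard deviations; the conversion from r² to t should agree with the reported t(401) values up to rounding.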
For spoken language, the findings are virtually identical; for written language, the results are more extreme: a significant preference for construction1 is only found for monosyllabic direct objects, and direct objects longer than six syllables hardly occur in construction1 (cf. Figure 10:3 in Appendix 10.2). In sum, the direct objects typically found in construction1 are, due to their limited length, plausibly taken to require less processing cost than the longer objects prevailing in construction0. In other words, the distribution is as predicted in the Processing Hypothesis, and for extremely long or short objects fairly strong and register-independent predictions can be made as to which construction will be preferred. Let us now investigate the variable NP type of the direct object. Table 6:6 gives an overview of the distribution of the two constructions depending on this variable. Again, the overall distribution as it is shown is highly significant (χ²(3) = 97.43; p<.001; λ = .366; p<.001), so TYPE is definitely quite an important variable. However, given the large number of levels of TYPE, the question remains where this significant value comes from. The investigation of the contributions to Chi-square (the Bonferroni-corrected critical value is 7.477) shows that semi-pronominal nouns and proper names as direct objects do not have a significant influence (λ = 0; p = 1).10 Especially remarkable is the result of pronominal direct objects, for which only construction1 is found (λ = .32; p<.001).11 As was argued in the literature, this variable is definitely a very powerful one. Still, what has only been pointed out once so far (cf. Biber et al. 1999: 932) is the highly significant preference of lexical direct objects for construction0 (λ = .366; p<.001). However, this last observation has to be qualified as it is strongly register-dependent.
Table 6:6 Distribution of constructions relative to TYPE

                 Pronominal   Semi-pronominal   Lexical   Proper name   Row totals
Construction0        -               3            186          5           194
Construction1       77              10            115          7           209
Column totals       77              13            301         12           403

For spoken language, lexical direct objects favour construction1 in about 55 per cent of the cases in both my analysis and the one by Biber et al.12 For written data, however, lexical direct objects very strongly prefer construction0 (namely in about 75 per cent of the cases; cf. similar results by Biber et al. 1999: 932) so TYPE: lexical is considerably more important in written language (cf. Figure 10:4 in Appendix 10.2). As to the other levels of TYPE, no register-dependent effect can be found. On the whole, however, the tendency of semi-pronominal nouns to occur in construction1, the significant preference of lexical objects for construction0 (lexical objects are most likely to be newsworthy) and the
behaviour of pronominal objects definitely support the Processing Hypothesis. As the next variable, consider the determiner of the direct object. The corpus analysis resulted in the distribution shown in Table 6:7. The overall distribution is highly significant (χ²(2) = 41.04; p<.001; λ = .206; p<.001), but a more detailed investigation again shows that there is more to be said about this result: again, not all the levels of DET contribute to this overall result. As was argued by Fraser (1974), we find that direct objects without a determiner result in a significant preference for construction1 (λ = .191; p = .016) and direct objects with an indefinite determiner yield a highly significant preference for construction0 (partially supporting Chen 1986; λ = .206; p<.001). However, contrary to what has been previously assumed, direct objects with definite determiners do not yield any preference for a construction (λ = 0; p = 1): even an informal glance at Table 6:7 shows that the observed distribution of constructions for direct objects with definite determiners (86 vs. 89) almost fully matches the one expected from the row totals.13 Once more, however, closer inspection reveals some interesting though hitherto unnoticed patterns. First, let us have a look at the preference of direct objects without a determiner for construction1. On the basis of the overall Chi-square value alone, one might assume that Fraser's claim about this variable's influence on particle placement has been supported. However, this assumption is fundamentally flawed. First, the contributions to Chi-square fail to reach the Bonferroni-corrected critical value of 6.96, and λ is zero. Second, as was argued previously (cf. Chapter 2, n. 5), the level no determiner is quite a large and heterogeneous class.
Table 6:7 Distribution of constructions relative to DET

                 No determiner   Indefinite determiner   Definite determiner   Row totals
Construction0         56                  52                     86               194
Construction1        108                  12                     89               209
Column totals        164                  64                    175               403

For instance, the level no determiner is not independent of TYPE since if there is a determiner, the direct object will be lexical; if there is no determiner, the direct object can, but need not, be lexical at all. Thus, I have tested whether the level no determiner is only significant because it is related to TYPE, and there is indeed such a hidden effect: the above-mentioned correlation between no determiner and the choice of construction is high only because most of the direct objects without a determiner are pronouns; if we partial out the influence of pronominal direct objects on the correlation between no determiner and particle placement, then no determiner has absolutely no impact on the choice of construction: r² = .003; z = 1.01;
p = .311.14 In other words, we have identified a spurious correlation in the data that can only be detected by more careful consideration. Second, in spoken English, only the preference of indefinite determiners for construction0 approaches significance - in written English, the results closely resemble the overall results (cf. Figure 10:5 in Appendix 10.2). In sum, previous assumptions about the role of definite determiners are not supported, but former analyses of the impact of indefinite determiners on particle placement are clearly borne out by the data. Finally, we have seen that Fraser's claim about no determiner correlating with construction1 is only due to the strong influence of pronouns (a finding his monocausal analysis failed to make). These results, in turn, provide empirical support for the Processing Hypothesis as they correspond to the predictions formulated in Chapter 4. In the following section, we turn to the results concerning the semantic variables and their relation to the Processing Hypothesis.
6.1.3 Semantic variables of the Processing Hypothesis
The first semantic variable we will examine is the idiomaticity of the verb phrase; the distribution of the two constructions for this ordinal variable is shown in Table 6:8. This distribution is quite interesting in several respects. First, it is highly significant (χ²(2) = 60.1; p<.001). Second, the highly significant negative correlation coefficient (Somers' d for IDIOMATICITY as the independent variable = -.319; p<.001) shows that, on average, the more idiomatic the meaning of the verb phrase, the more likely construction0 is used (λ = .253; p<.001) and vice versa, i.e. the more literal the verb phrase, the more often construction1 is used (λ = .268; p<.001). Metaphorical constructions, however, do not contribute to the overall effect as they are distributed nearly evenly: λ = .015; p = .82 (contrary to Bolinger's assumption; cf. Chapter 2, n. 10).
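The two Somers' d values reported so far can be retraced from the published frequencies. The sketch below is a minimal Python illustration, assuming the usual asymmetric definition of Somers' d in which pairs tied on the independent variable are left out of the denominator; the counts are those given for COMPLEX in Table 6:1 and for IDIOMATICITY in Table 6:8, and the function name is an illustrative choice.

```python
def somers_d(table):
    """Somers' d with the column variable as the ordered independent variable.

    Rows are the ordered levels of the dependent variable (construction0,
    construction1). Pairs tied on the column variable do not enter the
    denominator.
    """
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(n_rows):
                for m in range(j + 1, n_cols):  # second case lies in a higher column
                    pairs = table[i][j] * table[k][m]
                    if k > i:
                        concordant += pairs
                    elif k < i:
                        discordant += pairs
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(col_totals)
    untied_on_independent = (n * n - sum(c * c for c in col_totals)) // 2
    return (concordant - discordant) / untied_on_independent


# Columns: bare, intermediate, complex direct objects (Table 6:1)
complex_table = [[76, 102, 16],
                 [186, 22, 1]]
# Columns: literal, metaphorical, idiomatic verb phrases (Table 6:8)
idiomaticity_table = [[43, 88, 63],
                      [110, 85, 14]]

print(round(somers_d(complex_table), 3))
print(round(somers_d(idiomaticity_table), 3))
```

Both results are negative, which corresponds to the interpretation in the text: higher levels of the independent variable go together with the lower-valued construction, i.e. construction0.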
Table 6:8 Distribution of constructions relative to IDIOMATICITY

                 Literal <                                     > Idiomatic
                 Literal VP (0)   Metaphorical VP (1)   Idiomatic VP (2)   Row totals
Construction0          43                 88                   63             194
Construction1         110                 85                   14             209
Column totals         153                173                   77             403

Finally, the overall effect is almost totally due to the examples from written texts; in the data on oral discourse, IDIOMATICITY is of little relevance to the choice of construction (cf. Figure 10:6 in Appendix 10.2). Thus, again we see that if there is an effect at all, then it corresponds to traditional findings and the Processing Hypothesis. But there is more to learn:
• we have seen that it is pointless to argue for the influence of a scalar
variable if one does not investigate the impact of each level in detail: it would not be appropriate to simply claim that IDIOMATICITY (consisting of the three levels literal, metaphorical and idiomatic) is influential (as was done by Bolinger 1971 and others) - a more detailed analysis has shown that the effect argued for is only due to two levels, which one would have to include in one's explanation;
• we also see how important it is to control for the influence of REGISTER: again, it would not be appropriate to claim that IDIOMATICITY is generally important since, in oral data, the observed distribution of just one cell deviates significantly from the expected value whereas written data are much more influenced by the meaning of the verb phrase.
The next semantic variable to be investigated in this section is concerned with the concreteness of the referent of the direct object. Consider Table 6:9.

Table 6:9 Distribution of constructions relative to CONCRETE

                 Abstract referent (0)   Concrete referent (1)   Row totals
Construction0            125                      69                194
Construction1             64                     145                209
Column totals            189                     214                403
Obviously, this cross-patterning is significant (χ²(1) = 46.18; p<.001 and λ = .314; p<.001). Although there was no analysis in the previous literature which explicitly postulated such a correlation (recall that this variable was only included to replace part of the faulty entrenchment hierarchy in Gries 1999), we see that, nevertheless, the distribution follows the pattern predicted in the Processing Hypothesis: abstract referents are preferred in construction0 and concrete referents are preferred in construction1. What is more, all the cells in Table 6:9 contribute to this effect, thereby underscoring the importance of this variable. Finally, let us turn to animacy of the referent of the direct object. Consider Table 6:10.

Table 6:10 Distribution of constructions relative to ANIMACY

                 Inanimate (0)   Animate (1)   Row totals
Construction0        177             17           194
Construction1        166             43           209
Column totals        343             60           403

At a superficial glance, ANIMACY plays a role for particle placement
(χ²(1) = 11.08; p<.001). However, before we take this at face value it is worth further scrutinizing this result. First, the correlation coefficient shows that ANIMACY is virtually useless when it comes to predicting the choice of construction: λ = .057; p = .552. Second, the contributions to Chi-square reveal that none of the individual cells reaches the critical value of 6.24. Third, if we take the effect of the register into consideration, we find that of eight cells in two tables not one contains a significant corrected contribution to Chi-square, so the assumption of the influence of ANIMACY is further eroded (cf. Figure 10:8 in Appendix 10.2). Finally, and most importantly, however, it has to be taken into consideration that ANIMACY is not independent of another variable, namely CONCRETE: a referent can only be animate if it is concrete; put differently, the level animate not only tells us that the direct object's referent is animate - it also tells us that the referent is concrete. From this it follows that we need to find out whether ANIMACY really contributes to particle placement because of the information it provides about animacy or whether it contributes to particle placement because of the information it implicitly provides about concreteness (or both). Therefore, I calculated the semi-partial correlation between ANIMACY and the choice of construction where the influence of CONCRETE on particle placement is partialled out. This semi-partial correlation drops down to zero: r² = .001; t(400) = .757; p = .449. In less mathematical terms, ANIMACY only influences particle placement because of its implicit information about the concreteness of the direct object's referent - information about the animacy of the referent as such does not contribute to particle placement at all.
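The logic of this semi-partial correlation can be illustrated with the published tables. The Python sketch below rests on one explicit assumption: since a referent can only be animate if it is concrete, all 60 animate referents are placed among the 214 concrete ones, which yields a cross-tabulation of ANIMACY and CONCRETE that is not printed in the text. The function names are illustrative, and the result is a reconstruction rather than the original computation.

```python
from math import sqrt


def phi(table):
    """Phi coefficient (Pearson's r for two binary variables) of a 2 x 2 table."""
    (a, b), (c, d) = table
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


def semipartial_r(r_xy, r_zy, r_xz):
    """Correlation of y with the part of x that is independent of z."""
    return (r_xy - r_zy * r_xz) / sqrt(1 - r_xz ** 2)


# Rows: level 0 / level 1 of the predictor; columns: construction0 / construction1
animacy_by_construction = [[177, 166], [17, 43]]    # Table 6:10
concrete_by_construction = [[125, 64], [69, 145]]   # Table 6:9
# ASSUMPTION: no abstract referent is animate.
# Rows: inanimate / animate; columns: abstract / concrete
animacy_by_concrete = [[189, 154], [0, 60]]

r_animacy = phi(animacy_by_construction)
r_concrete = phi(concrete_by_construction)
r_overlap = phi(animacy_by_concrete)

sr = semipartial_r(r_animacy, r_concrete, r_overlap)

n, predictors = 403, 2
r_squared_full = r_concrete ** 2 + sr ** 2   # R^2 of the model with both predictors
t = sr * sqrt((n - predictors - 1) / (1 - r_squared_full))

print(round(sr ** 2, 3), round(t, 3))
```

Under the stated assumption, the squared semi-partial correlation and the t value with 400 degrees of freedom come out in line with the figures reported above, which suggests that nearly all of the apparent effect of ANIMACY is carried by its overlap with CONCRETE.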
Again, this is an example showing that sometimes more advanced techniques than the ones commonly found need to be used in order to detect patterns or identify artefacts arising from the data. In sum, all the variables show effects that were predicted on the basis of the Processing Hypothesis. The next section will explore whether this also holds for the discourse-functional variables.
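The λ coefficients reported for the nominal variables in this and the preceding section can be retraced from the frequency tables. The sketch below assumes that λ is Goodman and Kruskal's lambda with the choice of construction as the dependent variable; the reported overall values are consistent with this reading, but the text itself does not spell out the formula here. The function name is an illustrative choice.

```python
def goodman_kruskal_lambda(table):
    """Lambda for predicting the row variable from the column variable.

    It expresses the proportional reduction in prediction errors when the
    most frequent row is guessed within each column rather than overall.
    """
    row_totals = [sum(row) for row in table]
    n = sum(row_totals)
    best_without_predictor = max(row_totals)
    best_with_predictor = sum(max(col) for col in zip(*table))
    return (best_with_predictor - best_without_predictor) / (n - best_without_predictor)


# Rows: construction0, construction1
concrete = [[125, 69], [64, 145]]              # Table 6:9
animacy = [[177, 17], [166, 43]]               # Table 6:10
np_type = [[0, 3, 186, 5], [77, 10, 115, 7]]   # Table 6:6

for name, table in [("CONCRETE", concrete), ("ANIMACY", animacy), ("TYPE", np_type)]:
    print(name, round(goodman_kruskal_lambda(table), 3))
```

A value near zero, as for ANIMACY, means that knowing the level of the variable hardly improves on simply guessing the more frequent construction every time.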
6.1.4 Discourse-functional variables of the Processing Hypothesis
Let us first look at the variable last mention of the referent of the direct object, i.e. at whether the referent of the direct object has been mentioned in the context preceding the VPC. Consider Table 6:11.

Table 6:11 Distribution of constructions relative to LM

                 Discourse-new (0)   Discourse-old (1)   Row totals
Construction0          141                  53              194
Construction1           66                 143              209
Column totals          207                 196              403

This distribution is obviously highly significant (χ²(1) = 68.04; p<.001 and
λ = .387; p<.001). All contributions to Chi-square are highly significant such that construction0 is preferred for discourse-new referents (and dispreferred for discourse-old referents) while construction1 is preferred for discourse-old referents (and dispreferred for discourse-new referents); the Processing Hypothesis is clearly supported.15 On the whole, these results are also found in written language - in spoken data, however, the predictive power of LM is insignificant (cf. Figure 10:9 in Appendix 10.2). Next, we will examine whether distance to the last mention of the referent of the direct object has some influence on particle placement.16 For construction0, the average value is 2.03 (SD: 3.56); for construction1, the average value is 6.07 (SD: 4.36). Although the high standard deviations show that both constructions are far from being used homogeneously with respect to this variable, the large difference between the two average values is, nevertheless, highly significant, as is also evidenced by a comparatively large positive correlation coefficient (r² = .204; t(401) = 10.15; p<.001), which shows that high values of this variable (i.e. short distances to last mention) go together with construction1 whereas low values (i.e. long distances to last mention or no last mention at all) go together with construction0, as was observed by Chen (1986) and predicted by the Processing Hypothesis. While this relationship holds in both registers alike, the strength of the relationship differs markedly (though the difference just fails to reach significance): ActPC has a much stronger influence in written data than in spoken data (cf. Figure 10:10 in Appendix 10.2). Table 6:12 summarizes the distribution of constructions for both registers. This table shows that although there is a clear and significant preference for objects whose referents have not been mentioned before to occur in construction0 (141 vs.
66), even this extreme value of this variable is by no means categorical. Moreover, although there is a clear threshold value, the exceeding of which leads to a preference for construction, (namely six), a significant preference for construction! can only be found for the values nine and ten (10 vs. 27 and 11 vs. 76 respectively). On average, these results hold again for both registers with the striking exception that for referents never having been previously mentioned, only written discourse displays the above-mentioned preference for construction^ in the spoken data, unmentioned referents had no preference, i.e. both constructions are equally likelv to be used. Table 6:12 Distribution of constructions relative to AcrPC (DTLM)
                 Long distance <--                                         --> Short/no distance
                    0    1    2    3    4    5    6    7    8    9   10+   Row totals
Construction0     141    2    1    1    5    8    3    1   11   10   11       194
Construction1      66    2    2    2    5   14    0    1   14   27   76       209
Column totals     207    4    3    3   10   22    3    2   25   37   87       403
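
The t-values reported alongside the squared correlation coefficients in this section can be cross-checked from r2 and the degrees of freedom alone. The following is a minimal sketch, assuming the standard relation t = r * sqrt(df / (1 - r2)) for testing a correlation coefficient; the function name is illustrative and is not taken from the study.

```python
import math


def t_from_r_squared(r_squared: float, df: int) -> float:
    """t statistic for a correlation, given its square and the degrees of freedom."""
    r = math.sqrt(r_squared)
    return r * math.sqrt(df / (1.0 - r_squared))


# Distance to last mention (ActPC): r2 = .204 with df = 401 (403 cases minus 2).
t_actpc = t_from_r_squared(0.204, 401)
print(round(t_actpc, 2))
```

The result should lie close to the reported t(401) = 10.15; small deviations are to be expected because r2 is only given to three decimal places.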
Intimately related to the preceding variable is times of preceding mention of the referent of the direct object. Referents of direct objects are, on aggregate, mentioned .45 (SD: .9) times (i.e. on the whole, not even once) before they occur in construction0 and 1.92 (SD: 2.07) times before they occur in construction1. This difference between no mention and two mentions in the preceding discourse is highly significant, which is also reflected in the (squared) correlation coefficient: r2 = .172; t(401) = 9.14; p<.001. Interpreting these values is straightforward: the more/less often the referent of the direct object has been mentioned in the preceding discourse, the more frequently construction1/construction0 has been chosen respectively, thereby supporting previous research by Chen (1986) as well as the present account. Again, this relationship holds for both registers but is stronger for written discourse (cf. Figure 10:11 in Appendix 10.2). Consider now Table 6:13 for a summary of the distribution for both registers. Obviously, construction1 is already preferred even if the referent of the direct object has been mentioned before just once, and this preference is already significant for referents that have been mentioned twice, a pattern that holds, on the whole, for both registers.

The final variable concerned with the preceding context to be dealt with in this section is cohesiveness of the referent of the direct object to the preceding discourse. The data show that referents of direct objects in construction0 are much less cohesive to the preceding discourse (AM: 1.75; SD: 2.44) than referents of direct objects in construction1 (AM: 4.98; SD: 4.1). This difference is, as might be expected in view of the previous results, highly significant. Similarly, the significant correlation (r2 = .184; t(401) = 9.52; p<.001) also shows that the higher the cohesiveness, the more often construction1 was used, while a low degree of cohesiveness more often resulted in a speaker's decision for construction0. Analogously to the other discourse-functional variables, we again find that the correlation is stronger for written language (cf. Figure 10:12 in Appendix 10.2). From Table 6:14, we can conclude that for both registers together, construction1 is more likely to be chosen after the referent has either been named explicitly or been evoked by twice naming a superordinate term, a part of it or the whole to which it belongs. These results are quite similar to the ones for the written registers; with oral language, even a score of one on the CohPC
Table 6:13 Distribution of constructions relative to TOPM

                 Not/rarely mentioned <--             --> Often mentioned
                    0     1     2     3     4     5    6+   Row totals
Construction0     141    33    12     3     4     1     0      194
Construction1      66    43    38    25    11    12    15      209
Column totals     207    76    50    28    15    13    15      403
Table 6:14 Distribution of constructions relative to CohPC

                 No/low cohesiveness <--                                   --> High cohesiveness
                    0    1    2    3    4    5    6    7    8    9   10+   Row totals
Construction0      81   35   26   18   19    5    2    1    0    1    6       194
Construction1      19   19   39   11   31   10   21   13   11    3   32       209
Column totals     100   54   65   29   50   15   23   14   11    4   38       403
scale (i.e. naming, e.g., a superordinate term only once) is already sufficient to yield a preference for construction1. Finally, since the correlation between particle placement and the cohesiveness to the preceding discourse is slightly higher than the one between particle placement and the times of preceding mention, there is at least a tendency supporting Bolkestein and Risselada's (1987) proposal to substitute the notion of cohesiveness for a strictly co-referentially based notion of topicality along the lines of Givón (1983).

Let us now concentrate on those variables pertaining to the following context, starting with NM. Consider Table 6:15. This distribution is, according to the overall Chi-square value, significant at the 5 per cent level (χ2(1) = 4.33; p = .037). However, the small size of the correlation coefficient indicates that NM does not help much to predict native speakers' choices (λ = .072; p = .299). Moreover, not a single contribution to Chi-square is significant (the smallest p-value is .27), so that even though the overall observed distribution deviates significantly from the expected one, there is no specific effect or level to which the overall significance of the results can be attributed. Finally, if we investigate each register separately, then we find that for both spoken and written language, the distributions are not significant at all (cf. Figure 10:13 in Appendix 10.2). Thus, this variable (which I have argued to be irrelevant on the grounds of the Processing Hypothesis) does not seem to contribute to particle placement in any way worth mentioning. If, however, we take the significant Chi-square value at face value, then we must also note that the distribution of constructions is not the one following from Chen's (1986) results. We will return to this issue with regard to the following two variables and in the context of the multifactorial analysis.
Table 6:15 Distribution of constructions relative to NM

                 Not mentioned again (0)   Mentioned again (1)   Row totals
Construction0              98                       96               194
Construction1              84                      125               209
Column totals             182                      221               403
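
The two statistics quoted for the NM table can be recomputed from its four cell frequencies. The sketch below is a hedged illustration rather than the procedure actually used in the study: it assumes an uncorrected Pearson Chi-square with one degree of freedom and Goodman and Kruskal's λ with the construction as the predicted variable. All function names are illustrative.

```python
import math


def chi_square(table):
    """Pearson Chi-square for a two-way frequency table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def p_value_df1(chi2):
    """Upper-tail probability of a Chi-square value with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def goodman_kruskal_lambda(table):
    """Proportional reduction of error when predicting the row from the column."""
    row_totals = [sum(row) for row in table]
    n = sum(row_totals)
    errors_without = n - max(row_totals)  # always guess the more frequent row
    errors_with = sum(sum(col) - max(col) for col in zip(*table))
    return (errors_without - errors_with) / errors_without


# Rows: construction0, construction1; columns: not mentioned again, mentioned again.
table_nm = [[98, 96], [84, 125]]
chi2_nm = chi_square(table_nm)
p_nm = p_value_df1(chi2_nm)
lambda_nm = goodman_kruskal_lambda(table_nm)
print(round(chi2_nm, 2), round(p_nm, 3), round(lambda_nm, 3))
```

Under these assumptions the three values should agree with those reported in the text, which illustrates how a distribution can be significant overall while knowledge of NM removes only about 7 per cent of the prediction errors.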
Next, we need to establish whether the distance to the next mention of the direct object's referent yields a preference for one of the two constructions: for construction0, the average is 4.28 (SD: 4.56); for construction1, the average is 5.2 (SD: 4.57). From this, one can already guess that the difference will barely be significant, if at all, which is indeed the case: r2 = .01; t(401) = 2.01; p = .046. First, this shows that this variable accounts for just 1 per cent of the variation of particle placement. Second, the correlation between ClusSC and the choice of construction is significant for neither spoken nor written language; cf. Figure 10:14 in Appendix 10.2. This observation drastically undermines the importance of this variable since no other variable with a significant overall distribution lacked an influence in both registers. Finally, if we look at the cross-tabulation in Table 6:16, we see that although there is a cut-off point at the value of eight, none of the higher values (eight, nine and ten) reaches a significant preference for construction1 by a binomial test. In sum, while an initial glance at the statistics suggests this variable to be relevant, a closer look implies otherwise, since all the figures only accessible through more thorough analysis reveal that this variable's influence is quite problematic. Moreover, if the referents of direct objects are mentioned again very soon, construction1 is preferred, although Chen's (1986) analysis suggested the opposite. For now, I take this variable to be of minor importance only; in due course we will return to it to decide on its relevance.

The next variable to be investigated is times of subsequent mention of the direct object's referent. The referents of direct objects in construction0 are, on average, again referred to once (more precisely, AM: 1.04; SD: 1.34) while those of construction1 are referred to twice (more precisely, AM: 1.73; SD: 2.12); this difference is highly significant. Correspondingly, the correlation between TOSM and particle placement is also highly significant: r2 = .037; t(401) = 3.92; p<.001. For the two registers, the situation is somewhat different: in oral language, the relation nearly fails to reach significance, whereas in written language, the relation is stronger; cf. Figure 10:15 in Appendix 10.2. Finally, from the cross-tabulation shown in Table 6:17 no consistent pattern emerges.
Table 6:16 Distribution of constructions relative to ClusSC (DTNM)

                 Long distance <--                                         --> Short/no distance
                    0    1    2    3    4    5    6    7    8    9   10    Row totals
Construction0      98    1    1    3    0    3    5    7    8   19   49       194
Construction1      84    2    1    4    1    3    7    5   12   22   68       209
Column totals     182    3    2    7    1    6   12   12   20   41  117       403
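
The cell-wise binomial tests mentioned for Tables 6:12 and 6:16 can be sketched as follows. This is only an illustration under assumptions not stated in the text: an exact two-sided test against an expected proportion of .5 for each construction. The function name and the cell labels are illustrative.

```python
from math import comb


def binomial_two_sided(k: int, n: int) -> float:
    """Exact two-sided binomial p-value, assuming an expected proportion of .5."""
    lower = sum(comb(n, i) for i in range(0, k + 1))  # P(X <= k) * 2**n
    upper = sum(comb(n, i) for i in range(k, n + 1))  # P(X >= k) * 2**n
    return min(1.0, 2 * min(lower, upper) / 2 ** n)


# (construction0, construction1) frequencies copied from Tables 6:12 and 6:16.
cells = {
    "ActPC = 9": (10, 27),
    "ActPC = 10+": (11, 76),
    "ClusSC = 8": (8, 12),
    "ClusSC = 9": (19, 22),
    "ClusSC = 10": (49, 68),
}
p_values = {
    label: binomial_two_sided(c0, c0 + c1) for label, (c0, c1) in cells.items()
}
for label, p in p_values.items():
    print(label, round(p, 4))
```

With this assumed baseline, the two ActPC cells fall below the 5 per cent level whereas the three ClusSC cells do not, which is in line with the description in the text.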
Table 6:17 Distribution of constructions relative to TOSM

                 Not/rarely mentioned <--       --> Often mentioned
                    0     1     2     3     4    5+   Row totals
Construction0      98    39    26    19     9     3      194
Construction1      84    40    30    16    12    27      209
Column totals     182    79    56    35    21    30      403
Objects that are mentioned four times and more often in the subsequent discourse correlate with a preference for construction1 (again contrary to Chen's analysis). However, only the distribution of constructions in the 5+ column is significant. From all of this, we can conclude that the frequency of the direct object's referents is somehow related to particle placement.

Let us now consider the cohesiveness of the direct object's referent to the subsequent context. The average for construction0 is 3.11 (SD: 2.97), while the average value for construction1 is 4.18 (SD: 4.29), and this difference is very significant: r2 = .02; t(401) = 2.88; p = .004. Although this initially seems a straightforward contribution to particle placement, it is not fully supported by the results in both registers: for oral data, there is no significant correlation; and for written data, the correlation nearly fails to reach significance. Moreover, as with the immediately preceding variable, it is difficult to see any consistent pattern once the data are analysed using cross-tabulation: a degree of cohesiveness higher than nine correlates with construction1 in the preceding VPC, but the overall picture here is less than clear.

The last discourse-functional variable to be investigated is overall mention of the referent of the direct object. For construction0, the referent has, on aggregate, been mentioned 2.48 times (SD: 1.69) in the discourse; for construction1, the figure is nearly twice as large: 4.65 (SD: 3.58). This difference is highly significant: r2 = .128; t(401) = 7.66; p<.001, showing that the more often the referent is mentioned, the more frequently construction1 was chosen. This relationship is valid for both registers although it is, again, slightly stronger in written discourse.
Table 6:18 Distribution of constructions relative to CohSC

                 No/low cohesiveness <--                                   --> High cohesiveness
                    0    1    2    3    4    5    6    7    8    9   10+   Row totals
Construction0      46   23   37   15   17   15   14    7    8    6    6       194
Construction1      49   17   36   12   24    5   14   11    8    3   30       209
Column totals      95   40   73   27   41   20   28   18   16    9   36       403
Table 6:19 Distribution of constructions relative to OM

                 Infrequent <--                          --> Frequent
                    1     2     3     4     5     6     7    8+   Row totals
Construction0      77    44    22    24    14     8     3     2      194
Construction1      43    31    24    28    16    11    14    42      209
Column totals     120    75    46    52    30    19    17    44      403
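
As the discussion of Table 6:19 points out, OM is the sum of the preceding mentions (TOPM), the mention in the VPC itself and the subsequent mentions (TOSM). The short check below, a sketch using only the arithmetic means reported in the running text, shows that the reported averages are consistent with this definition; the dictionary layout and function name are illustrative.

```python
# Arithmetic means (AM) per construction, copied from the running text.
means = {
    "construction0": {"TOPM": 0.45, "TOSM": 1.04, "OM": 2.48},
    "construction1": {"TOPM": 1.92, "TOSM": 1.73, "OM": 4.65},
}


def om_from_parts(topm: float, tosm: float) -> float:
    """Preceding mentions + the one mention in the VPC + subsequent mentions."""
    return topm + 1 + tosm


# Difference between the reconstructed and the reported OM mean, per construction.
differences = {
    name: round(om_from_parts(m["TOPM"], m["TOSM"]) - m["OM"], 2)
    for name, m in means.items()
}
print(differences)
```

Any remaining difference is within rounding of the reported means, which is why OM cannot be expected to carry information beyond that of its two components.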
There is also a clear cut-off point: if the direct object's referent is important enough in the discourse to be mentioned more than two times, then construction1 prevails; if, on the other hand, the referent is only mentioned in the VPC or one more time somewhere else, then construction0 prevails. Note, however, that this variable is fully determined by the variables TOPM and TOSM: each value of OM is a sum of all the occurrences of the referent in the preceding discourse (TOPM), the occurrence in the VPC and all the occurrences of the referent in the subsequent discourse (TOSM). Thus, we need to find out whether the significance of OM is due to some emergent unique contribution of OM or to the two individual variables (just as we tested previously whether the entrenchment hierarchy had some unique influence on the choice of construction or whether its significant correlation with particle placement was only due to its subparts). If we test TOPM, TOSM and OM simultaneously for their predictive power concerning particle placement with a multiple regression analysis, then only TOPM remains significant. In other words, OM is only significant because one of its constitutive variables (i.e. TOPM) is significant, so OM does not contribute to particle placement on its own.17

6.1.5 Remaining variables

The final class of variables to be discussed are those that cannot so easily be classified into firmly established subdisciplines of linguistics. Let us start with directional adverbial following the VPC. This distribution deviates highly significantly from the distribution
Table 6:20 Distribution of constructions relative to PP

                 No following directional PP (0)   Following directional PP (1)   Row totals
Construction0                 182                               12                    194
Construction1                 151                               58                    209
Column totals                 333                               70                    403
expected from H0: χ2(1) = 32.60; p<.001; the correlation coefficient is, however, only marginally significant although the percentage of reduction of error is considerably high (λ = .16; p = .088). The analysis of the contributions to Chi-square shows that the significance only results from the distribution of constructions before a directional adverbial or PP. In other words, speakers do not have a significant preference for a construction if no directional adverbial follows the VPC. It has to be added, though, that this overall result needs to be taken with a pinch of salt since the distribution of the constructions relative to PP is not significant in oral discourse (λ = 0) while highly significant in written texts: if a directional adverbial follows the VPC, then construction1 is more than eight times more frequent than construction0 (cf. Figure 10:18 in Appendix 10.2). Still, we can say that if there is any significant effect at all, then it is in the direction described in previous treatments of this variable and predicted by the Processing Hypothesis.18

Let us now turn to Table 6:21 to examine whether it is important for the choice of construction if the particle is the same as the preposition from a following directional PP. As was previously said, the data do not really provide conclusive evidence as to the importance of this variable (but cf. Chapter 4, note 16). The distribution found is, of course, not significant (χ2(1) = .003; p = .958 and λ = 0; p = 1) and, given the scarcity of the data in this case, a comparison of different registers for significant differences is utterly pointless; it is mentioned here only for the sake of completeness that both examples are from oral discourse:

(67) It means you can pack in a lot more things in your day.
(68) Kids could fill it in in pencil.
We will now investigate the impact of DISFLUENCY. Since disfluencies do not, of course, occur in written data, we shall limit our analysis to oral data only. Consider Table 6:22. Given the limited number of disfluencies, the overall picture is far from clear: Table 6:22 does not display a clear cut-off point for either construction and the overall correlation is, accordingly, not significant at all (r2 = .007; t(198) = 1.19; p = .235), so it seems as if this variable does not contribute to particle placement significantly (at least when corpus data are investigated, which, however, might not be an ideal source of data on disfluencies).

Table 6:21 Distribution of constructions relative to Part = Prep
                 Part ≠ Prep (0)   Part = Prep (1)   Row totals
Construction0    193 (48.13%)      1 (50%)           194 (48.14%)
Construction1    208 (51.87%)      1 (50%)           209 (51.86%)
Column totals    401 (100%)        2 (100%)          403 (100%)
Table 6:22 Distribution of constructions relative to disfluency (DISFLUENCY)

                    0     1     2     3     4   Row totals
Construction0      57     9     1     0     0       67
Construction1     128     2     1     1     1      133
Column totals     185    11     2     1     1      200

Table 6:23 Distribution of constructions relative to the register (REGISTER)

                 Spoken         Written         Row totals
Construction0    67 (33.5%)     127 (62.56%)    194 (48.14%)
Construction1    133 (66.5%)    76 (37.44%)     209 (51.86%)
Column totals    200 (100%)     203 (100%)      403 (100%)
Finally, let us turn to the influence of the register. Consider Table 6:23. Even at first glance, it is quite obvious that this distribution deviates from the one that would be expected if REGISTER did not correlate with the choice of construction: χ2(1) = 34.08; p<.001 and λ = .263; p<.001. Moreover, all the contributions to Chi-square are significant, so we can safely conclude that construction0 is strongly preferred in written language whereas construction1 is similarly strongly preferred in oral language.
6.1.6 Interim summary

The quite long treatment of monofactorial results in the preceding sections makes it hard to keep track of the individual results (i.e. the variables' contributions to particle placement and the question of whether they conform to the Processing Hypothesis or not). Therefore, this section will briefly review the multitude of results. We begin by considering Table 6:24, where all the correlation coefficients from the data are listed in order to get an overview of each variable's strength in isolation. The variables are sorted roughly according to their strength (only roughly because the sizes of r2, λ and Somers' d are also largely determined by the measurement scale of the variables and the way they are calculated). The coefficients marked by ® are those that are significant although closer inspection reveals that they are not relevant, or at least that their relevance is questionable, given the results of partial correlations and contributions to Chi-square. Several conclusions can be drawn on the basis of the results. First, two overall patterns emerge: on the whole, higher correlation coefficients are
Table 6:24 Correlational strength of each variable

Variable / VARIABLE: Value/Level                                                       Correlation
Complexity of the direct object (COMPLEX)                                              d = -.524
COMPLEX: Simple NP                                                                     λ = .49
COMPLEX: Intermediate NP                                                               λ = .412
Last mention of the referent of the direct object (LM)                                 λ = .387
NP Type of the direct object (TYPE)                                                    λ = .366
TYPE: Lexical NP                                                                       λ = .366
TYPE: Pronominal NP                                                                    λ = .32
Idiomaticity of the verb phrase (IDIOMATICITY)                                         d = -.319
Concreteness of the referent of the direct object (CONCRETE)                           λ = .314
IDIOMATICITY: Literal VP                                                               λ = .268
Register (REGISTER)                                                                    λ = .263
Length of the direct object in syllables (LENGTHS)                                     r2 = .254
IDIOMATICITY: Idiomatic VP                                                             λ = .253
Determiner of the direct object (DET)                                                  λ = .206
DET: Indefinite determiner                                                             λ = .206
Distance to last mention of the referent of the direct object (ActPC)                  r2 = .204
Length of the direct object in words (LENGTHW)                                         r2 = .202
DET: No determiner                                                                     λ = .191
Cohesiveness of the referent of the dir. object to the preceding discourse (CohPC)     r2 = .184
Times of preceding mention of the referent of the direct object (TOPM)                 r2 = .172
Directional adverbial following the direct object (PP)                                 λ = .16 ns
Overall mention of the referent of the direct object (OM)                              r2 = .128 ®
COMPLEX: Complex NP                                                                    λ = .077
Next mention of the referent of the direct object (NM)                                 λ = .072 ns
Animacy of the referent of the direct object (ANIMACY)                                 λ = .057 ns
Times of subsequent mention of the referent of the direct object (TOSM)                r2 = .037
IDIOMATICITY: Metaphorical VP                                                          λ = .015 ns
Cohesiveness of the referent of the dir. object to the subsequent discourse (CohSC)    r2 = .02
Distance to next mention of the referent of the direct object (ClusSC)                 r2 = .01 ®
Production and planning effects (DISFLUENCY)                                           r2 = .007 ns
TYPE: Semi-pronominal NP                                                               λ = 0 ns
TYPE: Proper noun                                                                      λ = 0 ns
DET: Definite determiner                                                               λ = 0 ns
Particle equals the preposition of the following PP (PART = PREP)                      λ = 0 ns
observed for morphosyntactic variables since, with the exception of LM and CONCRETE, the variables at the top of the table are COMPLEX, TYPE, LENGTHS and LENGTHW. The morphosyntactic variables are, again generally speaking, followed by some discourse-functional variables concerned
with the preceding context, namely CohPC, TOPM and ActPC. No consistent pattern is visible for semantic variables, but it can still be said that the discourse-functional variables concerned with the subsequent context are, on the whole, of quite limited relevance. Furthermore, if these latter discourse-functional variables show an effect at all, it is not the one argued for by Chen (1986): if the referent of the direct object is mentioned again frequently and soon after the VPC, it is construction1 that is preferred. In other words, as far as these variables are concerned, the present study yields exactly the opposite results of Chen (1986). What do we make of this difference? A possible answer goes along the following lines. If, on the one hand, a direct object's referent that has not been mentioned before is to be introduced into the discourse, the speaker will choose construction0 for the reasons outlined in the Processing Hypothesis. At the same time, it is of course to be expected that a speaker will not introduce a new concept into a discourse by positioning it in the canonical (sentence-final) position for focal elements if he did not also want to refer to it again in the subsequent discourse. Thus, we would expect a correlation between construction0 and frequent mention in the following discourse if the referent of the direct object has not been mentioned before the VPC. This is roughly what Chen (1986) expected to find, although he did not argue that this was to be found only for discourse-new referents of the direct object. Looking at the data, this is indeed what we find: ClusSC (DTNM), TOSM and CohSC all prefer construction0 if the referent has not been mentioned before, although the correlations fail to reach significance.

If, on the other hand, a referent has been mentioned quite often before the VPC, then it is hardly newsworthy (which is why construction1 will be chosen), and the referent of the direct object can safely be assumed to be, among other things, the topic of the text/discourse. Since it is unrealistic to assume that each time after a topical referent has been used as a direct object in a VPC the topic will change (so that the referent will not occur any more), it will quite naturally still be referred to even after the VPC (the more often it is mentioned, the less likely construction0 will become). Therefore, we would expect, and do indeed find, that if a referent is already familiar, then construction1 is chosen even if the referent is mentioned in the discourse following the VPC. In sum, Chen's (1986) claim is only partly correct since it only holds for contexts where the referent of the direct object has not been mentioned before.

The second conclusion to be drawn is that nearly all of the variables that are relevant according to the Processing Hypothesis showed significant effects in the predicted direction, and even those that did not (e.g. DISFLUENCY) at least did not show an effect running counter to my hypothesis, but rather showed no effect at all. Moreover, most of the variables that are, according to the Processing Hypothesis, irrelevant could indeed be shown to be irrelevant. In other words, on the basis of the monofactorial data at least, the Processing Hypothesis is clearly borne out by the data and has received
overwhelming support: it could not be falsified and should, thus, be accepted.

Lastly, several variables differ with respect to their behaviour in the two different registers: some morphosyntactic variables (TYPE and DET), the semantic variables (i.e. IDIOMATICITY, CONCRETE and ANIMACY), the discourse-functional variables concerning the preceding context and, finally, PP are stronger in written language than in spoken language, whereas COMPLEX, LENGTHW and LENGTHS are stronger in oral language.19 This might be partly due to the fact that some variables' values/levels are much more frequent in one register than in the other so that a significant effect is more likely to be found in the corpus data. For instance, idiomatic verb phrases and abstract referents are much more common in written texts. Moreover, Chi-square tests are sensitive to the number of items entering into the analysis: if all frequencies in a table were doubled, the Chi-square value would also be doubled although the strength of the relationship between the two variables would be the same. However, this is probably not the main reason for our findings because the correlation coefficients given for each table are not as sensitive to the number of items investigated. Another, more plausible explanation is concerned with the fact that the production of written texts is much more planned than the often spontaneous online production of oral discourse. It is possible that spontaneous production is less sensitive to discourse-functional variables concerning the preceding context since difficulties in understanding (such as, e.g., assigning reference and accessing referents) on the part of the hearer could be resolved immediately; in carefully planned written texts, however, a writer must be unambiguous without having the possibility of immediate conversational repair of mistakes.
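
The point about sample size can be made concrete with the REGISTER frequencies from Table 6:23. The sketch below assumes an uncorrected Pearson Chi-square and Goodman and Kruskal's λ (with the construction as the predicted variable) as the measure of strength; the doubled table is purely hypothetical and the function names are illustrative.

```python
def chi_square(table):
    """Pearson Chi-square for a two-way frequency table (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (observed - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, observed in enumerate(row)
    )


def goodman_kruskal_lambda(table):
    """Proportional reduction of error when predicting the row from the column."""
    row_totals = [sum(row) for row in table]
    errors_without = sum(row_totals) - max(row_totals)
    errors_with = sum(sum(col) - max(col) for col in zip(*table))
    return (errors_without - errors_with) / errors_without


# Rows: construction0, construction1; columns: spoken, written (Table 6:23).
table_register = [[67, 127], [133, 76]]
# Hypothetical table in which every frequency is doubled.
doubled = [[2 * cell for cell in row] for row in table_register]

chi2_original = chi_square(table_register)
chi2_doubled = chi_square(doubled)
lambda_original = goodman_kruskal_lambda(table_register)
lambda_doubled = goodman_kruskal_lambda(doubled)
print(round(chi2_original, 2), round(chi2_doubled, 2))
print(round(lambda_original, 3), round(lambda_doubled, 3))
```

The Chi-square value doubles with the frequencies while λ stays the same, which is why the coefficients rather than the Chi-square values are used here to compare registers.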
So far, quite a lot has been achieved: we now have quite a clear idea of particle placement as we know how every single value/level (e.g. no determiner, indefinite determiner and definite determiner) of each variable (e.g. DET) influences the choice of construction. In this respect, note that it is not enough to simply claim that IDIOMATICITY yields a preference for construction0 - only the stepwise investigation of every level of a variable has shown that not all levels are equally powerful (note the lack of influence of metaphorical verb phrases). Additionally, we have, extending the work by Biber et al. (1999), also shown how REGISTER influences particle placement to such a degree that, I dare say, analyses of particle placement neglecting register influences can definitely not be descriptively adequate, if, by descriptively adequate, it is meant that we really know 'what is going on'. Again, simply claiming that following directional adverbials prefer construction1 is too general since we have found that this only holds for written data. Of course, the question arises as to why the register has the influence it has - one would probably hesitate to assume that there is a direct causal relationship between REGISTER and particle placement. Apart from the characteristics of different registers, one can examine whether REGISTER is related to other variables that in turn causally influence particle placement. Upon doing that, we find that REGISTER correlates very significantly with
IDIOMATICITY (Somers' d = .309; p<.001), COMPLEX (Somers' d = .144; p = .003) and LENGTHS (rpbis = .237; t(401) = 4.88; p<.001). That is, in oral language short, simple direct object noun phrases and literal verb phrases prevail, whereas the opposite kind of noun phrases are more frequently used in written language. Since we have seen that exactly those variables are among the most influential ones determining particle placement, the correlation between REGISTER and particle placement comes as no surprise: it follows naturally from characteristics of oral and written discourse.

Lastly, we have for the first time a hint of an idea as to the power of the individual variables. In other words, I would argue that we are close to a description of particle placement that I would call 'descriptively adequate'. However, more detail is necessary since, strictly speaking, one must not compare the strengths of variables in terms of the absolute values of the correlation coefficients. While this is because of purely mathematical reasons, the exact nature of which is not our concern here, it nevertheless forces us to find another way of assessing the variables' power. This will be the focus of the following section.

6.2 Pair-wise comparisons: the relative strengths of variables

6.2.1 Introduction

After the monofactorial analysis of section 6.1, this section will discuss a different method to quantify variables' strengths, namely assessing the relative strength of the variables by investigating what happens in cases of conflicting values/levels,20 i.e. in cases where one variable's value/level prefers one construction while another variable's value/level prefers the opposite construction (note the example in section 2.6.2). Consider (69) and (70).

(69) Fred turned down an offer.
(70) He brought back the big black table into his apartment.
As was briefly mentioned previously, in (69) the simple direct object would prefer construction1 while the indefinite determiner would prefer construction0. Likewise, in (70) LENGTHW/LENGTHS would prefer construction0, while the definite determiner and PP would prefer construction1. So which will be the most natural choice for a native speaker of English? This is quite difficult (if not impossible) to decide introspectively, and to date no analysis of particle placement (or many other cases of syntactic variation, for that matter) has ever dealt with these and related questions in spite of their frequency. This section will investigate cases with two21 conflicting values/levels of variables (using multi-way cross-tabulation) in order to assess the relative strength of each variable. Before we look at the results, I will (i) define the set of comparisons that were performed and (ii) outline the general procedure of investigating oppositions of variables by discussing two comparisons in some detail.
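
The logic of such a comparison of conflicting values/levels can be sketched in a few lines. The records below are hypothetical toy observations invented for illustration only (they are not corpus data), the preference table merely encodes what was said about example (69), and all names are illustrative; the study's actual procedure is described in the remainder of this section.

```python
from collections import Counter

# Hypothetical toy observations, invented for illustration only (not corpus data).
# Each record: level of COMPLEX, level of DET and the construction chosen (0 or 1).
observations = [
    {"COMPLEX": "simple", "DET": "indefinite", "construction": 1},
    {"COMPLEX": "simple", "DET": "indefinite", "construction": 1},
    {"COMPLEX": "simple", "DET": "indefinite", "construction": 0},
    {"COMPLEX": "simple", "DET": "indefinite", "construction": 1},
    {"COMPLEX": "complex", "DET": "indefinite", "construction": 0},
    {"COMPLEX": "simple", "DET": "definite", "construction": 1},
]

# Construction preferred by a level in isolation, as described for example (69).
preferred = {
    ("COMPLEX", "simple"): 1,
    ("DET", "indefinite"): 0,
}


def conflict_tally(records, var_a, var_b):
    """Count which variable 'wins' in cases where the two levels conflict."""
    tally = Counter()
    for rec in records:
        pref_a = preferred.get((var_a, rec[var_a]))
        pref_b = preferred.get((var_b, rec[var_b]))
        if pref_a is None or pref_b is None or pref_a == pref_b:
            continue  # no recorded preference, or no conflict between the levels
        winner = var_a if rec["construction"] == pref_a else var_b
        tally[winner] += 1
    return tally


tally = conflict_tally(observations, "COMPLEX", "DET")
print(dict(tally))
```

In this toy set, only the four records combining a simple direct object with an indefinite determiner count as conflicts; the variable whose preferred construction is chosen more often in such cases would be regarded as the relatively stronger one.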
I will compare only those variables that have a highly significant influence on particle placement (that was unproblematic even on closer inspection) and whose correlation coefficient is larger than .22 or smaller than -.22. The general reason for this is, first, to ensure that I deal with really relevant variables only; the reason for the seemingly odd choice of .22 as a threshold value is that this value guarantees that only variables accounting for at least about 5 per cent of the variance of particle placement will be included (.22 squared is approximately .048). Second, from each of these variables, I will only take those values/levels into consideration that contribute to particle placement significantly. Therefore, while we will look at DET, TYPE and IDIOMATICITY, we will not investigate definite determiners, proper names and metaphorical senses of the verb phrase. Lastly, I will not consider:

1 oppositions of variables whose intercorrelation is larger than .7 since similarly high intercorrelations normally result from the fact that, in the terminology of factor analysis, these manifest variables measure the same latent factor: it would, for instance, be quite senseless to contrast LENGTHS with LENGTHW or COMPLEX;
2 oppositions of variables where one variable restricts the set of possible values/levels of the other; e.g. it would be futile to contrast TYPE with DET as the levels pronoun, semi-pronoun and proper name of TYPE, at least generally, rule out any kind of determiner.

Thus, we need to analyse 88 different situations where values of two variables can conflict with one another. Table 6:25 shows which comparisons have been performed. Finally, it needs to be explained on what basis the scalar variables COMPLEX, LENGTHS, LENGTHW, ActPC, TOPM and CohPC are dichotomized. Two options are possible. One option would be to calculate the arithmetic means of these variables over all instances and take this to be the cut-off point for the dichotomization.
For instance, the arithmetic mean of LENGTHS is 4.57, so one could say that every direct object longer than four syllables is long while each direct object with fewer than five syllables is short. However, I have chosen not to proceed this way since (i) the arithmetic means of these variables have very large standard deviations (e.g. 4.56 for LENGTHS) and (ii) the distribution of, e.g., LENGTHS is highly skewed (rather than normally distributed). Therefore, I have chosen a second option that dichotomizes these variables at the cut-off points they display with respect to particle placement (cf. the discussion of the individual variables in the preceding section). This procedure corresponds maximally to the purpose of the contrastive comparisons, as we are interested in potentially strong values conflicting with one another: it would be implausible to use strictly mathematically determined means (unrelated to the choice of construction) rather than cut-off points (which are, by definition, relevant for the choice of construction). For ease of reference, Table 6:26 summarizes the two classes into which each dichotomized variable has been divided on the basis of its above-mentioned cut-off point.

Table 6:25 Variables and values/levels to be contrasted
Variable | Values/levels | Contrasted with the values/levels of these variables
COMPLEX | simple, complex | DET, LM, ActPC, TOPM, CohPC, CONCRETE, IDIOMATICITY, PP, REGISTER
TYPE | lexical, pronominal | LM, ActPC, TOPM, CohPC, CONCRETE, IDIOMATICITY, PP, REGISTER
DET | indefinite | COMPLEX, LENGTHS, LENGTHW, LM, ActPC, TOPM, CohPC, CONCRETE, IDIOMATICITY, PP, REGISTER
LENGTHS | short, long | DET, LM, ActPC, TOPM, CohPC, CONCRETE, IDIOMATICITY, PP, REGISTER
LENGTHW | short, long | DET, LM, ActPC, TOPM, CohPC, CONCRETE, IDIOMATICITY, PP, REGISTER
LM | new, familiar | COMPLEX, TYPE, DET, LENGTHS, LENGTHW, CohPC, CONCRETE, IDIOMATICITY, PP, REGISTER
ActPC | low, high | COMPLEX, TYPE, DET, LENGTHS, LENGTHW, CohPC, CONCRETE, IDIOMATICITY, PP, REGISTER
TOPM | low, high | COMPLEX, TYPE, DET, LENGTHS, LENGTHW, CONCRETE, IDIOMATICITY, PP, REGISTER
CohPC | low, high | COMPLEX, TYPE, DET, LENGTHS, LENGTHW, LM, ActPC, CONCRETE, IDIOMATICITY, PP, REGISTER
CONCRETE | abstract, concrete | COMPLEX, TYPE, DET, LENGTHS, LENGTHW, LM, ActPC, TOPM, CohPC, PP, REGISTER
IDIOMATICITY | literal, idiomatic | COMPLEX, TYPE, DET, LENGTHS, LENGTHW, LM, ActPC, TOPM, CohPC, PP, REGISTER
PP | yes | COMPLEX, TYPE, DET, LENGTHS, LENGTHW, LM, ActPC, TOPM, CohPC, CONCRETE, IDIOMATICITY, REGISTER
REGISTER | oral, written | COMPLEX, TYPE, DET, LENGTHS, LENGTHW, LM, ActPC, TOPM, CohPC, CONCRETE, IDIOMATICITY, PP

Table 6:26 Division of variables into two classes (dichotomization)
Variable | Class 1 (simple/short/low) | Class 2 (complex/long/high)
COMPLEX | simple | intermediate and complex
LENGTHS | < 4 syllables | > 3 syllables
LENGTHW | < 3 words | > 2 words
ActPC | < 7 | > 6
TOPM | 0 | > 0
CohPC | < 2 | > 1

Let us now first consider a simple case, namely the levels mentioned for (69), here repeated as (71): COMPLEX: simple and DET: indefinite.
(71) Fred turned down an offer.
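To make the coding concrete, the following minimal sketch applies the cut-off points of Table 6:26 to the direct object of (71). The names `CLASS2_FROM` and `dichotomize` are illustrative assumptions and not part of the original study; the counts for "an offer" (3 syllables, 2 words, determiner included) are my own.

```python
# Smallest value that already falls into class 2 (long/high), per Table 6:26.
# Hypothetical helper names; only the cut-off values come from the table.
CLASS2_FROM = {
    "LENGTHS": 4,  # > 3 syllables
    "LENGTHW": 3,  # > 2 words
    "ActPC": 7,    # > 6
    "TOPM": 1,     # > 0
    "CohPC": 2,    # > 1
}


def dichotomize(variable: str, value: int) -> str:
    """Return 'class 1' (short/low) or 'class 2' (long/high)."""
    return "class 2" if value >= CLASS2_FROM[variable] else "class 1"


# Direct object of (71), "an offer": 3 syllables, 2 words (assumed counts).
offer = {"LENGTHS": 3, "LENGTHW": 2}
coded = {name: dichotomize(name, value) for name, value in offer.items()}
print(coded)
```

Under these cut-offs a four-syllable direct object already counts as long, whereas a split at the arithmetic mean of 4.57 would still count it as short.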
In order to find out which of the two levels has a stronger influence on particle placement, we need to find the preference for a construction in the combination simple direct object vs. indefinite determiner of the direct object. But what would be a suitable way of analysis? One possibility is to simply list each variable together with its number of significant wins and losses (and draws) against other variables. Consider Table 6:27. There are 20 sentences with simple direct objects having an indefinite determiner in the corpus data. This small number is due to the fact that indefinite determiners are normally used for referents to be introduced into the discourse, which in turn are often heavily modified and thus quite complex; as a comparison, observe that there are more than twice as many cases of complex direct objects having indefinite determiners. The question now is whether the observed distribution (11 vs. 9) deviates significantly from the expected distribution (10 vs. 10). In this case, it is quite obvious that the distribution is not significant at all (p(binomial) = .824) so that neither of the two levels (simple or indefinite) is stronger in enforcing its usual constructional preference. A similar strategy could be applied to the case of COMPLEX vs. LM. Two pairs of conflicting combinations are possible: simple (requiring construction1) vs. discourse-new (requiring construction0) and complex (requiring construction0) vs. discourse-familiar (requiring construction1). Thus, consider Table 6:28. In this case, we have to test whether the observed distributions (27 vs. 7 for complex and discourse-familiar; 50 vs. 50 for simple and discourse-new) differ significantly from the expected distributions. The first of these pairs is indeed significant (p<.001), whereas the second one is obviously not.
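The significance tests just reported can be reproduced with an exact two-sided binomial test against an even split. The sketch below assumes that this is the test behind the reported p-values; the function name `binom_two_sided` is hypothetical.

```python
from math import comb


def binom_two_sided(k: int, n: int) -> float:
    """Exact two-sided binomial p-value for k successes in n trials, p = .5."""
    lower = sum(comb(n, i) for i in range(0, k + 1)) / 2 ** n
    upper = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # The distribution is symmetric for p = .5, so doubling the smaller tail
    # (capped at 1) gives the two-sided value.
    return min(1.0, 2 * min(lower, upper))


p_simple_indefinite = binom_two_sided(11, 20)   # Table 6:27: 11 vs. 9
p_complex_familiar = binom_two_sided(27, 34)    # Table 6:28: 27 vs. 7
p_simple_new = binom_two_sided(50, 100)         # Table 6:28: 50 vs. 50
print(p_simple_indefinite, p_complex_familiar, p_simple_new)
```

With these inputs the first value rounds to .824, the second falls below .001 and the third is 1, in line with the figures given in the text.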
From this it follows that (i) the level complex of the variable COMPLEX 'wins significantly against' the level discourse-familiar of the variable LM (because complex prefers construction0, which is indeed much more frequent), whereas (ii) the level simple of the variable COMPLEX is as strong as the level discourse-new of the variable LM.

Table 6:27 Distribution of constructions with COMPLEX: simple and DET: indefinite
Construction0 | 11
Construction1 | 9
Column total | 20

Table 6:28 Distribution of constructions relative to COMPLEX and LM
 | Discourse-familiar, simple | Discourse-familiar, complex | Discourse-new, simple | Discourse-new, complex | Row totals
Construction0 | 26 | 27 | 50 | 91 | 194
Construction1 | 136 | 7 | 50 | 16 | 209
Column totals | 162 | 34 | 100 | 107 | 403

Once all comparisons have been performed we would simply have to add the numbers of significant wins and losses (and draws) and compute an index of strength for each variable that could be compared to all others. Unfortunately, this way of analysis is problematic in that (i) some variables enter into this comparison with only one level and some with two (it might be the case that this representation misses the fact that some levels of variables are more powerful than others), and (ii) this analysis implies that each variable is equally strong. A second possibility, therefore, is to simply sum up the significant wins and losses and draws for each level (rather than each variable) in a similar index of strength. Nevertheless, this way of analysis is also inadequate for a simple statistical reason. Note the discussion of TYPE on particle placement in section 6.1.2, where it was shown that construction1 is obligatory with pronominal direct objects regardless of what values/levels the other variables take on. However, since the two ways of analysis just suggested involve testing the distributions of the two constructions for significance, it may (and does in fact) happen that the number of constructions for a possible opposition is too small to reach significance, thus refusing to credit the variable with a win.
The final and most viable way of analysis is, therefore, the following. For each variable's level it is counted how often this level has been found in a construction entering into a contrastive comparison and how the constructions were distributed in all these comparisons. Let us briefly go through an example, namely that of TYPE: lexical. As was shown in Table 6:25, this level is contrasted with the levels of seven others. When TYPE: lexical is contrasted with LM: new, then we find 109 instances with this combination in the data: 51 instances of construction0 and 58 instances of construction1.
If, likewise, TYPE: lexical is contrasted with ActPC: high, then we find 87 instances with this combination of levels in the data: 38 instances of construction0 and 49 instances of construction1. Thus, after having performed these two comparisons of TYPE: lexical (with LM: new and with ActPC: high), we find 89 (51 + 38) instances of construction0 and 107 (58 + 49) instances of construction1. If we continue to do this for all variables, we finally get the numbers of construction0 and construction1 enforced by a particular level (here TYPE: lexical) if it is contradicted by another level. The distribution we get can then be tested for significance and summarized by an index of strength. Thus, consider Table 6:29.
Table 6:29 Strength of levels of variables in terms of construction distributions
Variable: Level | Construction0 | Construction1 | p (binomial) | Strength | Rank
TYPE: pronoun | 0 | 63 | .000 | 1 | 1
COMPLEX: complex | 221 | 89 | .000 | .426 | 2
DET: indefinite | 137 | 59 | .000 | .398 | 3
LENGTHW: long | 230 | 107 | .000 | .365 | 4
PP: yes | 93 | 184 | .000 | .329 | 5
IDIOMATICITY: idiom | 146 | 87 | .000 | .253 | 6
LENGTHS: long | 264 | 163 | .000 | .237 | 7
CohPC: low | 219 | 159 | .002 | .159 | 8
LENGTHS: short | 234 | 289 | .018 | .105 | 9
COMPLEX: simple | 323 | 339 | .560 | .024 | 10
IDIOMATICITY: literal | 255 | 264 | .726 | .017 | 11
LENGTHW: short | 335 | 343 | .788 | .012 | 12
ActPC: high | 175 | 179 | .873 | .011 | 13
TOPM: low | 267 | 285 | .469 | -.033 | 14
LM: new | 292 | 313 | .416 | -.035 | 15
ActPC: low | 344 | 387 | .120 | -.059 | 16
LM: familiar | 228 | 199 | .175 | -.068 | 17
TOPM: high | 228 | 199 | .175 | -.068 | 17
CONCRETE: concrete | 414 | 350 | .023 | -.084 | 19
REGISTER: oral | 435 | 363 | .012 | -.090 | 20
CohPC: high | 417 | 344 | .009 | -.096 | 21
TYPE: lexical | 399 | 493 | .002 | -.105 | 22
CONCRETE: abstract | 284 | 409 | .000 | -.180 | 23
REGISTER: written | 334 | 565 | .000 | -.257 | 24
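The definition of the index of strength is not spelled out in the running text. The values in Table 6:29 are, however, consistent with a normalized difference between the number of instances of the construction a level prefers and the number of instances of the other construction. The sketch below rests on that inference; it is not the author's stated formula, and the function name `strength` is hypothetical.

```python
def strength(preferred: int, dispreferred: int) -> float:
    """Normalized difference (preferred - dispreferred) / total.

    Inferred from the figures in Table 6:29, not quoted from the study.
    """
    return (preferred - dispreferred) / (preferred + dispreferred)


# COMPLEX: complex prefers construction0 (221 vs. 89 instances).
complex_complex = strength(221, 89)
# PP: yes prefers construction1 (184 vs. 93 instances).
pp_yes = strength(184, 93)
# TYPE: lexical prefers construction0 (399 vs. 493 instances).
type_lexical = strength(399, 493)
# TYPE: pronoun prefers construction1 (63 vs. 0 instances).
type_pronoun = strength(63, 0)

print(round(complex_complex, 3), round(pp_yes, 3),
      round(type_lexical, 3), type_pronoun)
```

A positive value means that the level enforces its usual preference more often than not when another level contradicts it; a negative value means that it tends to be overridden.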
The morphosyntactic variables are the most powerful ones (average strength: .274), but some levels of semantic variables also play a comparatively important role (average strength: .002) [n. 27]. Least powerful are, on the whole, the discourse-functional variables (average strength: -.024). What is more, for most variables, there seems to be one powerful level and one much less powerful level, e.g. TYPE ranks 1st and 22nd; COMPLEX ranks 2nd and 10th; IDIOMATICITY ranks 6th and 11th; CohPC ranks 8th and 21st. That is, some encoding types, some semantic properties and some informational characteristics of referents are quite important for the choice of construction, whereas others, although attributes of the same variables, are not. Finally, the strongest levels (apart from TYPE: pronoun and PP: yes) are levels preferring construction0.

6.2.2 Interim summary
After having dealt with absolute strengths of variables in terms of correlation coefficients in section 6.1, we have seen in section 6.2 that it is also possible to examine variables according to their relative strengths by looking at the consequences of conflicting levels of variables. On the whole, morphosyntactic variables turned out to be the most powerful ones (basically supporting the monofactorial results), followed by semantic and other variables (PP and REGISTER), followed by the discourse-functional variables. However, it was also shown that the values of individual variables are not uniformly powerful in that there are considerable differences concerning their impact on particle placement. While my initial criticism of the descriptive power of the results from previous research might have sounded overly harsh, I think it has become clear how little was known about particle placement from these studies: admittedly, we knew that some values/levels of variables contribute to the choice of construction, and many of the variables introspectively postulated were indeed found to be relevant, but observe also that:
• nothing was known about the degree/strength of the influence;
• nothing was known about how each value/level affects particle placement in isolation or in (at least binary) conflicting situations;
• it was unclear whether all sub-parts of variables are equally strong;
• it was unclear to what extent speakers' subconscious decisions for either construction are also determined by general properties of the register.
Lastly, we have also seen that the method of pair-wise oppositions (especially the third way of calculation) enables us to identify strengths of variables that are not evident from correlation coefficients alone (note TYPE: pronoun) and cannot be detected by the otherwise also problematical use of minimal pair tests (cf. section 2.6.2). The following section will explore this in more detail in that I will show how a more adequate way of analysis can account for a better decision process.
6.3 Multifactorial results
6.3.1 Introduction
Until now, we have investigated both the absolute monofactorial strengths of variables and their relative strengths in pair-wise oppositions. This has provided a multitude of descriptive information extending our knowledge about particle placement considerably. Yet this way of analysis is still not sufficient since, whenever speakers subconsciously decide which construction to choose in a particular situation, they do not consider monofactorial frequency information, correlations or frequency outcomes of oppositions of conflicting values/levels. Rather, in actual discourse, all the values/levels are given at the point of time where a speaker finally has to decide for either construction. More precisely, when the speaker starts formulating an utterance with a VPC, then for all the known variables the values/levels are set: e.g., the speaker knows whether the referent of the direct object is abstract or not, active or not (depending on the preceding context), etc. Thus, in this section I will illustrate how all the variables simultaneously yield a preference for one construction over the other. For such an analysis, previously used methods in syntactic research are, as we have seen, totally inadequate, so more advanced techniques are required. One of the most broadly applicable techniques for problems of this degree of complexity is the so-called General Linear Model (GLM). The GLM is a technique that is based on and derived from simple correlational measures, but can be applied to designs of practically unlimited complexity: the influence of many independent variables (on both a nominal and an interval level) on many dependent variables (again, on both a nominal and an interval level) can be quantified precisely. The most important result for the given study is a multiple correlation coefficient R that indicates how well the independent variables relate to the dependent variable, that is, the choice of construction. A particular sub-model of the GLM is called discriminant analysis. In discriminant analysis, one tries to predict which value a single dependent nominal variable will take on, given a particular constellation of nominal and interval variables. In other words, with a discriminant analysis an analyst tests how well we can discriminate between different values of a dependent nominal variable, given some set of independent variables.
The parallel between this statistical procedure's overall purpose and the question of predicting the choice of one construction as raised in section 2.6.2 is obvious: we try to predict which construction a speaker will choose given that he participates in a particular discourse situation or, put differently, we test whether the variables we have differentiate well between the two constructions under investigation. As a result, a discriminant analysis provides the analyst with some statistics representing the fit of the discriminant function (Wilks' Lambda and a canonical correlation coefficient) and, much more importantly, a classification function according to which a constructional choice is predicted a priori for each of the sentences. This prediction can then be validated in order to assess (i) which of the variables are most important in determining the correct classification (on the basis of factor loadings that are interpreted in the same way correlation coefficients are; cf. above), and (ii) whether the choice of a construction can indeed be predicted successfully. In other words, a discriminant analysis provides exactly the results we need for a complete and most realistic assessment of how speakers decide on a construction and enables us to predict what speakers will do in future situations; cf. also McDonald and MacWhinney (1989) for a similar approach. The following section on multifactorial results is organized as follows. First, we will very briefly review the results of applying the GLM to particle placement in order to find out how much variance of the data we can account for. Second, we will discuss a discriminant analysis that includes nearly all the previously analysed variables [n. 28] in order to determine (i) which of those variables are relevant if we look at the variables simultaneously just as the situation manifests itself for the speaker, and (ii) whether the Processing Hypothesis and the remaining line of reasoning in this study is supported or not. Then, we will discuss the GLM and the results of a second discriminant analysis where only those variables are considered that are needed according to the Processing Hypothesis. These results will then show (i) whether the present analysis was correct in ruling out several variables, and (ii) whether the predictive power is satisfactory. The section will then conclude with the results of a so-called CART analysis.
6.3.2 Results
Applying the GLM to the question of particle placement means finding out how much (in percentage) of the variance of the dependent variable (i.e. the different choices of construction) can be accounted for. In a first analysis, I included nearly all variables and their interactions up to three-way interactions. The result shows that there is a highly significant multiple correlation between all the variables and particle placement: multiple R = .79; F(94, 308) = 5.58; p<.001. Given the high intercorrelations of the independent variables, the correction for shrinkage by Wherry was applied (cf. Werner 1997: 91), yielding the following result: adjusted multiple R = .719. This multiple correlation coefficient shows that the relation between all variables and particle placement is still strong and that particle placement can be accounted for quite well with the set of variables linguistic research has come up with so far. Let us now turn to the first discriminant analysis in order to subject the Processing Hypothesis to a more rigorous test and to show how linguistic analyses can benefit from such techniques. The initial results of a discriminant analysis are statistical in nature and mainly provide some figures assessing how well the discriminant analysis fits the data. The first analysis yields the results summarized in Table 6:30.

Table 6:30 Overall results of the first discriminant analysis
Wilks' Lambda | .465
Canonical correlation | .731
Chi-square (df = 24) | 297.58
p | 0 (5.15 × 10^-49)

These values (most notably, the p-value) show that all the variables postulated so far do indeed discriminate highly significantly between the two constructions: the two constructions differ highly significantly with respect to the independent variables. This is already a result that supports at least the variables postulated in previous analyses. However, it does not suffice to simply know that there is a difference; rather, we would also like to know where the difference comes from. The discriminant analysis of the corpus data also provides a so-called classification function, which weighs each variable in such a way that the two constructions can be classified with the smallest degree of error. In other words, the classification function assigns, among other things, to each variable a factor loading representing its importance for the differentiation between the two constructions [n. 29]: variables that are important for the distinction between construction0 and construction1 receive a high positive or negative value (high means approaching 1 or -1), whereas variables that are not important receive a value close to 0. Table 6:31 provides an overview of these results, which are interpreted below.
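Before turning to the individual loadings, the overall statistics in Table 6:30 can be cross-checked against one another. The sketch below assumes (i) that with two groups Wilks' Lambda equals one minus the squared canonical correlation, (ii) that the chi-square value is Bartlett's approximation, and (iii) that the 24 degrees of freedom correspond to 24 predictors. None of this is stated in the text, and the function name is hypothetical.

```python
from math import exp, factorial, log

N, P, G = 403, 24, 2  # sentences, predictors (assumed from df = 24), groups
wilks_lambda, canonical_r, chi_square = 0.465, 0.731, 297.58  # Table 6:30

# Two groups give one discriminant function, so Lambda = 1 - Rc^2.
lambda_from_r = 1 - canonical_r ** 2

# Bartlett's chi-square approximation for Wilks' Lambda.
chi_from_lambda = -(N - 1 - (P + G) / 2) * log(wilks_lambda)


def chi2_upper_tail_even_df(x: float, df: int) -> float:
    """Exact upper-tail probability of a chi-square value for even df."""
    half = x / 2
    return exp(-half) * sum(half ** j / factorial(j) for j in range(df // 2))


p_value = chi2_upper_tail_even_df(chi_square, P)
print(round(lambda_from_r, 3), round(chi_from_lambda, 2), p_value)
```

The recomputed values differ from the tabulated ones only through the rounding of the inputs, which suggests that the three figures in the table are mutually consistent under these assumptions.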
Table 6:31 Factor loadings of the discriminant analysis for all variables
Variable | Loading | Kind of variable
Choice of construction: high variable values => construction0; low variable values => construction1
LENGTHS | -.545 | morphosyntactic
TYPE: lexical | -.496 | morphosyntactic
COMPLEX: intermediate | -.477 | morphosyntactic
LENGTHW | -.47 | morphosyntactic
IDIOMATICITY: idiomatic | -.323 | semantic
DET: indefinite | -.281 | morphosyntactic
Choice of construction: due to the low factor loadings (-.22 < x < .22) these variables do not discriminate well between the two constructions
COMPLEX: complex | -.184 | morphosyntactic
IDIOMATICITY: metaphor | -.044 | semantic
DET: definite | -.016 | morphosyntactic
DISFLUENCY | -.006 | other
PART = PREP | -.002 | other
TYPE: proper name | .021 | morphosyntactic
TYPE: semi-pronominal | .086 | morphosyntactic
ClusSC | .093 | discourse-functional (subsequent context)
NM | [loading illegible in source] | discourse-functional (subsequent context)
CohSC | .134 | discourse-functional (subsequent context)
ANIMACY | .157 | semantic
TOSM | .183 | discourse-functional (subsequent context)
Choice of construction: high variable values => construction1; low variable values => construction0
DET: no determiner | .222 | morphosyntactic
PP | .277 | other
IDIOMATICITY: literal | .308 | semantic
CONCRETE | .336 | semantic
OM | .357 | discourse-functional (subsequent context)
LM | .42 | discourse-functional (preceding context)
TOPM | .426 | discourse-functional (preceding context)
CohPC | .443 | discourse-functional (preceding context)
ActPC | .473 | discourse-functional (preceding context)
TYPE: pronominal | .494 | morphosyntactic
COMPLEX: simple | .571 | morphosyntactic
First, looking at all variables simultaneously, we can try to rank order the variable groups (morphosyntactic, semantic, discourse-functional (preceding context), discourse-functional (subsequent context), other) according to their importance for the constructional choice. For this, all loadings are sorted in the order of their absolute magnitude and assigned ranks. Then, the median ranks of the above variable groups result in the following ranking of variable groups: discourse-functional (preceding context), morphosyntactic, semantic, discourse-functional (subsequent context). Second, all the variables required by the Processing Hypothesis (apart from DISFLUENCY) show significant effects in the predicted direction, supporting the analysis in terms of processing requirements and activation; no variable included in the Processing Hypothesis shows significant effects contrary to my prediction. Third, most of the variables that are predicted by the Processing Hypothesis to be irrelevant display very low factor loadings, namely ClusSC, TOSM, CohSC and ANIMACY. This is quite an important finding: note that in section 2.6 I argued that the subsequent discourse should not affect particle placement. Then, however, in section 6.1.5 it was found that, even on closer inspection, TOSM did correlate with particle placement in a monofactorial analysis, contrary to what I had argued before. The multifactorial analysis now shows that the contribution of TOSM is fairly small, if worth mentioning at all. This again shows that one might be led to postulate a certain effect of a variable in a monofactorial analysis which turns out not to be supported by a cognitively more realistic multicausal approach. In sum, the Processing Hypothesis is overwhelmingly supported by the data: we know which variables are most important for a speaker to decide on one of the two constructions if we know what kind of discourse situation he is in and what he is talking about.
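The rank-ordering step described above (sort all loadings by absolute magnitude, assign ranks, take the median rank per variable group) can be sketched as follows. The loadings and group labels are as I read them from Table 6:31; the one variable whose loading is illegible there (NM) is left out, and the short group labels are my own.

```python
from statistics import median

# (variable, loading, group) as read from Table 6:31; NM is omitted because
# its loading is illegible in the source. Group labels are abbreviations.
LOADINGS = [
    ("LENGTHS", -.545, "morph"), ("TYPE: lexical", -.496, "morph"),
    ("COMPLEX: intermediate", -.477, "morph"), ("LENGTHW", -.47, "morph"),
    ("IDIOMATICITY: idiomatic", -.323, "sem"), ("DET: indefinite", -.281, "morph"),
    ("COMPLEX: complex", -.184, "morph"), ("IDIOMATICITY: metaphor", -.044, "sem"),
    ("DET: definite", -.016, "morph"), ("DISFLUENCY", -.006, "other"),
    ("PART = PREP", -.002, "other"), ("TYPE: proper name", .021, "morph"),
    ("TYPE: semi-pronominal", .086, "morph"), ("ClusSC", .093, "df-subs"),
    ("CohSC", .134, "df-subs"), ("ANIMACY", .157, "sem"),
    ("TOSM", .183, "df-subs"), ("DET: no determiner", .222, "morph"),
    ("PP", .277, "other"), ("IDIOMATICITY: literal", .308, "sem"),
    ("CONCRETE", .336, "sem"), ("OM", .357, "df-subs"),
    ("LM", .42, "df-prec"), ("TOPM", .426, "df-prec"),
    ("CohPC", .443, "df-prec"), ("ActPC", .473, "df-prec"),
    ("TYPE: pronominal", .494, "morph"), ("COMPLEX: simple", .571, "morph"),
]

# Rank 1 = largest absolute loading.
ranked = sorted(LOADINGS, key=lambda row: abs(row[1]), reverse=True)

ranks_by_group = {}
for rank, (_, _, group) in enumerate(ranked, start=1):
    ranks_by_group.setdefault(group, []).append(rank)

# Groups ordered from most to least important (lowest median rank first).
group_order = sorted(ranks_by_group, key=lambda g: median(ranks_by_group[g]))
print(group_order)
```

On this reading of the table the resulting order is preceding-context discourse-functional, morphosyntactic, semantic, subsequent-context discourse-functional, with the residual group of other variables last.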
We can now use these results for a classification of the sentences: if the analysis is precise and reflects the situation of the speaker much more than the monofactorial analyses criticized above (as I have already argued), then this analysis should also be able to classify each sentence from the corpus data correctly as the construction type with which it occurred. How does this work? As was explained previously, the discriminant function assigns to each independent variable a numerical weight representing its importance in the context of the other variables for the choice of construction. For each sentence, the analysis then multiplies the weight of each variable with its value (e.g. 6 for LENGTHW for a direct object consisting of six words or 1 for TYPE: lexical for a lexical direct object) and adds up these products. The final result of this is a so-called discriminant score either above or below a cut-off point (that is normally zero or very close to zero). This discriminant score represents the most probable choice of a construction by the speaker. It is essential to note here that this decision output of the analysis is not to be mistaken as a probability statement: the final decision output of the analysis is a categorical either-construction0-or-construction1 decision just as the speaker decides categorically for either of the two constructions (even though, of course, the result is arrived at on the basis of [Bayesian] posterior probabilities). That is, we can now test whether the discriminant function classifies the 403 sentences of the corpus data in accordance with the choice of the native speakers. The classification accuracy we find is 86.4 per cent, which means that by simultaneously considering all variables, we can correctly classify the choice of construction in 86.4 per cent of all cases. This classification accuracy is surprisingly high: such an accuracy in the prediction of complex human behaviour is seldom if ever achieved (cf. also section 6.4). The attentive reader might criticize this procedure by raising the objection that what we have just done is not really prediction: we have classified 403 sentences on the basis of the analysis of just the same 403 sentences so, strictly speaking, what we did is an a posteriori classification rather than a real prediction of what happens in yet unanalysed sentences. This objection is correct, so two additional tests of the predictive power of the analysis have been performed. First, I split the sample into two parts, one consisting of 350 cases, the other of 53 cases. The former sample was a learning sample to which I applied a discriminant analysis to obtain a discriminant function; the latter sample was a test sample whose sentences were to be predicted on the basis of the analysis of the learning sample. Thus, the sentences whose constructions are to be predicted are not the same ones that were analysed. In order to anticipate criticism of my possibly biased choice of the learning sample, I performed this test three times with different learning and test samples. Table 6:32 shows the composition of the samples and the results. We find that, while the results are fairly diverse and slightly worse than those above, the classificatory results are supported by those of the actual prediction.
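The mechanics of the discriminant score described above can be illustrated with a minimal sketch. The weights, the constant and all names below are hypothetical and chosen only to show the procedure; they are not the coefficients estimated in the study. Only the direction of the effects (long and lexical direct objects favouring construction0) follows the signs of the loadings in Table 6:31.

```python
# Hypothetical weights and constant, for illustration only; they are NOT
# the coefficients estimated in the study.
WEIGHTS = {"LENGTHW": -0.4, "TYPE_lexical": -0.9}
CONSTANT = 1.5
CUTOFF = 0.0


def discriminant_score(values: dict) -> float:
    """Constant plus the sum of weight * value over all variables."""
    return CONSTANT + sum(WEIGHTS[name] * values[name] for name in WEIGHTS)


def classify(values: dict) -> str:
    """Categorical decision: on which side of the cut-off the score falls."""
    if discriminant_score(values) > CUTOFF:
        return "construction1"
    return "construction0"


def accuracy(predicted: list, observed: list) -> float:
    """Share of sentences whose predicted construction equals the observed one."""
    hits = sum(p == o for p, o in zip(predicted, observed))
    return hits / len(observed)


long_lexical = {"LENGTHW": 6, "TYPE_lexical": 1}   # six-word lexical object
short_pronoun = {"LENGTHW": 1, "TYPE_lexical": 0}  # one-word non-lexical object
print(classify(long_lexical), classify(short_pronoun))
```

For orientation, a classification accuracy of 86.4 per cent over 403 sentences corresponds to 348 correctly classified sentences.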
Table 6:32 Prediction accuracies of three analyses
Learning sample | Test sample | A priori correct predictions
200 oral sentences and 150 written sentences | 53 written sentences | 73.6% ***
150 oral sentences and 200 written sentences | 53 oral sentences | 56.0% ns
174 oral sentences and 176 written sentences | 26 oral sentences and 27 written sentences | [illegible in source]

An interesting effect to note is that oral data seem to be more difficult to predict since we find that in the sample of oral data the prediction accuracy is considerably lower, which is probably due to the flexible, interactive and situation-dependent nature of discourse. Another very widely used way of testing the predictive power of models is via cross-validation, using the leave-one-out method (also called U-method). Applying this procedure to the present data means performing 403 analyses, predicting in each one the choice of construction in a single case on the basis of the remaining 402 cases. This again guarantees that no case is used for its own prediction. The result of this cross-validation for the present analysis is a prediction accuracy of 82.9 per cent. This result is for all practical purposes very similar to the one for the classification accuracy and the split-sample technique, which shows that the results are quite robust and the predictive power of all the variables together is indeed exceptionally high. The results so far suggest that applying the GLM and discriminant analysis to cases of syntactic variation answers a lot of previously unanswered questions: we now know exactly how much of the variance of particle placement can be accounted for and which variables are responsible for the different constructional choices by speakers; moreover, we can formulate reasonable predictions of the constructional decisions speakers will make subconsciously. However, strictly speaking, for the present analysis it is of course more important to see whether the Processing Hypothesis achieves similar results: so far we have only looked at all variables ever postulated. A GLM analysis including only the variables included in the Processing Hypothesis (along the lines advocated in Chapter 4) yields the following results: multiple R = .761; F(53, 349) = 9.074; p<.001; adjusted multiple R = .718. That is to say, the adjusted correlation of the variables of the Processing Hypothesis is as high as the adjusted correlation for all variables ever suggested to be important for the alternation. In other words, excluding those variables from the analysis that I have excluded on theoretical grounds leads to virtually no decrease in explained variance. This shows that, given the variables of the Processing Hypothesis, these other variables contribute nothing but random noise to the overall correlation, decreasing the overall correlation.
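The two GLM results can be compared on the same footing by recovering R squared from each F-value and applying a shrinkage correction. The sketch below assumes the common form of the correction, adjusted R squared = 1 - (1 - R squared)(n - 1)/(n - k - 1); the text only names Wherry's correction without giving the formula, and the function names are hypothetical.

```python
from math import sqrt

N = 403  # sentences in the corpus sample


def r_squared_from_f(f: float, df1: int, df2: int) -> float:
    """Invert F = (R^2 / df1) / ((1 - R^2) / df2)."""
    return f * df1 / (f * df1 + df2)


def adjusted_r(r2: float, n: int, k: int) -> float:
    """Shrinkage-corrected R, assuming 1 - (1 - R^2)(n - 1)/(n - k - 1)."""
    return sqrt(1 - (1 - r2) * (n - 1) / (n - k - 1))


# All variables: F(94, 308) = 5.58.
r2_all = r_squared_from_f(5.58, 94, 308)
adj_all = adjusted_r(r2_all, N, 94)

# Processing Hypothesis variables only: F(53, 349) = 9.074.
r2_ph = r_squared_from_f(9.074, 53, 349)
adj_ph = adjusted_r(r2_ph, N, 53)

print(round(sqrt(r2_all), 3), round(adj_all, 3))
print(round(sqrt(r2_ph), 3), round(adj_ph, 3))
```

Under this assumption the adjusted values come out at about .719 and .718, matching the reported figures: the larger model's higher unadjusted R is offset by its 41 additional predictors.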
This supports the Processing Hypothesis and the additional arguments on which the exclusion of these variables was based. Next, we need to find out whether the predictive power of the Processing Hypothesis compares to that of all variables. First consider Table 6:33, which shows initial results of a second discriminant analysis where only those variables are included that are, according to Chapter 4, related to the Processing Hypothesis. In this analysis, the values are very similar to the ones of the previous discriminant analysis: the canonical correlation and the Chi-square test are for all practical purposes identical: however, the p-value is much smaller
than before. That is, again the independent variables discriminate very well between the two constructions.

Table 6:33 Overall results of the discriminant analysis for the Processing Hypothesis

Wilks' Lambda    Canonical correlation    Chi-square (df = 19)    p
.47              .73                      295.68                  0 (1.53·10⁻⁵¹)

Now consider Table 6:34 for the individual variables. The overall picture has not changed markedly: the variables already known to be relevant are still the most influential and the ranking of variable groups has not changed; no further comments are necessary.

Table 6:34 Factor loadings of the discriminant analysis for the Processing Hypothesis

Variable                     Factor loading
COMPLEX: simple               .576
TYPE: pronominal              .499
ActPC                         .477
ConPC                         .447
TOPM                          .43
LM                            .424
CONCRETE                      .339
IDIOM: literal                .311
PP                            .279
DET: no determiner            .225
  Kind of variable: morphosyntactic; discourse-functional (preceding context)
  Choice of construction: high variable values => construction₁; low variable values => construction₀

TYPE: semi-pronominal         .087
TYPE: proper name             .021
PART = PREP                  -.002
DISFLUENCY                   -.006
DET: definite                -.017
IDIOM: metaphor              -.045
COMPLEX: complex             -.185
  Due to the low factor loadings, these variables do not discriminate well between the two constructions.

DET: indefinite              -.283
IDIOMATICITY: idiomatic      -.326
LENGTHW                      -.474
COMPLEX: intermediate        -.481
TYPE: lexical                -.501
LENGTHS                      -.55
  Kind of variable: semantic; morphosyntactic
  Choice of construction: high variable values => construction₀; low variable values => construction₁

Finally, we turn to the most rigorous and, at the same time, most interesting test, namely the classification and prediction accuracy. Table 6:35 summarizes the results. We see that nearly nothing has changed. On the basis of the Processing Hypothesis we have omitted several variables from consideration, but this has not caused any substantial loss of explanatory or predictive power: the a posteriori classification accuracy has decreased by only a negligible .5 per cent while the cross-validated prediction accuracy has even increased by 1 per cent. Moreover, the predictions of the smaller samples are, on aggregate, better than before. These differences support the Processing Hypothesis and do not force us to change the present account. Note in passing that all the multifactorial results also support my claim that the sample size of 403 sentences is not too small (cf. Chapter 5, n. 66): the results of the discriminant analyses are highly significant (cf. Tables 6:30 and 6:33) and the cross-validations on the basis of three smaller samples show that even smaller learning samples of 350 sentences yield comparable prediction results (cf. Tables 6:32 and 6:35). From this we can conclude that the Processing Hypothesis is strongly supported as it allows us to predict the subconscious decisions of speakers for a construction with more than 80 per cent accuracy.

Table 6:35 Classification/prediction accuracy of three analyses

Classification method                           Learning sample                                 Test sample                                 Correct predictions
A posteriori classification                                                                                                                 85.9%
A priori prediction                             200 oral sentences and 150 written sentences    53 written sentences                        81.1% ***
                                                150 oral sentences and 200 written sentences    53 oral sentences                           67.9% **
                                                174 oral sentences and 176 written sentences    26 oral sentences and 27 written sentences  88.7% ***
Cross-validated a priori prediction accuracy                                                                                                83.9% ***

One methodological caveat remains, however. There are scholars who might argue that one must not apply a discriminant analysis to my data since discriminant analyses require (i) a multivariate normal distribution of the variable values, and (ii) homogeneity of variances of the variables under consideration. Moreover, some statisticians might also add that discriminant analyses where categorical variables have been recoded using nominal 0/1 dummies can produce skewed results because of the resulting intercorrelations. As a consequence, these scholars might argue that the results of the discriminant analysis are not reliable and that parameter-free/distribution-free techniques such as CART (Classification and Regression Trees) should have been chosen instead. In what follows I will briefly address these concerns. First, while many researchers tend to emphasize the importance of distributional assumptions (such as normality, homogeneity of variances and the like), there are also a number of scholars who argue that, in practice, these assumptions are not as essential as they might seem on a purely mathematical basis (cf. also Winer et al. 1991: 5). Second, it has even been claimed that there is no test that reliably identifies multivariate normal distributions (cf. Bortz 1999: 435).
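The statisticians' worry about 0/1 dummies mentioned above can be made concrete with a small sketch. It recodes a categorical variable with the four TYPE levels of Table 6:34 into indicator columns and shows that two dummies derived from the same variable are necessarily negatively correlated. The toy sample and the helper names `dummy_code` and `pearson_r` are illustrative assumptions, not data or procedures from the study.

```python
from math import sqrt

# Levels of TYPE as listed in Table 6:34
LEVELS = ["pronominal", "semi-pronominal", "proper name", "lexical"]

def dummy_code(values, levels):
    """Return one 0/1 indicator column per level of a categorical variable."""
    return {level: [1 if v == level else 0 for v in values] for level in levels}

def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical toy sample of direct-object types (NOT taken from the corpus data)
sample = ["pronominal", "lexical", "lexical", "proper name",
          "semi-pronominal", "lexical", "pronominal", "lexical"]

dummies = dummy_code(sample, LEVELS)
r = pearson_r(dummies["pronominal"], dummies["lexical"])
print(f"r(pronominal, lexical) = {r:.3f}")  # negative: the two dummies exclude each other
```

Because every observation scores 1 on exactly one dummy, a high value on one column forces zeros on the others; this is the source of the intercorrelations referred to in the text.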
Third, the difference between discriminant analysis and CART is not just a statistical/mathematical one - rather, there is also a conceptual difference: while a discriminant analysis includes all variables simultaneously in the calculation to compute a prediction for one of the two constructional choices, the trees resulting from CART analyses include variables sequentially. For a native speaker, however, I believe that the model
underlying discriminant analyses is more realistic: it is intuitively more plausible to assume that all the variables' values/levels I have discussed are simultaneously available at the point in time when the speaker chooses the word order rather than that the values/levels are included one by one sequentially. Moreover, while there is still considerable debate whether psycholinguistic theories of speech production should incorporate parallel or serial models of processing, I, following Berg (1998), consider parallel processing theories more rewarding. I have decided, for these reasons, to predict native speakers' choices with a discriminant analysis, which, as opposed to CART, comes closest to predicting choices on the basis of a simultaneous/parallel inclusion of the relevant data. Nevertheless, it might still be the case that these reasons do not satisfy my critics. I have, therefore, also analysed my data using the CART module of Statistica 5.5; the algorithms used therein are based on CART by Breiman et al. (1984), where CART and QUEST algorithms are used to classify and predict data in the absence of distributional assumptions. My CART analysis of the data was based on the parameters and settings listed in Table 6:36. The result of the analysis can be summarized as follows: as to classification, out of all 403 sentences, 351 (87.1 per cent) were classified correctly while 52 (12.9 per cent) were classified incorrectly, again a result that is extremely unlikely to be obtained randomly (according to an exact binomial test). However, we must also determine the prediction accuracy by cross-validation. First, I used the split-sample technique analogous to the LDA, where I split the whole sample into a learning sample and a test sample three times; the samples and results are listed in Table 6:37. Second, on the basis of a 15-fold cross-validation, the technically most interesting statistic is the average misclassification cost, which is .207 (SD = .02).
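The logic of the 15-fold cross-validation just described can be sketched as follows. This is only a minimal illustration under stated assumptions: the fold assignment, the seed and the function names are hypothetical, and the stand-in 'classifier' simply predicts the majority class of the learning sample, so it is not the CART procedure of Statistica. The label counts (194 and 209) are the row totals of Table 6:39.

```python
import random
from statistics import mean, pstdev

def k_fold_indices(n_items, k, seed=0):
    """Split range(n_items) into k disjoint test folds of near-equal size."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(labels, k, seed=0):
    """Misclassification rate per fold for a majority-class stand-in classifier."""
    rates = []
    for test_idx in k_fold_indices(len(labels), k, seed):
        held_out = set(test_idx)
        # Learning sample: everything that is not in the current test fold
        train = [lab for i, lab in enumerate(labels) if i not in held_out]
        majority = max(set(train), key=train.count)
        errors = sum(labels[i] != majority for i in test_idx)
        rates.append(errors / len(test_idx))
    return rates

# 0 = construction 0, 1 = construction 1 (counts from the row totals of Table 6:39)
labels = [0] * 194 + [1] * 209
folds = k_fold_indices(len(labels), 15)
rates = cross_validate(labels, k=15)
print(f"mean misclassification = {mean(rates):.3f}, SD = {pstdev(rates):.3f}")
```

Each sentence is predicted exactly once, always from a learning sample that did not contain it; averaging the per-fold error rates yields a statistic of the same kind as the average misclassification cost reported above.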
As a rule of thumb, the average misclassification cost can roughly be interpreted as the percentage of misclassifications in 15 splits of the sample into learning samples and test samples, where each test sample is predicted from the respective learning sample.

Table 6:36 Parameters and settings of the CART analysis

Parameter                Setting
Method                   CART-style exhaustive search for univariate splits
Stop rule                FACT-style direct stopping, fraction of objects = .5
Prior probabilities      identical: .5 for both constructions
Goodness-of-fit index    Gini measure

Table 6:37 Cross-validated prediction accuracies of CART for split samples

Learning sample                                    Test sample                                     Correct predictions for test samples
200 spoken sentences and 150 written sentences     53 written sentences                            81.1% ***
150 spoken sentences and 200 written sentences     53 spoken sentences                             56.6% ns
174 spoken sentences and 176 written sentences     26 spoken sentences and 27 written sentences    71.7% **
Average                                                                                            70% **

We may conclude that a reasonably good result has been obtained again. Admittedly, the cross-validated prediction accuracy of CART is not uniformly as high as the LDA results, but, apart from the test sample for oral data alone, it is still far better than what might be expected by pure chance. Moreover, there is a reason for these minor differences. Given the above parameter settings, the CART technique does not utilize all variables for the prediction of a choice of construction but only the most important ones as determined by the analysis. Thus, for constructional choices where variables of an overall minor importance are decisive, false predictions are more likely. As far as the importance of the individual variables of the Processing Hypothesis is concerned, the overall picture does not differ strongly from the results of the discriminant analysis; for the sake of completeness, Figure 6:2 shows the results for the individual variables. Obviously, the overall picture has changed, but if we, as previously, investigate the ranking of the variable groups by comparing the median ranks of these groups in the discriminant analysis and in CART, then no
differences emerge. In other words, although the data analysed in this study do not necessarily meet the mathematical criteria required by discriminant analyses, this technique can still be fruitfully applied to our research questions, and distribution-free techniques yield, for all practical purposes, virtually identical results. This is not to say that discriminant analyses can always be successfully computed for such questions; one should always decide that on a case-by-case basis after subjecting one's findings to non-parametric techniques as well.

Figure 6:2 Importance of predictor variables for CART (axis label: predictor variable)

6.4 Further evaluation

To the reader not familiar with empirical investigations of this degree of complexity, the correlation coefficients and the prediction accuracies achieved might not seem as convincing as I portray them to be. However, I think this assumption would result from a serious misunderstanding, as several things have to be borne in mind. We are dealing here with behavioural data which are influenced by a large number of factors and thus, on the whole, far from being straightforwardly predictable. In this particular analysis, the variables sometimes operationalize phenomena or aspects of human cognition (such as, e.g., activation) which are very difficult to measure directly, rendering it extremely difficult to achieve an even higher degree of variance explanation: the number of potential reasons why some concept is more or less active than another one is so high that we often have to live with a certain degree of variance due to additional factors or idiosyncrasies on which no valid generalization can be based.31 What is more, in the majority of cases of particle placement, the two options (construction₀ and construction₁) are not mutually exclusive such that in each situation only one choice is theoretically possible (i.e.
both constructions are grammatical): note the fact that there is only a single nearly categorical rule governing speakers' choices, namely that pronominal direct objects require construction₁ (leaving aside the rare instances of contrastively stressed pronouns). Thus, in the vast majority of cases (namely in the about 81 per cent of non-pronominal objects), the phenomenon of particle placement theoretically leaves the speaker with both constructional options and the analyst with a situation where the choice of construction is determined by (i) the set of underlying cognitive preferences I aimed at investigating, (ii) idiosyncratic preferences, and (iii) other factors clouding the picture. In other words, the fact that in most cases the speaker can only rely on inaccessible cognitive preferences and idiosyncratic behavioural patterns rather than clear-cut grammatical rules corroborates my claim that my prediction accuracy concerning these inaccessible cognitive routines is indeed quite high.

But is there any empirical evidence supporting this justification of the (already low) error rate? Let us briefly examine the constructions that the analysis failed to predict correctly by comparing them to the correctly predicted ones. Table 6:38 shows the means and standard deviations for the interval variables for both correctly and falsely predicted choices of construction.

Table 6:38 Differences of average values for correct and false predictions

Variable      AM (SD) for false predictions    AM (SD) for correct predictions    t (Welch)    df     p
LENGTHW       2.3 (1)                          2.9 (2.7)                          3.03         210    .003
LENGTHS       3.8 (2.1)                        4.9 (5)                            2.75         179    .007
TOPM          .8 (1.2)                         1.3 (1.8)                          2.31         101    .023
ActPC         3 (4)                            4.3 (4.5)                          2.30         82     .024
ConPC         2.8 (2.8)                        3.5 (3.9)                          1.73         94     .087
DISFLUENCY    0 (.1)                           .1 (.4)                            1.68         211    .094

What these values show is that the sentences that are predicted correctly have more active direct objects (since they were mentioned more often and closer to the VPC); additionally, there is a significant tendency for correctly predicted sentences to have longer direct objects. ConPC and DISFLUENCY do not discriminate significantly between correct and wrong predictions. Likewise, if we test whether the quality of the prediction is crucially influenced by other (non-interval) variables, we find that false predictions are rare with idiomatic constructions and VPCs without determiners (especially pronominal ones) or with definite determiners. In sum, the prototypical instance of a wrongly predicted VPC has a short lexical direct object that is not very active and has most likely a literal meaning (no errors with idiomatic verb phrases and following directional PPs); some examples of such instances are given in (72).

(72) a. ... before you put the next bit of a package containing a bandage off.
     b. East German security merely wanted to take away their identity cards.
     c. You'll open up the wound again.

That is, the analysis fails to correctly predict mainly those instances of VPCs for which there is no obvious choice of construction since there are no definite rules governing these cases; given the utterance and its discourse context, even native speakers would say that both constructions are possible and hardly differ in acceptability. In these cases, i.e. where the variables' values/levels do not make (strong) predictions, the decision for either construction is entirely due to the barely falsifiable semantic focus of the utterance (cf. examples (35) and (36) and n. 72), idiosyncratic behaviour of particular speakers or some unknown variables (whichever these may be). Since I would not want to rule out the possibility of additional variables, let us briefly look at two examples.
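The comparisons in Table 6:38 rest on Welch's t-test (cf. n. 7), which can be sketched from summary statistics alone. The following is a minimal illustration, not a reproduction of the original computation: the function name is hypothetical, the group sizes (338 correct and 65 false predictions) are an assumption taken from the figures given elsewhere in this chapter, and because the tabulated means and standard deviations are rounded, the result can only approximate the published t and df.

```python
from math import sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite df from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# LENGTHW row of Table 6:38 (rounded values); group sizes are ASSUMED to be
# 338 correct and 65 false predictions.
t, df = welch_t(2.9, 2.7, 338, 2.3, 1.0, 65)
print(f"t = {t:.2f}, df = {df:.0f}")
```

Unlike the ordinary t-test, this variant does not pool the two variances, which is why it remains applicable when the homogeneity-of-variances assumption is violated.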
One variable that might be relevant is concerned with the priming of syntactic structures. In several analyses (of, e.g., transitive clauses and dative
structures), Bock (1986) has demonstrated that subjects' choices of construction can be manipulated by exposing the subjects to sentence structures prior to their decision for a particular constituent ordering. In my data I tested whether structural priming has influenced the speaker's/writer's decision for a construction by counting whether the analysed VPC was preceded by another VPC in the preceding three clauses and, if so, whether the two VPCs had the same word order or not. Consider Table 6:39. Obviously, structural priming is irrelevant in the vast majority of cases (333 out of 403, i.e. about 83 per cent), where no VPC precedes another one within a window of three clauses. If, however, the VPC under investigation is preceded by another one (70 cases), then in 53 cases (16 + 37, i.e. 75.7 per cent) the constructions are identical. The distribution of constructions in the partial table of (24 + 46 =) 70 cases where structural priming makes a prediction is highly significant (χ²(1) = 15.24; p < .001; λ = .32). That is to say, structural priming seems to be relevant, but we need to bear in mind that structural priming can be better investigated with an experimental paradigm in order to account for the many confounding variables in text production.37 At any rate, if the effect of structural priming could be supported, then it could be readily integrated into the Processing Hypothesis: if a speaker has already assembled a VPC in a particular discourse, then, all other things being equal, accessing the same construction again is less costly than accessing the other construction; however, further research is necessary to support this stipulation; cf. also below. Another variable that might be important for the constructional choice is phonology. Schlüter (2003) has shown that variation of the verb to be in Middle English is governed by a preference to guarantee ideal syllabic structure (where consonantal segments alternate with vocalic ones; cf.
Selkirk 1984; Vennemann 1988). Thus, I tested whether this preference can also be observed in the present data set (cf. also Browman 1986: 319-20). We know VPCs come in two different constructions (once again schematically represented in (73)), differing in the order of post-verbal constituents.

(73) a. [S [NP] [VP V [NP] Part]] = construction₁
     b. [S [NP] [VP V Part [NP]]] = construction₀
Table 6:39 The effect of structural priming on particle placement

                         Prediction following from structural priming
Observed in the data     Construction₀    Construction₁    No prediction    Row totals
Construction₀            16 (66.7%)       9 (19.6%)        169 (50.8%)      194 (48.1%)
Construction₁            8 (33.3%)        37 (80.4%)       164 (49.3%)      209 (51.9%)
Column totals            24               46               333              403
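The statistics reported for the partial table of 70 cases can be recomputed from the four cell frequencies. The sketch below assumes a Pearson Chi-square without continuity correction and Goodman and Kruskal's lambda for predicting the observed construction from the primed one; the function names are illustrative, and the p-value uses the fact that, for one degree of freedom, the Chi-square tail equals erfc(sqrt(x / 2)).

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson Chi-square (no continuity correction) and p-value for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = erfc(sqrt(chi2 / 2))  # upper tail of Chi-square with df = 1
    return chi2, p

def goodman_kruskal_lambda(table):
    """Lambda for predicting the row variable from the column variable."""
    n = sum(sum(row) for row in table)
    largest_row_total = max(sum(row) for row in table)
    column_maxima = sum(max(col) for col in zip(*table))
    return (column_maxima - largest_row_total) / (n - largest_row_total)

# Rows: observed construction 0 / 1; columns: primed construction 0 / 1
priming = [[16, 9],
           [8, 37]]

chi2, p = chi_square_2x2(priming)
lam = goodman_kruskal_lambda(priming)
print(f"chi2(1) = {chi2:.2f}, lambda = {lam:.2f}, p < .001: {p < 0.001}")
```

Lambda expresses the proportional reduction of prediction errors: without the priming information one would always guess the more frequent construction and be wrong in 25 of the 70 cases, with it in only 17.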
In order to assess the influence of ideal syllable structure, one needs to test how the choice of construction affects the transition of segments. Since the different orderings of post-verbal elements do not alter the constituents themselves, we only need to investigate the three segment transitions that each constructional choice produces between the constituents that may be placed in alternative ways. These transitions are accordingly marked in (74).

(74) a. John picked |₁ up |₂ the book |₃ from the floor. = construction₀
     b. John picked |₁ the book |₂ up |₃ from the floor. = construction₁
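The three transitions marked in (74) can be read off mechanically once the first and last segment of each constituent has been classified as consonantal (C) or vocalic (V). The following sketch does this for the two orderings in (74); the hand-coded segment classes and all names are illustrative assumptions rather than the coding scheme actually used for the corpus data.

```python
# Segment class (C = consonant, V = vowel) of the first and last segment of
# each constituent in (74), hand-coded from the pronunciation (an assumption).
EDGES = {
    "picked":   ("C", "C"),  # ends in [t]
    "up":       ("V", "C"),  # begins with a vowel, ends in [p]
    "the book": ("C", "C"),  # begins with a fricative, ends in [k]
    "from":     ("C", "C"),  # only the initial [f] matters here
}

def transitions(constituents):
    """Return the segment transitions between adjacent constituents."""
    return [EDGES[left][1] + EDGES[right][0]
            for left, right in zip(constituents, constituents[1:])]

def cc_count(constituents):
    """Number of non-optimal consonant-consonant transitions."""
    return transitions(constituents).count("CC")

construction_0 = ["picked", "up", "the book", "from"]  # (74a)
construction_1 = ["picked", "the book", "up", "from"]  # (74b)

t0 = transitions(construction_0)  # ['CV', 'CC', 'CC']
t1 = transitions(construction_1)  # ['CC', 'CV', 'CC']
print(t0, t1, cc_count(construction_0) == cc_count(construction_1))
```

Comparing `cc_count` for the chosen and the not-chosen ordering of every spoken example gives exactly the three-way classification ('>', '=', '<') needed for a tabulation of chosen versus non-chosen constructions.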
In (74a), the transitions one, two and three can be characterized as CV ([pɪktʌp]), CC ([ʌpðə]) and CC ([bʊkfrəm]) respectively; in (74b), on the other hand, the transitions one, two and three are CC, CV and CC respectively. In other words, when we focus on segment transitions only, no word order in (74) is superior to the other as both feature two non-optimal transitions and only one optimal CV transition.

Let us now turn to the authentic data analysed so far. Assuming that this variable is, if at all, more likely to be influential in oral language, I have investigated all 200 examples of the spoken data and compared the segment transitions resulting from the actual choice of the speaker to the segment transitions resulting from the other construction. If there is indeed a tendency to avoid CC transitions, then the constructions chosen by the speakers should exhibit significantly fewer CC transitions than those not chosen by the speaker. However, the data show absolutely no tendency in this direction, as is shown in Table 6:40, where '<' and '>' compare the numbers of CC transitions in the chosen and not-chosen construction. In most cases (89 per cent), both word orders result in identical segment transitions. What is more, the number of cases where the construction produced by the speaker has a higher number of optimal transitions ('chosen > not chosen') is balanced by an equal number of cases where we find the exact opposite ('chosen < not chosen'). In other words, in the present data, segment transitions do not play a role whatsoever.

Table 6:40 Comparison of chosen vs. non-chosen constructions (segment transitions)

Chosen > not chosen    Chosen = not chosen    Chosen < not chosen    Row totals
11 (5.5%)              178 (89%)              11 (5.5%)              200

In sum, while I am the first to admit that other variables may very well play a role in determining speakers' choices (especially since this is the first comprehensive multifactorial analysis of particle placement), it is obvious that such a stipulation needs to be supported by empirical data. For one variable (structural priming), such evidence was provided; for another candidate (segment transitions), no such effect could be found. Therefore, given both the exceptional prediction accuracy as well as the fact that my hypothesis accounts for all relevant variables, the burden of proof lies with those researchers postulating the effect of such additional variables and, more importantly, they would also have to develop a theory that can incorporate all variables found to be affecting particle placement. As long as such empirical evidence, let alone a more comprehensive theory, does not exist, I take it that my approach is well supported.

Finally, a sceptic might argue that 83.9 per cent is not a very convincing result. He might argue that while we have seen previously that 338 correct predictions out of 403 trials is very unlikely to be obtained by chance, even an uninformed linguist would, on average, already achieve a prediction accuracy of 59.55 per cent: he knows that speakers choose construction₁ in almost all cases of pronominal direct objects (77 cases out of 403) and, from the remaining 326 cases, he could simply guess 50 per cent of the cases correctly (since there are two possible constructions to choose from), i.e. 163 instances. Thus, the average uninformed analyst would, on the basis of a single rule (pronouns require construction₁) and pure chance, already predict 77 + 163 = 240 cases out of 403 correctly, the 59.55 per cent just mentioned. Therefore, it might seem as if 83.9 per cent is not really a result worth mentioning. However, this conclusion is faulty. As was just said, the competing uninformed linguist has to choose a construction in 403 - 77 = 326 cases (omitting the virtually obligatory cases of construction₁ with pronominal direct objects).
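The sceptic's baseline, and the exact binomial test that the argument below applies to the cases with a genuine choice, can be retraced in a few lines. This is a minimal sketch using only the counts given in the text (403 sentences, 77 pronominal objects, 338 correct predictions); the constant and function names are illustrative, and a one-tailed test against a chance level of .5 is assumed.

```python
from math import comb

TOTAL = 403          # analysed sentences
PRONOMINAL = 77      # pronominal direct objects (construction 1 virtually obligatory)
MODEL_CORRECT = 338  # cross-validated correct predictions

def binomial_upper_tail(successes, trials, p=0.5):
    """Exact P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(comb(trials, k) * p ** k * (1 - p) ** (trials - k)
               for k in range(successes, trials + 1))

free_choice = TOTAL - PRONOMINAL                 # 326 cases with a real choice
baseline_correct = PRONOMINAL + free_choice / 2  # 77 + 163 = 240
baseline_accuracy = baseline_correct / TOTAL     # about .5955

model_free_correct = MODEL_CORRECT - PRONOMINAL  # 261 correct where there was a choice
p_value = binomial_upper_tail(model_free_correct, free_choice)

print(f"baseline: {baseline_accuracy:.2%}, model: {MODEL_CORRECT / TOTAL:.1%}")
print(f"P(at least {model_free_correct} of {free_choice} by chance) = {p_value:.2e}")
```

Separating the 77 rule-governed cases from the 326 free choices is the design decision that matters here: it prevents both the baseline and the model from being credited with predictions that require no knowledge beyond the pronoun rule.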
My analysis predicted the choice of construction correctly in 338 cases, but these 338 cases include the 77 cases with pronominal objects where there is no real choice, so they need to be left out in the evaluation of my analysis since it is no great achievement of mine to predict these cases. That is, my analysis made 338 - 77 = 261 correct predictions for a construction where there was a choice. Therefore, we have to find out how high the probability is that uninformed analysts can correctly guess the choice of construction 261 times or more in the 326 cases where there is a choice (261 + 65) just by chance. According to the exact binomial test, this probability approaches zero (2.86·10⁻²⁹). This line of reasoning is summarized in Figure 6:3. It is still conceivable that linguists can achieve prediction rates that go beyond 59.55 per cent on the basis of their knowledge of the literature prior to my study; however, we have seen that they would then face the difficulty of not knowing how important each variable is, which in turn has only led to concessions of a 'truly complicated array of facts' rather than more sophisticated analytical techniques. Likewise, it might be possible that native speakers could also achieve good prediction rates when they try to predict which construction another native speaker will choose; even then we would only know which rate speakers can achieve, but we would not know how they do it as they cannot tell us in any scientifically valid way. To sum up, I am convinced that neither native speakers nor informed
linguists would achieve such a high degree of accuracy in predicting other native speakers' decisions on the basis of the results of previous studies since the only virtually obligatory rule cannot guide speakers' or analysts' decisions in the majority of cases. Given these particular characteristics of particle placement (the non-exclusiveness of the two constructions in most cases and the large number of variables that needs to be considered) and comparable results from other behavioural disciplines, I am convinced that 84 per cent prediction accuracy is an exceptional result.

Figure 6:3 Distribution of construction predictions relative to kinds of direct objects
[correctly predicted (338) = pronouns, no choice (77) + correctly predicted utterances where there was a choice (261); falsely predicted (65) = falsely predicted where there was a choice (65)]

In this connection, let me briefly speculate about the relevance of a variable that could not be investigated with the corpus data used here, namely the (mostly contrastive) stress on the direct object (STRESS). On the basis of the previous results we can estimate fairly well what findings we would have made. STRESS is not indicated in written data and probably not often found in oral data (just like DISFLUENCY). Thus, we would expect a fairly low correlation coefficient (in the direction unanimously mentioned in the literature) for the whole of the data and a slightly higher one for the subset of oral data. When it comes to pair-wise oppositions, however, we would probably find that STRESS is the strongest variable of all since (as was argued in the literature) it can at times even overrule TYPE: pronominal. Given the expected limited frequency of (contrastively) stressed expressions in the corpus data, however, STRESS would not receive a high factor loading, but would probably increase the prediction accuracy at least slightly.

Two final problems need to be addressed. First, one question unfortunately remains unanswered. While the Processing Hypothesis receives unanimous support, it was deliberately formulated in such a way as to include the processing effort of both speaker and listener.
Now it would be desirable to find out which of these two perspectives on processing is more relevant for particle placement. This, however, is not possible here since nearly all the variables we have shown to be relevant can exert their influence on both the speaker's processing cost and the (speaker's assumptions about the) hearer's processing cost. Consider, e.g., discourse-functional determinants of processing:
• postponing new objects to sentence-final position facilitates processing for the hearer since the newest information is given only after all other information has been provided;
• postponing new objects to sentence-final position facilitates processing for the speaker since the speaker affords himself more time to plan and produce cognitively demanding parts of the utterance (cf. Wasow 1997a, b, as well as Arnold et al. 2000: 33 and the references cited therein).

The same holds for morphosyntactic variables:

• postponing long and complex objects to sentence-final position facilitates processing for the hearer since the most difficult structural constituents are expressed only after all other information relevant for the parse has been provided;
• postponing long and complex objects to sentence-final position facilitates processing for the speaker since the speaker affords himself more time to plan and produce the most demanding structural parts of the utterance (cf. Arnold et al. 2000: 32).

Thus, on the basis of my own analysis and the experimental data discussed in Arnold et al., it seems impossible at present to decide whether processing is more important by virtue of its cost for the speaker or for the hearer: very often, both perspectives make identical distributional predictions. It is interesting to note, however, that although my study has not investigated processing in real time experimentally, the methodology nevertheless yields very similar results. In other words, provided that the methodology is sophisticated enough, corpus analyses allow for interesting insights into linguistic processing as well.

Second, let me also anticipate another objection. The statistical techniques used so far make it difficult to choose among competing hypotheses that make identical claims as to the strength and direction of the relationship between the independent variables and particle placement.
Yet, this valid objection is much less of a problem than it might look at first: it holds for nearly every analysis using inferential statistics. Whenever a researcher has some hypothesis (H₁), analyses some data using inferential statistics and obtains a result supporting his H₁, it is, with hindsight, possible to argue that the correct hypothesis is not the researcher's H₁ but some other hypothesis that is equally compatible with the data. That is, if we take this to the extreme, it would not be possible to ever accept some H₁ as there are, at least theoretically, always competing alternative hypotheses that could be responsible for the outcome of the investigation and cannot be ruled out because they make fully identical predictions. Strictly speaking, one might even say that this dilemma is inbuilt in the paradigm of falsificatory testing: since we only attempt to falsify H₀, the logical counterpart of our own hypothesis, a result conforming to our hypothesis only rules out H₀ but does not pick out one among many particular alternative hypotheses making identical predictions. Nevertheless, it is common practice to accept one's own hypothesis if the corresponding null hypothesis can be rejected on the basis of a significant result, and in this study, I simply follow this generally acknowledged practice.

Apart from this general comment, we should also briefly return to the advantages of the present analysis. Until now, only a handful of studies have attempted to unify the distinct findings concerning particle placement; the majority of analyses did not even recognize various variables already discussed in traditional grammarians' works. The Processing Hypothesis readily incorporates all the relevant variables and excludes all the irrelevant ones. Given the fact that in more than 100 years of research not a single analysis has managed anything even remotely similar (as I have shown previously, there was not even a single descriptively adequate analysis) or has been subjected similarly successfully to prediction tests with hundreds of examples, I would not hesitate to place the burden of proof on the linguist who, after having seen my results, postulates the existence of another hypothesis that has the same or even better descriptive, explanatory and predictive power; Chapter 8, however, will elaborate on such an alternative approach. The following chapter will discuss some general implications of the present study, i.e. consequences that go beyond particle placement proper and are concerned with how linguistic categories are conceptualized, what kind of explanation we seek for our findings and how these results relate to other research on linguistic structure.

Notes

1 Note that, in order to test conservatively, all significance tests are two-tailed even if previous hypotheses/results would also have allowed for one-tailed tests.
2 Here it becomes obvious why the labels construction₀ and construction₁ are necessary even if continuous construction and discontinuous construction are more mnemonic (cf. n.
5): without such an at first glance counter-intuitive labelling, the interpretation of all the correlation coefficients would not be possible without further explanations.
3 At this point, it is worth addressing some objections that are likely to be raised. From time to time, methodological criticism has been levelled at quantitative statistical analyses of linguistic data. In a more recent study, Givón (1992a) has addressed potential problems of using Chi-square analyses and the correlations deriving from them. Using data from text-distributional analyses of word order in several languages, he has shown that interpreting contingency tables and their Chi-square values can be complicated or distorted by the fact that the 'Chi-square test for correlations is blind to skewed directionality' (Givón 1992a: 308); this potential flaw has been briefly mentioned in this study, too, cf. section 2.6.2. While this is in principle correct, it is not a problem underlying statistical analyses in general or Chi-square analyses in particular; rather, a careful analysis can easily account for skewed directionality by (i) testing the relation found for significance using the so-called contributions to Chi-square (cf. Zöfel 1992: 187) or by doing a configural frequency analysis (cf. Bortz et al. 1990: 155ff.) and (ii) observing the relation between
observed and expected values as will be done later. In this respect, therefore, Givón's criticism of the Chi-square test and the resulting correlation coefficients does not apply to the computations and their subsequent interpretations in the study at hand. In this respect, it should be noted that, although Givón argues for the use of 'some reasonable statistics' (1992a: 317; a line of reasoning to which I would also subscribe), his own paper nevertheless fails to utilize available statistical techniques economically. His claim that the skewed correlation between OV word order and the function-class definite in Mandarin is mediated by Y-movement could have been tested more easily, more economically and in a fairly foolproof way using the technique of partial correlation introduced above (cf. Chapter 2, n. 25). In this sense, Givón's evaluation suffers from an admittedly only minor methodological lapse similar to the ones he himself criticizes in the course of his paper.
4 The values given here as expected frequencies are rounded integers. However, for reasons of accuracy, the calculation of Chi-square as well as the contributions to Chi-square are based on the exact non-rounded values. This holds for all the cross-tabulation tables to be discussed.
5 There are various coefficients of correlation for contingency tables of nominal variables that are mentioned in the literature; the ones most frequently found are probably φ, Cramér's index (φ's generalization to k × m tables) and the contingency coefficient C. Unfortunately, they are all problematic to some extent. For example, it is not always possible to compare the distributions in two tables using these coefficients of correlation since the range of values that these coefficients can take on differs across tables.
While Cramer's index would be well suited for comparisons to Pearson's r (both can be derived from the General Linear Model), I have nevertheless chosen to use the correlation coefficient A.. While it is much less frequently found than other coefficients of correlation, it suits our purposes ideally since it assesses the percentage our prediction accuracy increases when the independent variable is known rather than just quantifying something as abstract as shared variance. The notion of interaction is defined as an effect that goes beyond the main effects (i.e. those of single variables) and can only be explained with reference to a particular combination of levels of several variables (cf. Bortz and Doring 1995: 617; Bortz 1999: 285). For this study this means to test whether, e.g., some variables are influential only in oral language or in written language. Since both Lcvene's test and Brown and Forsythe's test were statistically highly significant, the homogeneity of variances assumption of the commonly used t test was violated, which is why Welch's t-tes.1 was computed here and in the remainder of this study. However, in the interest of readability I decided to limit the number of statistics in the following text by omitting the results of the / tests since they do not differ from the result of the correlational analysis. The r2 value is the most relevant one here because it denotes the percentage of common variance of the two variables. Put diffcrcndy, r" denotes the proportional reduction of error in predicting the value of the dependent variable when we know the independent variable's value. It is, thus, comparable to A,; cf. above. In what follows, thus, only r2 will be given.
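The reading of λ as a proportional reduction of prediction error can be made concrete with a short sketch. The function below is an illustration, not the computation used in the study: it implements the standard asymmetric Goodman-Kruskal λ, the function name is hypothetical, and the first table collapses TYPE into semi-pronominal vs. all other direct objects using the counts quoted in note 10 (the cell 191 is derived here as 390 - 199). The second table is invented purely for contrast.

```python
def goodman_kruskal_lambda(table):
    """Asymmetric lambda: how much knowing the row variable reduces
    errors in predicting the column variable.

    `table` is a list of rows; each row holds the counts of the
    dependent variable's levels for one level of the independent one.
    """
    n = sum(sum(row) for row in table)
    column_totals = [sum(column) for column in zip(*table)]
    # Errors made when always guessing the overall modal category.
    errors_without = n - max(column_totals)
    # Errors made when guessing the modal category within each row.
    errors_with = n - sum(max(row) for row in table)
    if errors_without == 0:
        return 0.0  # nothing left to improve on
    return (errors_without - errors_with) / errors_without


# Columns: (construction0, construction1).
# Row 1: semi-pronominal objects; row 2: all other objects (note 10).
semi_pronominal = [[3, 10], [191, 199]]
print(goodman_kruskal_lambda(semi_pronominal))  # 0.0

# Invented counts in which the modal construction differs by row.
hypothetical = [[40, 10], [15, 35]]
print(round(goodman_kruskal_lambda(hypothetical), 3))  # 0.444
```

In the first table both rows share the same modal category, so knowing the row never changes the guess and λ is 0 even though the row proportions differ; this is the property the notes below appeal to.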
9 For this cross-tabulation and all following ones of interval/ratio variables, no column percentages will be displayed for expository reasons. A further comment is necessary concerning the use of the notation 8+ in this table and following ones (the notation means 'eight words and longer'). It is of course possible to use the formation of such groups in these tables quite misleadingly: in Chen's (1986) study, e.g., the empirical results for individual values are often summarized using such groups of values. For instance, in the presentation of the results for length Chen divides the data 'into five groups according to the number of their syllables' (1986: 86), namely 1-2, 3-5, 6-10, 11-15, 16+. No reason is given why this was done or why this classification (five groups with the group sizes as just given) was used rather than other equally possible ones (e.g., 1-3, 4-6, 7-9, 10-15, 16+). Even more noteworthy is his discussion of distance to last mention for short objects, where an even more extreme classification has been chosen (1-2, 3-5, 6-20 [!], 21+). It is therefore quite possible that Chen's arbitrary classification procedures mask results by accident, especially since, as we will see later, results can be extremely complex at times so that one must be careful not to subsume many different results under a single group. In this study we will consider such larger groupings of values for expository reasons only (otherwise, the resulting table would be too large), and, what is more important, groups will be formed only if all the individual values constituting the group behave identically (i.e. have a preference for the same construction). Thus, no information is accidentally omitted.
10 These results might appear strange since, e.g., semi-pronominal nouns intuitively seem to have a strong preference for construction₁, 10 out of 13 occurring in construction₁.
However, the contingency table shows that the knowledge of whether the direct object is semi-pronominal or not does not improve our prediction of the construction the speaker will choose for this direct object: we would always predict construction₁, which occurs in 10 out of 13 and 199 (77 + 115 + 7) out of 390 cases respectively. Thus, this variable cannot improve our prediction accuracy. Moreover, if the distribution 3 vs. 10 is subjected to a binomial test with 48.14 per cent (the row total) as probability of success, then the result turns out to be insignificant, too: p = .061. These remarks equally pertain to proper names as direct objects.
11 At first glance, this correlation coefficient for pronominal direct objects might also seem suspicious: on the one hand, pronouns exclusively occur in construction₁, so the variable is extremely powerful; on the other hand, however, the coefficient for pronominal objects is lower than the one of, say, NPs of intermediate complexity, which do not have such a unanimous effect on the choice of construction. This paradox is due to mathematical reasons, and we will return to some more interesting implications later.
12 This might be indicative of the fact that the database investigated here is indeed (i) fully representative and (ii) sufficiently large as the findings by Biber et al. (1999) are based on the evaluation of the Longman Spoken and Written English Corpus (LSWE) with about 40 million words.
13 This result is in a way quite astounding since correlations between definite determiners and givenness (i.e. little processing cost) have been found in a large number of studies. Therefore, one might also have expected to find a correlation between definite determiners (indicating the givenness of the direct object's referent) and particle placement. Thus, I wanted to find out what is responsible for this finding.
To my surprise, I found that definite determiners occurred more often in cases where the referent of the direct object had not been mentioned before (98 cases) than in cases where the referent had been mentioned before (77 cases). The average length of these 98 discourse-new definite direct objects (5.7 syllables; SD: 4.6), however, was significantly greater than that of the 77 discourse-familiar definite direct objects (3.8 syllables; SD: 2.7) and nearly significantly greater than that of all 175 definite direct objects (4.9 syllables; SD: 4.0). From this it follows that speakers do indeed use definite determiners for discourse-new referents, but if they do, then they provide additional information about them with extra modification. For instance, the sentence I was running into the man who came out of the grocery store is perfectly acceptable even if the man has never been mentioned before. An analogous sentence in the corpus data is The commission has also thrown out the idea, put forward by some industrialists and businessmen, that special provision is needed for computer fraud: the referent of the direct object has not been mentioned before, but is so extensively modified that the use of the definite determiner is still licensed. In other words, the fact that definite determiners do not correlate with particle placement is simply due to the fact that the correlation between definiteness and givenness is, in my corpus data, not that strong.
14 Another technique, which is less advanced but easier to understand, yields the same results: if we take out this influence of pronouns by looking only at the correlation between non-pronominal direct objects with and without determiners and the choice of construction, then no determiner again has no further influence on particle placement: λ = 0; p = 1.
15 Note that LM is identical to the independent variable in Peters (1999, 2001).
For Peters's analysis, however, I have shown that her data, albeit significantly deviating from a chance distribution, do not allow a prediction of the choice of construction (λ = 0). The data in Table 6:11, on the other hand, do allow us to predict constructional choices since no construction is uniformly preferred across all conditions.
16 Recall that this variable was recoded to avoid missing data; cf. Chapter 5, n. 12. Thus, large/small values indicate small/long distances to the last mention respectively, and 0 stands for no prior mention.
17 The standardized betas for TOPM and OM are .4 (p<.001) and .03 (p = .69) respectively; TOSM is eliminated. OM's partial correlation with the choice of construction shrinks to .02 (p = .699). Interestingly enough, given that the previous discussion of OM has shown that OM could in principle be indicative of both givenness and importance (cf. above), the result of the multiple regression and the fact that large values of OM prefer construction₁, one could suggest either that the importance component of OM seems to be less dominant than its givenness component or that the importance of the direct object's referent is less relevant to the alternation than its givenness; however, on the basis of these data, no final conclusion can be reached.
18 I should note in passing that, since PP does not causally influence particle placement (cf. section 4.5), I also attempted to relate the effect of PP to some variable(s) that is/are causally related to particle placement, namely CONCRETE, LENGTHW and IDIOMATICITY. The following results were obtained:
• there is an insignificant tendency of PPs to preferably occur after concrete NPs (60 per cent), so this relation is probably not very strong;
• if PPs form a PartP with the particle, then the PartP constituent following the DO is longer than the NP. If this difference in length had any effect along the lines of, say, EIC, then we would expect that construction₁ with a following PP might be licensed with longer noun phrases than construction₁ without following PPs. There is in fact such a tendency in the data, but it fails to reach significance, too. On the basis of the present data, it is, therefore, unclear whether there is something to this proposal or not;
• we have seen previously that literalness/lack of idiomaticity correlates positively with the utterance referring to either literal movement or situations where the speaker construes the situation metaphorically as involving movement (cf. section 4.2). If we look at all utterances with PPs, then we find that all but 1 (98.6 per cent) are non-idiomatic, which shows that the absence of literal (caused) or metaphorical motion seems to rule out PPs. It is, thus, this variable that is the most plausible candidate for explaining the relation between PP and particle placement.
19 Recall that LENGTHS can in fact also be considered a phonological variable (cf. Chapter 2, n. 6).
20 Cf. Siewierska (1993) for a very similar, though less comprehensive, way of analysis.
21 Although, of course, all variables are at work at the same time (cf. section 2.6.2), we will for the moment confine our analysis only to two simultaneously conflicting variables since:
• the number of contrastive comparisons grows exponentially with the number of variables included, rendering this analysis extremely time-consuming;
• the number of examples needed in order to still get significant results also grows exponentially so that even several thousand examples analysed with respect to all the variables might not be sufficient; thus, restricting our attention to two conflicting values/levels is a means of using this analysis economically.
We will discuss more complex cases in section 6.3, where the multifactorial procedures put to use will analyse situations where all variables are included simultaneously.
22 Note that this cut-off point renders TOPM identical to LM for this part of the analysis.
23 It is at least theoretically possible that pronominal direct objects are used with construction₀, but this is only possible (and very infrequently so) if the pronominal object is contrastively stressed (cf., e.g., Bolinger 1971: 177 and Yeagle 1983: 8-9).
24 For instance, in the course of the analysis, TYPE: pronoun is contrasted with AcTPC: low. Since this is a very infrequent combination (quite naturally so: one would not expect to find pronouns referring to something hitherto unmentioned), only five instances of construction₁ were found, i.e. a categorical distribution underscoring the importance of TYPE: pronoun. However, a distribution of 0 vs. 5 is not significant (p = .0625, given prior probabilities of .5). Thus, although pronouns categorically win out, this distribution cannot be counted as an instance where pronouns win out significantly. The same problem arises with the correlation coefficient for TYPE: pronoun (λ = .32), which, for purely mathematical reasons, also does not represent this
categoricality of construction₁ with pronouns (cf. n. 12). This is one of the rare instances where the use of significance values can be inconsiderate or overestimated; this shortcoming will be dealt with shortly.
25 This index (ranging from +1 to -1) was calculated as follows: (N(construction₁) - N(construction₀)) / (N(construction₀) + N(construction₁)).
26 The p-value is the two-tailed probability that the distribution of the two constructions for a given level is due to chance, as computed with the exact binomial test.
27 Depending on the exact nature of the impact of PP on particle placement, these overall figures might change slightly: if one could show (possibly experimentally, if at all) that PP contributed to particle placement due to its morphosyntactic contribution to the length of the particle, then the average strength of the morphosyntactic variables would be .279; if PP contributed to particle placement because of its semantic contribution (increasing the likelihood of a literal meaning of the verb phrase), then the average strength of the semantic variables would rise to .067.
28 The variables not included in the analysis are habitual sense of the verb phrase, stress pattern of the verb phrase, phonetic shape of the verb phrase and semantic modification of the particle; cf. Chapter 5, n. 13.
29 Sometimes, it is argued that a discriminant function should be interpreted on the basis of the so-called standardized discriminant function coefficients. In this analysis, I follow, among others, Bortz (1999: 588, 595-6) and rely on the factor loadings instead.
30 For understanding the results in this column, recall from section 5.2 that nominal variables are coded using 0 and 1. For instance, if TYPE: lexical (having a negative factor loading) has a high value (i.e. 1), representing that the direct object is a full lexical noun phrase, then construction₀ is preferred; if it has a low value (i.e. 0), representing a non-lexical direct object noun phrase, then construction₁ is preferred.
31 I have performed one additional analysis including the variable FREQUENCY in order to test for its multifactorial relevance (this variable is not included in Table 6:31 since FREQUENCY was never postulated to have an effect, but cf. Chapter 2, n. 26). However, the factor loading of FREQUENCY (-.14) did not reach my threshold value for relevant factor loadings of ±.22.
32 The asterisks indicate the level of significance for the obtained prediction accuracies as determined by an exact binomial test.
33 For the sake of completeness, I note that I have also tested the Processing Hypothesis in each register separately. However, the results equally support the present account: leaving aside the (highly significant) initial results of these two additional discriminant analyses, the cross-validated prediction accuracy (which is the figure most relevant to us) for oral data is 79.5 per cent; the corresponding accuracy for written data is 85.2 per cent. Moreover, no critical differences between the two registers are found with respect to the factor loadings (i.e. the strengths) of the variables; the only minor difference worth mentioning is that, in written text, the functional factors are stronger than in oral discourse.
34 In fact, in any particular discourse, a concept X that has never been mentioned before might be active (due to spreading activation) in one interlocutor's mind just because the facial expression of a hearer reminds the speaker of some other person who has recently talked about X. Such facts can hardly be predicted.
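The index of note 25 and the exact binomial test of note 26 can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the procedure actually run for the study: all function names are hypothetical, and the two-tailed p-value is computed by summing the probabilities of all outcomes that are no more likely than the observed one, which is one common convention (the notes do not say which convention was used). The example values are the 0 vs. 5 distribution of note 24 and the 3 vs. 10 distribution of note 10.

```python
from math import comb


def preference_index(n_construction1, n_construction0):
    """Index from note 25: +1 means only construction1, -1 only construction0."""
    total = n_construction0 + n_construction1
    return (n_construction1 - n_construction0) / total


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_binomial_two_tailed(k, n, p=0.5):
    """Sum the probabilities of all outcomes no more likely than k.

    The tiny relative tolerance guards against floating-point ties.
    """
    observed = binomial_pmf(k, n, p)
    total = sum(
        binomial_pmf(i, n, p)
        for i in range(n + 1)
        if binomial_pmf(i, n, p) <= observed * (1 + 1e-9)
    )
    return min(1.0, total)


def exact_binomial_lower_tail(k, n, p):
    """One-tailed probability of observing k or fewer successes."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


# Note 24: 0 vs. 5 with prior probabilities of .5.
print(preference_index(5, 0))            # 1.0
print(exact_binomial_two_tailed(0, 5))   # 0.0625

# Note 10: 3 vs. 10 with .4814 as probability of success; the lower
# tail comes out at roughly .061, the value reported there.
print(exact_binomial_lower_tail(3, 13, 0.4814))
```

The first example shows why a categorical distribution need not be significant: with only five observations, even the most extreme outcome has a two-tailed probability above .05.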
35 I should note that the wrongly predicted utterances are virtually identically distributed across the two constructions.
36 Note that there is empirical evidence for this claim: the average characteristics of the wrongly predicted utterances (LENGTHS/LENGTHW: short, AcTPC: low, TOPM: low; IDIOMATICITY: metaphorical, TYPE: lexical) are exactly those that were shown to neither win nor lose very often (in which case their effect would have been predictable).
37 For instance, the correlation between structural priming and the choice of construction could result from structural priming as such, but also from the fact that the direct object of the VPC under investigation is identical to the one of the first construction so that the observed effect might in fact reduce to the givenness of the direct object. In other words, we need to partial out the influence of identical (and, thus, given) direct objects on the correlation between structural priming and the choice of construction. Thus, consider the following table.
The interaction of structural priming and object identity
                          Different direct objects   Identical direct objects   Row totals
Identical constructions   27 (73%)                   26 (78.8%)                 53 (75.7%)
Different constructions   10 (27%)                   7 (21.2%)                  17 (24.3%)
Column totals             37 (100%)                  33 (100%)                  70 (100%)
Obviously, the influence of structural priming on the choice of construction cannot be reduced to givenness: choosing a given construction again (that is, to have identical constructions in both sentences) is strongly preferred irrespective of whether the direct object is also identical in both sentences. The influence of structural priming is, thus, neither artefactual nor limited to experimental settings and should be considered in future investigations of syntactic variation (see recent experimental work by Goldberg and her collaborators).
38 One reason for this generally acknowledged practice is concerned with the difference between a priori and a posteriori hypotheses. It is always more valuable to support empirically an a priori hypothesis than to formulate an a posteriori hypothesis on the basis of some existing data (cf. Bortz 1999: 1-2). The Processing Hypothesis is based on a former study (Gries 1999), but the general conceptual content of the present study, the set of variables hypothesized and investigated therein and specific findings differ considerably (cf., e.g., the role of entrenchment in Gries 1999 and in this work). Thus, supporting the a priori formulated Processing Hypothesis is more valuable than the formulation of a different a posteriori hypothesis on the basis of my results.
7
General discussion
We now have shed light on many aspects of particle placement. We have seen that the present analysis seems to be descriptively much more adequate than previous analyses and, at the same time, offers an explanation of the patterns found. Even the most rigorous test of the theory at hand, namely prediction of unknown cases on the basis of the simultaneous consideration of about 20 independent variables, fully supported the Processing Hypothesis.1 However, the methodology employed here has even more to offer than just the previously discussed results concerning particle placement. In this chapter, we will discuss some more general implications of my results: section 7.1 deals with the question of prototypical instances (of the two constructions), a question that is addressed in many cognitive-linguistic analyses of linguistic phenomena. Section 7.2 briefly addresses the question of the status of the linguistic rules following my results and their role in linguistic theories in general. Finally, section 7.3 is devoted to the discussion of how this study relates to other analyses of syntactic variation; a special focus will be on showing in what respect this analysis is superior to another analysis of particle placement in terms of processing cost, namely Hawkins (1991, 1994).
7.1 Prototypes
7.1.1 Two prototypes for the two constructions
We have seen that, on the basis of some very complex calculations, one part of the output of a discriminant analysis is a prediction of one construction for every conceivable discourse situation. In this connection, it was previously emphasized that this prediction, although ultimately based on probabilistic techniques, is a categorical either/or decision. While this is exactly what we need for predicting constructional choices (since speakers also decide categorically), it also implies that all instances of a construction belong equally to a clear-cut category constituted by the instances of this construction.
However, this contrasts with some of the assumptions underlying this study as it has been convincingly demonstrated many times that human beings do not necessarily conceptualize categories as clear-cut and
as being made up of exemplars equally representative for these categories. As evidence for these two claims, consider, e.g., that the use of hedging expressions such as 'strictly speaking' and 'as such' presupposes fuzzy boundaries of categories (cf. Lakoff 1972: 184 and Taylor 1995: 80) and that subjects' reactions to categories where all members do have equal status (e.g. the category of odd numbers) yield prototype effects (cf. Armstrong et al. 1983). While this does not necessarily point to an underlying graded representation of categories (a false inference termed the 'effects = structure fallacy'), it would still be desirable to determine the prototypical exemplars of the two constructions, each of which will, for the moment, be considered to instantiate a single category. This section will deal with the question of how the prototypical exemplars (i.e. constructions) of each category can be determined. In this respect, it should be noted that this kind of analysis will also address a much more general question. In section 1.3, I argued that this study is based, among other things, on the assumption that categories can display prototype effects. If this study could show that there are indeed prototype effects with respect to the two constructional categories, this would provide further strong evidence against models based on clear-cut categorization and decision processes. In order to approach this topic, we first need a definition of what a prototypical member of a category is. In linguistics, most notably Cognitive Linguistics, the notion of prototype has been applied to a multitude of phenomena (cf. Lakoff 1987: Ch. 3 for a brief survey of applications in various sub-branches of linguistics or Taylor 1995 for a book-length treatment of this subject matter).
However, while there is frequently a considerable degree of agreement on the definition of one entity as being prototypical for a category, it is often very difficult to devise operational and non-circular definitions of how to define prototypical members of a category.2 For instance, it does not make sense to simply say that the prototype is the most representative exemplar of a category if one lacks a way of measuring representativity that does not rely on typicality again. Likewise, it is equally problematic to simply define a prototype in terms of (i) the frequency with which members of the speech community encounter a particular instance, or (ii) the familiarity of a particular member of the category; for instance, the now classic example of the category bird has as very typical examples the members sparrow and eagle, although eagles are not frequently encountered by most people; likewise, people are much more familiar with, say, chickens, but chickens are not good exemplars of the category bird. In this study, I follow Rosch and Mervis (1975: 575) and define the prototype of a category as the abstract cluster possessing the characteristics with the highest cue validity for the category;3 this abstract cluster can then manifest itself in a particular entity exhibiting these highly typical or important attributes. Even with this definition of the notion prototype it is still quite difficult to estimate and compare the distance of other category members to the prototype objectively: while in everyday conversation it suffices to say 'I feel entity E is somehow a better example for category A than entity Z' or 'I believe that
attribute X is somehow more central for category C than attribute Y', this is insufficient for a scientific study of categorization. In this section, I will show how the methodology employed here makes it possible to (i) identify the prototypical (i.e. most representative as defined above) instances of a category objectively on the basis of non-introspective data, and (ii) quantify and compare the distances of particular instances of constructions from the prototype of the respective construction. Two ways are possible to identify the prototypical members of construction₀ and construction₁. One possible way would be to examine every variable's monofactorial preference for a construction and, for each construction, combine these values/levels into an abstract cluster of attributes. Each cluster would then possess all the attributes requiring one of the two constructions and could thus be argued to be prototypical for it. In a way, we would simply have to resort to Table 2:2 or Table 6:24 and combine the values/levels in the columns 'Value/Level for construction₀' and 'Value/Level for construction₁' or infer the values/levels associated with a construction from the correlation coefficients respectively. While this would definitely yield two abstract clusters of attributes that would indeed be very typical of the corresponding type of construction, on the basis of this cluster it would still be quite difficult to identify a concrete sentence of the corpus data for two reasons. First, we would again rely on the results of the monofactorial analyses, which deprives us of the opportunity to include the weighting/importance of variables into our search for a prototype.
That is, what do we do if no instance of a construction is found with all attributes but two competing instances where in each just a single attribute is missing: how would we decide whether sentence A (where, e.g., the referent of the object is abstract) or sentence B (where, e.g., the object is lexical) is more prototypical for construction₁? We would lack a principled basis for this decision. Second, it might be difficult or even impossible to find a single concrete example displaying all the attributes for a construction since such instances (i.e. those that display a particular value in each of about 20 variables) will occur only rarely, if at all. Thus, I argue for a second possible way. As I have explained previously, the discriminant analysis calculates a discriminant score for each sentence in my data. This score indicates categorically which construction the speaker will choose, depending on its value relative to the cut-off point determined by the analysis: the more the discriminant score deviates from the cut-off point, the more certain the categorical decision of the discriminant analysis for a certain construction is. Thus, if we sort all the sentences analysed according to the size of their discriminant scores, then we get the continuum represented in Figure 7:1. This graphical representation is organized as follows. Each single dot stands for one of the 403 sentences analysed. The horizontal axis displays the range of discriminant scores (from -3.1 to 3.16) with the broken vertical line indicating the cut-off point (-0.039) for the decision for a construction; the vertical dimension distinguishes correctly predicted sentences from incorrectly predicted ones. The numbers in
brackets refer to sentences to be discussed below.
Figure 7:1 Discriminant scores of sentences in relation to the prediction accuracy
But how is this to be interpreted and what does it have to do with prototypicality? First, Figure 7:1 underscores the general point that the decision for a construction is categorical: if the analysis yields a discriminant score larger than -.039 for a particular sentence, then the analysis will predict that the speaker will choose construction₁; if the discriminant score is lower than -.039, the analysis will predict construction₀. Second, Figure 7:1 shows that, the larger the absolute values of the discriminant scores, the less frequently wrong predictions occur; for discriminant scores larger than .9 or smaller than -2.07 no erroneous prediction occurs. Third, the two most extreme discriminant scores (that is, -3.1 and 3.16) are assigned to those two sentences that are beyond any doubt and correctly predicted to be instances of construction₀ and construction₁ respectively because, of all the sentences investigated, it is these two that display either all or most of the most important defining attributes for the construction they instantiate. Thus, I claim that these sentences are the most prototypical instances of the two different constructions, which is supported by looking at the actual sentences belonging to these extreme discriminant scores, namely (75) and (76).
(75) . . . to take up erm an interest or activity which will channel them into other activities.
(76) Then we got him back to the Prince of Wales Theatre.
The sentence in (75) can be most unambiguously predicted to be construction₀ for obvious reasons: the direct object is quite long and of high complexity, its referent is abstract and the determiner of the direct object is indefinite. The meaning of the verb phrase is not transparent and no prepositional phrase follows. Finally, the referent of the direct object noun phrase has not been mentioned or hinted at before and is, thus, inactive. Equally straightforward is the classification of (76) as a very representative exemplar of the category construction₁: the direct object is extremely short, pronominal and has a concrete referent, the meaning of the verb phrase is literal and the VPC is followed by a directional PP. Lastly, the referent of him has been mentioned seven times in the preceding discourse and is, thus, highly active. In sum, the discriminant analysis enables us to determine the most prototypical instances of the constructions in the corpus data objectively on the basis of solid data; thus, discussions concerning the prototypicality status of instances or the degree to which particular attributes (i.e. variables) contribute to the identification of the prototype are ruled out. The discriminant analysis takes into account the importance of each variable's value/level much more precisely and objectively than any human being or necessarily biased linguist could do. What is more, we can also objectively and precisely determine the distance between the prototype and each of the analysed sentences as well as any other VPC that may interest us in the future. Consider the following two sentences.
(77) . . . and was quick to put together [NP a $164 billion rescue package].
(78) . . . if it's that easy for the Labour leader to give up [NP the principles in which he does believe].
If we now try to answer the question 'which of the two sentences is more typical of construction₀?' without the results of the present study, then we run into lots of difficulties: (77) is longer in syllables, but (78) is longer in words, so what happens if we do not know which variable is more important? Both direct objects are lexical and have abstract referents, both sentences are from written texts and both VPCs are not followed by directional adverbials, so these criteria do not provide a clue. Equally unhelpful in this case are the discourse-functional variables since, in both cases, the referent of the direct object has not been mentioned before. In (77), the direct object has an indefinite determiner, while in (78) it has a definite determiner, so why not claim (77) is more typical of construction₀? Well, we cannot be sure about this because in (77) the direct object is only intermediately complex (NP + modifier), whereas it is highly complex (NP with embedded clause) in (78). What this boils down to is that without multifactorial results, introspective linguists cannot identify the sentence more typical of construction₀ on a principled basis since they cannot know how to weigh the variables
discussed so far. In fact, most investigators of particle placement have not even been aware of all variables postulated so far, so it is not surprising to see that the question of (the degree of) prototypicality is so difficult to answer with traditional techniques. Given the results of the present study, however, this question can be answered easily and objectively: we simply enter all the above-mentioned values and levels (and the ones not mentioned) into the discriminant function of my analysis and the importance of the variables is accounted for by the numerical weightings of the variables. As a result, we get the discriminant scores of -2.39 and -1.85 for (77) and (78) respectively. That is,
• sentence (77) is more typical of construction0 because its discriminant score is lower (i.e. closer to the extreme, prototypical case) than that of (78);
• the distance of (77) from the prototype of construction0 (cf. (75)), namely .71 (i.e. 3.1 - 2.39), is fairly similar to the distance between (77) and (78), namely .54 (i.e. 2.39 - 1.85).
These findings are also graphically represented in Figure 7:1. We see that, polemically speaking, pseudo-arguments such as 'But I think that sentence (78) is somehow more prototypical than sentence (77)' or vice versa cannot arise anymore: for any two instances A and B of a construction x, we merely need to look at the combination of variables making up these particular examples and enter these results into the discriminant function. From that, we get discriminant scores for A and B and a graphical representation of their degree of prototypicality, and the sentence that is assigned the higher/lower discriminant score, or is positioned closer to the right/left margin of the diagram, is more typical of construction1/construction0 than the other one. Without any reference to introspective and hardly falsifiable intuitions as to what is cognitively somehow simpler or more basic, we can reliably compare the prototypicality of members of a category simply on the basis of a large speech community's set of subconscious decisions for a construction.4 However, I want briefly to address a possible objection, namely that this identification procedure of prototypes may not be adequate since human categorization processes can function fundamentally differently. While I share the opinion that speakers do not decide on a construction on the basis of a discriminant analysis performed during online speech production, I nevertheless do think that such an analysis provides insights into what is important for the subconscious production processes.
Even if the discriminant analysis as such does not model human categorization processes, the results of the statistical model nearly equal the results of human categorization processes because the result of the statistical analysis is ultimately based on human categorization processes (namely those processes underlying the 403 choices of construction in our data). The latter point is strongly supported by several arguments.
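The comparison of (77) and (78) described above amounts to simple arithmetic on discriminant scores. The following is a minimal sketch of it, not the analysis itself: the scores are the ones reported in the text, the prototype score of -3.1 for (75) is inferred from the subtraction '3.1 - 2.39', and the helper names are hypothetical.

```python
# Discriminant scores as reported in the text. The value for (75), the
# construction0 prototype, is an assumption inferred from "3.1 - 2.39".
scores = {"(75)": -3.1, "(77)": -2.39, "(78)": -1.85}


def distance(a, b):
    """Absolute difference between the discriminant scores of two sentences."""
    return round(abs(scores[a] - scores[b]), 2)


def more_typical_of_construction0(a, b):
    """Return the sentence with the lower score, i.e. the one that lies
    closer to the construction0 extreme of the scale."""
    return a if scores[a] < scores[b] else b


print(distance("(75)", "(77)"))  # distance of (77) from the prototype
print(distance("(77)", "(78)"))  # distance between (77) and (78)
print(more_typical_of_construction0("(77)", "(78)"))
```

Any further sentence can be compared in the same way once its discriminant score is known; the sketch does not compute such scores from the variables themselves.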
Let us first look at the two outliers in Figure 7:1, i.e. those sentences with fairly clear predictions for a construction which, nevertheless, turned out to be false.
(79) . . . to turn [NP the male population] out in the middle of the night (discr. score: -2.07)
(80) 'Hit squads' make regular weekend forays to pull down [NP signs] . . . (discr. score: .9)
The discriminant analysis falsely predicted that speakers would use construction0 for (79) and construction1 for (80). If we look at the two sentences, we see that even native speakers would probably predict the wrong choice of construction of these particular speakers. In the case of (80), for example, the direct object is very short and simple, the referent is concrete, the meaning of the verb phrase is literal and the preceding context shows that signs has been mentioned before (even in the immediately preceding sentence). Correspondingly, the sentence is assigned a comparatively high discriminant score of .9 so as analysts we would expect construction1. Analogously, native speakers would probably also choose construction1.5 Nevertheless, contrary to what the majority of the other native speakers did, our own intuition and the result of my analysis, the speaker chose construction0 for some unknown reason.6 The important point is that, even though the prediction is false, the result of the discriminant analysis corresponds to (i) our intuition that the speaker normally should have chosen the other construction and (ii) general usage by other native speakers,7 so one cannot argue that the statistical procedure does not allow us to draw inferences about human categorization processes - rather, the technique used enables us easily to find those instances where speakers did something quite untypical in view of the vast majority of the data. As a second observation supporting my argument against the possible objection above, let us return to our previous discussion of instances difficult to categorize; consider (81) and (82).
(81) . . . what about ladies . . . who just cannot put on weight?
(82) I've put my name down.
The sentence in (81) has a discriminant score of -.086, i.e., according to the discriminant analysis the speaker should have chosen construction0 (which he did); (82) has a discriminant score of -.132 so, according to the analysis, the speaker should have chosen construction0 (which he did not). However, the closeness of these discriminant scores to the cut-off point determined by the analysis (-.039) suggests that in these two cases both constructions should be nearly equally possible, which is definitely the case as I (and presumably most native speakers of English) do not see any reason why (83) and (84) (or other borderline cases) should be unacceptable.
(83) . . . what about ladies . . . who just cannot put weight on?
(84) I've put down my name.
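The way the cut-off point separates clear predictions from borderline cases can be sketched as follows. This is only an illustration under stated assumptions: the scores, the cut-off of -.039 and the observed choices for (79) to (82) are taken from the text, while the function names and the rule 'below the cut-off predicts construction0' are a simplified reading of the discussion, not the original discriminant function.

```python
CUTOFF = -0.039  # cut-off point reported for the discriminant analysis


def predict(score, cutoff=CUTOFF):
    """Scores below the cut-off predict construction0, all others construction1."""
    return "construction0" if score < cutoff else "construction1"


def margin(score, cutoff=CUTOFF):
    """Distance of a score from the cut-off; small values mark borderline cases."""
    return round(abs(score - cutoff), 3)


# (discriminant score, construction actually chosen by the speaker)
examples = {
    "(79)": (-2.07, "construction1"),
    "(80)": (0.9, "construction0"),
    "(81)": (-0.086, "construction0"),
    "(82)": (-0.132, "construction1"),
}

for label, (score, observed) in examples.items():
    predicted = predict(score)
    print(label, predicted, observed, predicted == observed, margin(score))
```

The outliers (79) and (80) are mispredicted with a large margin, whereas (81) and (82) lie so close to the cut-off that either construction is nearly equally expected.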
So again, the result of the discriminant analysis (both constructions should be possible) is supported by our intuition, so that the possible objection against a supposed unnaturalness or inadequacy of the results has been eroded to the point of being pointless. Let us summarize the previous discussion. Using the methodology discussed so far:
1 we can unambiguously and objectively identify the prototypical instances of the two constructions on the basis of large data sets;
2 we can unproblematically and objectively quantify and compare the distance of any category member to its prototype;
3 most importantly, we can achieve the two just-mentioned objectives in a way which is cognitively adequate in that it fully conforms to most native speaker intuitions.
Furthermore, this approach also shows that it makes sense to approach the problem of categorizing linguistic data from a perspective that does not rely on classical Aristotelian clear-cut categories. Put differently, the analysis provides strong empirical support for cognitively more realistic approaches to categorization, namely those allowing for gradience of (linguistic) categories and incorporating prototype effects.
7.1.2 One prototype for the two constructions?
So far, we have only considered the possibility of determining the prototypical instance of the two categories made up by construction0 and construction1. We have not yet addressed the question of whether it is possible to determine an overall prototypical instance of a single category VPC: is construction0 or construction1 more basic or more typical of the category of VPCs?8 This question already posed considerable difficulties to most Transformational-Generative analyses and another processing-oriented approach, namely the one by Hawkins (1994); cf. Chapter 4, n. 12. But how does the present account approach this issue?
The present analysis takes a fundamentally different approach so that problems similar to those of Hawkins do not even arise: in this study, I do not claim that the two constructions necessarily constitute a single category in the same sense in which, say, all instances of construction0 form a single category. Thus, no construction is considered to be basic. Rather, both constructions exist independently because of the cognitive-functional properties manifesting themselves in the variables' values/levels, i.e. many of those aspects of particle placement that Hawkins has deliberately left aside. The two constructions instantiate what Lambrecht (1994: 34-5) has called information structure constructions: while they are often semantically
synonymous, they are not pragmatically synonymous as they indicate the cognitive status of the direct object's referent along the lines of the Processing Hypothesis.9 That is, the two constructions exist due to their different functional motivations, i.e. the communicative requirements they satisfy (cf., e.g., Halliday's analysis), without any inherent priority of either construction. Now, it might seem strange to most readers that I argue that the two constructions (which can frequently be used interchangeably) do not form a single category of VPCs as has almost always been argued before. I would assume, however, that the conceptual pressure to combine the two different constructions into a single category of VPCs derives from the prominence of generative work where the two constructions are posited to be transformationally related on the grounds of the general structural similarity in straightforward cases such as (85).
(85) a. John picked up the book.
b. John picked the book up.
But let us investigate this question of the categorization of VPCs rather than just blindly follow previous and totally differently motivated approaches. Psychologically/cognitively speaking, categories are generally formed to maximize intracategorial similarity and intercategorial differences, so let us examine the two constructions in this respect in a thought-experiment. Put differently, imagine that some linguist (not a priori committed to the exclusion of [groups of] variables) investigated the characteristics of the two constructions empirically. He would find two things:
• the two constructions are similar with respect to their classical constituents since both constructions consist of a verb phrase made up of a verb, a noun phrase and a particle;
• the two constructions differ in many respects, namely phonologically (cf. section 2.1), morphosyntactically (i.e. the exact characteristics of the constituents, e.g. their average lengths or the determiners commonly found; cf. sections 2.2 and 6.1.2), semantically (cf. sections 2.3 and 6.1.3) and discourse-pragmatically (cf. sections 2.4 and 6.1.4).
That is, the differences between the constructions far outnumber the sole (namely structural) commonality of the two constructions. In other words, while all instances of construction0 share many characteristics with one another, as do all instances of construction1 (yielding the high prediction accuracy found in section 6.3), there is absolutely no conceptual basis (in terms of distributional criteria) for combining construction0 and construction1 into a single category of VPCs, given the above-mentioned cognitive function of categorization. Thus, the unbiased linguist would have to conclude that the two constructions do not form a single category. Additionally, this account also explains why form-oriented linguists would of course claim
that the two constructions form a single category: the only kind of variables they commonly admit (namely structural ones) is the only one where a similarity is found. However, one might still object that the two constructions can in fact often be used interchangeably (cf. (85)) so that, in spite of the just-mentioned differences, it would be futile to claim that the two constructions do not constitute a single category. But this is false since, as we have seen over the last 100 or more pages, the constructions are not used interchangeably - if they were, how could one predict 83.9 per cent of the speaker decisions correctly? Thus, it is admittedly correct that in artificially decontextualized examples such as (85) both constructions are grammatical and/or perhaps equally likely (i.e. in free variation). In natural discourse, however, we have seen that in the vast majority of cases the distributional pattern is very regular so that the constructions are far from being used interchangeably. So, what, from a purely structural point of view, must appear to be random or negligible variation of two syntactic variants is in fact highly (though not totally) rule-governed and extraordinarily predictable. Thus, many formal approaches only account for what is theoretically and structurally possible without taking into consideration what actually happens, namely that speakers have strong contextually dependent preferences for the constructions.
In sum,
• each construction constitutes a category in its own right as in each of the two categories we find a strong similarity across all (groups of) variables and common rules governing their actual usage; the existence of two categories simply derives from the variety of differences the two constructions display and the communicative purposes to which they are put;
• the two constructions do not form a single category since, in terms of both category organization and actual usage, we find more differences than similarities between the two constructions; the traditional claim (that the two constructions form a single category) results from an over-reliance on structural criteria (permeating syntactic research within theoretical linguistics) and, at the same time, a neglect of other criteria for the definition of an alleged category of VPCs.10
Finally, for those readers who, for whatever reason, still insist on combining the two constructions into a single category of VPCs, I would like to outline, at least briefly, what facts would have to be accounted for. On the one hand, one might assume that the two constructions form a single category and that construction0 is basic for the following reasons:
1 construction0 often instantiates the cognitively very basic scenario of transitive events;
2 construction0 has the optimal structural configuration for structural processing and parsing;
3 construction0 is the unmarked construction in that it is subject to fewer
distributional restrictions: construction0 is only ruled out for unstressed pronominal direct objects - construction1 is at least generally ruled out for long/complex, pronominal or (contrastively) stressed direct objects;
4 when speakers are asked to form sentences out of a verb, a particle and a noun phrase, construction0 is formed significantly more often than construction1. This has been found in an experiment by Dehé (2000) where native speakers were offered the three constitutive parts of VPCs in order to assemble sentences out of them; the result was that speakers mostly assembled construction0, irrespective of the meaning of the verb phrase and the order in which the elements were offered;
5 this assumption is diachronically more plausible: construction0 existed prior to construction1 and can clearly be related to the well-supported First Particle Shift and Second Particle Shift and the fact that, after the two shifts, construction0 occurred more frequently than construction1, whereas there is no equivalent clear historical motivation for construction1.
On the other hand, one might, with some equally good reasons, assume that the two constructions form a single category and that construction1 is basic since:
1 construction1 is used for a cognitively very basic scenario, namely caused motion;
2 construction1 is also more often used for cognitively more basic (concrete) direct objects;
3 construction1 is much more frequently found in oral language, which is uniformly taken to be more basic than written language (cf. Chapter 2, n. 18);
4 on the basis of data from the CHILDES corpus collection, Hyams et al. (1993: 7-8) and Broihier et al. (1994: Ib) have shown that children pass through an initial stage in which particle constructions appear only with the order of construction1, their so-called 'stranded particle stage', i.e. construction1 is acquired earlier.11
That is, this list of criteria shows that, in addition to my previous reasoning, the arguments supporting the postulation of a single category with a prototype are ambiguous at best; a purely argumentative account does not suffice for a decision as to which construction is basic, which renders the (empirically motivated) assumption that no construction is basic even more plausible. Thus, of all three possible positions (there is no single category of VPCs; there is one such category with construction0 as its prototype; and there is one such category with construction1 as its prototype), the third one, i.e. Hawkins's claim, receives the weakest support while, to me, the first one seems to be most plausible, especially given the empirical results. Thus, the results of my analysis also open up a new perspective for looking at syntactic constructions that is usage-based and cognitively
realistic and does not rely on artificial and purely theoretical notions such as transformations relating otherwise widely disparate constructions.
7.2 Variability and grammar
Let us now deal with the nature of the linguistic rules or the nature of the grammar that is required to accommodate the results of this study. There is a striking similarity between the present analysis and a notion familiar from quantitative sociolinguistic studies, namely that of variable rule. Variable rules were introduced in a now classic study by Labov on Black English vernacular (Labov 1969). They were among the first rule types to lie somewhere between categorical rules (i.e. rules that always apply, e.g., the transformational rule of affix hopping in early transformational grammar) and optional rules (i.e. rules that may or may not apply, e.g., the passive transformation) and, thus, foreshadowed recent frameworks such as the Competition Model.12 Variable rules deal with linguistic variation and use features from the environment of the locus of variation in order to 'specify at least the relative likelihood that the rule will actually be applied in environment A or environment B; either way the result is in the language, but the rule is more likely to be applied in A than in B' (Fasold 1990: 245). Traditionally, a variable rule consists of a specification of the input to and the output of the rule plus a set of so-called variable constraints, which either increase or decrease the likelihood of the rule's application. The parallel to the present approach is probably all too obvious: particle placement is governed by a set of variables (i.e. the variable constraints) and each variable has an effect on particle placement in that it specifies the likelihood of construction0 or construction1 within a given contextual environment.
Put differently, '[a] variable rule attempts to make a statement about the conditions under which these alternative forms are more or less likely to be used by speakers of a language' (Fasold 1990: 253). Thus, on the whole, I think this work is definitely compatible with previous sociolinguistic quantitative studies of variation in language.13 The similarity between my analysis and variable rules analyses (or any kind of probabilistic analysis; cf. further extensions in Chapter 8) might be problematic to some readers because (i) the notion of variable rules came under attack almost immediately after its inception, and (ii) more generally, mainstream syntactic research has been dominated by anti-probabilistic approaches. Thus, I would like very briefly to discuss at least some objections to variable rules in order to comment on this aspect of the present work and the status of more general conclusions following from it. Bickerton (1971), e.g., has claimed that variable rules postulated on the basis of data from several subjects could mask different kinds of categorical behaviour of individuals if they are lumped into a single group. The logic of this argument can equally be applied against the present corpus-based analysis. However, while this is theoretically indeed possible, Bickerton has not provided any empirical evidence for his claim (which has been refuted
by Guy 1980) and, more importantly for the present study, it is highly implausible to assume that speakers have a fixed choice of construction for each possible set of values/levels for two reasons. First, this presupposes a behaviour that is much too rule-governed, given the inherent variability of human behaviour (as was shown for linguistic acceptability judgements, speaker judgements for identical sentences can vary over time). Second, Bickerton faces another insurmountable problem, namely that his claim of human behaviour being categorical is not falsifiable: whenever we come up with two instances of human behaviour (e.g. choices of a VPC) where all parameters we consider important are identical but the speaker's choices nevertheless differ, Bickerton could simply claim that we forgot some other variable, namely one whose influence would show that the observed behaviour is still categorical. Since he could continue to do this ad infinitum for each counterexample, it is not possible to falsify Bickerton's claim without him offering any coherent a priori motivated limit concerning the range of variables whose inclusion he would allow for. Furthermore, in the same paper, Bickerton has claimed that variable rules are inherently unlikely as they require speakers to keep track of their relative frequency of usage in order to achieve, in the long run, the generally observed frequency of usage. This claim betrays a very serious misunderstanding of probabilistic reasoning: the variable-rule model does not require speakers to keep tally; all that it requires is a constant probability of rule application which will, in the long run, yield an actual frequency of occurrence approximating the probability of rule application, just as tossing an honest coin will, in the long run, yield heads 50 per cent of the time. Thus, this argument poses no threat to the probabilistic tendencies underlying the results of Chapter 6.
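The coin-tossing argument can be illustrated with a small simulation. It is a sketch only: the function name, the seed and the sample sizes are arbitrary choices made for illustration, and nothing in it is taken from the data of this study. Each decision is made with the same constant probability and without consulting any record of earlier decisions, yet the observed relative frequency approaches that probability as the number of decisions grows.

```python
import random


def simulate(p, n, seed=0):
    """Apply a rule n times, each time with constant probability p.

    No running tally of earlier outcomes is consulted when a decision is made;
    the count is only taken afterwards to report the relative frequency.
    """
    rng = random.Random(seed)  # seeded so that the run is reproducible
    applied = sum(1 for _ in range(n) if rng.random() < p)
    return applied / n


for n in (10, 1_000, 100_000):
    print(n, simulate(0.5, n))
```

With small samples the relative frequency may deviate noticeably from the underlying probability; with large samples it stays close to it, which is all the variable-rule model requires.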
An additional point of criticism levelled against probabilistic rules is that they are argued to offer no guide to conduct in specific instances, but serve only to summarize general trends. In this respect, it has also been argued that variable rules could hardly be integrated into then-current syntactic theories, most notably Chomsky's Extended Standard Theory. However, we have seen above that the present analysis incorporates something closely similar to variable constraints and is nevertheless able not only to make very precise claims as to the specific conduct of individuals but also to justify speakers' choices with reference to cognitive capabilities and processes. Actually, this objection fails to notice two essential points. First, a probabilistic account does not rule out categorical statements (i.e. cases where p = 0 or p = 1), it simply allows intermediate statements as well. Second, probabilistic rules (as defined above) definitely can offer a guide to conduct in specific instances: as in this study, we only need to include a threshold level or a cut-off point (of probability, activation or processing cost) in the analysis which triggers a specific decision when exceeded.14 Given the cognitive orientation of this study, one might even add that the existence of such thresholds is well-documented for one of the most essential notions of this study, namely activation: just as neurons only fire (i.e. generate a nerve impulse) if the
incoming electrical signal reaches a specific electrical strength (the so-called threshold potential; cf. Birbaumer and Schmidt 1991: 205), the exceeding of a particular activation threshold of the referent or a certain degree of processing effort (linguistically indicated by the variables influencing particle placement) triggers construction1; cf. also Chapter 8. More generally, note that this objection is based on the assumption that even variable linguistic phenomena are ultimately explainable in categorical and absolutely predictable rules. Moreover, this objection originated in a climate where syntactic theory was primarily devoted to the study of competence and the generation of all and only the grammatical sentences of a language, i.e. the objective was to specify which sentences are possible and which are not - frequency information and probabilistic variable rules (belonging to performance) were apodictically considered to contribute nothing to linguistic theory. In other words, this point of criticism against probabilistic rules is probably due less to some inherent implausibility of such rules and more to the dominant intellectual climate in syntactic theory.15 This argument is supported by a recently renewed interest in variation and its investigation in probabilistic terms (even outside of areas where probabilistic analyses have always remained important such as corpus linguistics, computational linguistics or quantitative linguistics). One example of this is Beals et al. (1994), a collection of papers from a CLS parasession on variation in linguistic theory - other examples include the study of Leech et al. (1994) already mentioned, explicitly and fruitfully discussing the alternation between the s-genitive and the of-genitive in probabilistic terms. In this respect, I totally agree with Hudson (1997: 105),
who has argued that 'linguistic theory should be confronted with the statistical data of variation studies' and hoped 'for a fruitful meeting between variation data and theories of language structure' while using the findings of Guy (1994) and Kroch (1994) to support cognitive-linguistic analyses of t/d deletion and do insertion. That is, the present study provides, following Hudson (1997: 74), results supporting a comparable theoretical position on the nature of language and its relation to other cognitive mechanisms, namely the one outlined in section 1.3 (since the inclusion of cognitive-functional notions has improved our understanding of this instance of syntactic variation). On the other hand, my results weaken theories of syntactic variation that (i) incorporate only morphosyntactic variables, and (ii) rely on clear-cut categorization as opposed to allowing for gradience of linguistic categories. In sum, I hope to have shown that the probabilistic approach to syntactic variation adopted in the present analysis nevertheless contributes to the issue as:
• not all points of criticism against variable rules and similar probabilistic accounts are as severe as they might sound initially;
• the present approach does not share some of the weaknesses of variable rules (cf. my detailed discussion of interactions between variables and,
more importantly, the introduction of a threshold level yielding categorical speaker decisions as output rather than probabilities);
• in contemporary corpus linguistics and quantitative linguistics, probabilistic analyses are far from exotic - in fact, they are standard in these areas as well as in psycholinguistics and work in natural language processing, i.e. in areas of linguistic research whose methodology and objectives partially resemble the methods and objectives of the present study;
• statistical analyses of variation data do contribute to theories of language: 'we study the observable reality of "parole" (or "performance") as an indirect means of reaching the underlying reality of "langue" (or "competence")' (Leech et al. 1994: 58).16
Having discussed the results of the present study and some of their implications in quite some detail, I turn in the following section to the question of how this work relates to other studies of determinants of grammatical variation, most notably Hawkins (1994).
7.3 Competing approaches to syntactic variation
Broadly speaking, performance analyses of syntactic variation come in two different kinds: discourse-functional (as represented in numerous publications by, e.g., Givón) and morphosyntactic based on syntactic complexity (e.g. Hawkins's 1991, 1994 EIC analyses). Given the limited number of analyses going beyond simple monofactorial analyses, previous studies have by and large focused on either category of variables - only a small number of works have investigated which of the two approaches seems more appropriate. Siewierska (1988: Ch. 2) has investigated the impact of linearization hierarchies on constituent ordering.
More specifically, she distinguishes three kinds of hierarchies:
• formal hierarchies concerned with matters of constituent length and constituent complexity;
• dominance hierarchies concerned with perceptions of natural salience;
• familiarity hierarchies concerned with matters of topicality, givenness and definiteness.
She has drawn the conclusion that '[t]he data presented in this chapter clearly support the superordinate nature of the familiarity hierarchies over the dominance and formal hierarchies on a cross-linguistic basis' (Siewierska 1988: 83). In a later study, Siewierska (1993) has analysed Polish texts with respect to the question of which kind of variables (given>new [where '>' means 'precedes'], measured using referential distance, as opposed to short>long [EIC], as described in Hawkins 1991) is more important in determining constituent ordering. She has found that Hawkins's EIC is not as powerful as
he would like us to believe (1993: 247). However, 'no definite conclusions about the strength of the weight and pragmatic principles are in fact possible' (Siewierska 1993: 263). As was previously indicated, the third study attempting to compare morphosyntactic and discourse-functional variables is Hawkins (1994). He has devoted quite some space to a comparison between these two different approaches and has come to the conclusion that 'pragmatics appears to play no role whatsoever' (Hawkins 1994: 240-1) for linear ordering since his EIC makes better predictions than pragmatic principles (even if EIC and pragmatic variables might correlate with each other). What is the basis for this rather radical claim? On the basis of a cross-linguistic comparison of EIC predictions vs. discourse-functional predictions, Hawkins (1994) found that (i) in languages where ICs are constructed on their left (e.g. English, German, Hungarian), there are preferences of short before long and given before new, and (ii) in languages where ICs are constructed on their right (e.g. Japanese), there are preferences of long before short and new before given. Since a pragmatic theory that has to incorporate both given>new and new>given predictions seems contradictory and barely falsifiable while at the same time his EIC fares fairly well, Hawkins arrives at the above-mentioned conclusion of the irrelevance of pragmatic variables. Since the present analysis also relies on processing as the key determinant of particle placement and has also shown that some morphosyntactic variables (such as COMPLEX) are more important than discourse-functional variables, one might wonder why I do not (at least partially) follow Hawkins's line of reasoning and draw the same conclusion, but rather stick to my theoretical assumptions as outlined in section 1.3. The reason for my decision is that Hawkins's analysis is fraught with some important drawbacks to be discussed in what follows.
First, Hawkins's decision to ultimately reduce the choice of construction to morphosyntax (i.e. syntactically determined processing effort) is problematic since, for instance, the morphosyntactic factors that Hawkins alone holds responsible for particle placement can easily be overridden by the variable STRESS, which can obviously be related to processing issues because of its functionally motivated impact (e.g. emphasizing referents) rather than some contribution to morphosyntactic complexity.17 The present analysis, by contrast, argues for a broader view of what constitutes processing in order to account for all the other findings Hawkins does not discuss. This difference between Hawkins (1994) and my analysis can best be summarized by returning to Figure 4:2, repeated here as Figure 7:2. My analysis seeks to integrate all variable groups in Figure 7:2 whereas Hawkins includes only morphosyntactic variables and rules out other possible determinants of processing effort. Second, on a methodological level, his quantitative analysis is of little statistical sophistication since its results are not even subjected to tests of significance. Consider, e.g., Table 7:1, taken from Hawkins (1994: 181) and slightly adapted for expository reasons.
[Figure 7:2: four groups of variables feed into the processing (cost) of the utterance, which ranges from high (construction0) to low (construction1):
phonological aspects, e.g. stress of the direct object;
morphosyntactic aspects, e.g. length and complexity of the direct object, early/late completion of the phrasal verb;
semantic aspects, e.g. idiomaticity of the VP, concreteness of the referent of the direct object;
information-structural aspects, e.g. last mention of the direct object's referent (identifiability), times of preceding mention of the direct object's referent (degree of inactivity).]
Figure 7:2 Determinants of processing effort and particle placement
Table 7:1 Hawkins's results (1994: 181) concerning particle placement in English

                 NP = 1 word   NP = 2 words   NP = 3 words   NP = 4 words   NP = 5+ words
construction1    51 (94.4%)    21 (31.8%)     3 (18.8%)      1 (7.1%)       0 (0%)
construction0    3 (5.6%)      45 (68.2%)     13 (81.2%)     13 (92.9%)     29 (100%)
Column totals    54 (100%)     66 (100%)      16 (100%)      14 (100%)      29 (100%)
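Since the counts in Table 7:1 were reported without a test of significance, the following is a minimal sketch of how such a test could be run on them. It is an illustration only, not part of Hawkins's or the present analysis. The assignment of the two rows to the two word orders is an assumption based on the surrounding discussion, and the function names are hypothetical. The closed-form p-value used here holds only for an even number of degrees of freedom, which is the case for a 2 x 5 table.

```python
import math

# Counts from Table 7:1 by direct-object length (1, 2, 3, 4, 5+ words).
# Assumption: the first row is the order verb - NP - particle,
# the second row the order verb - particle - NP.
split_order = [51, 21, 3, 1, 0]
joined_order = [3, 45, 13, 13, 29]


def chi_square(rows):
    """Return the chi-square statistic and degrees of freedom of a table."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df


def upper_tail_p_even_df(stat, df):
    """Upper-tail chi-square probability; valid only when df is even."""
    half = stat / 2
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms


stat, df = chi_square([split_order, joined_order])
p_value = upper_tail_p_even_df(stat, df)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value:.3g}")
```

A test of this kind only establishes that word order and NP length are not independent in these data; it says nothing about which cells a theory predicted in advance.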
Not a single test of significance is computed. Given the overall tendency, this is probably no severe drawback, but there is something else to be mentioned here: Hawkins concludes that (54 + 45 + 13 + 13 + 29 =) 154 out of 179 orderings (86 per cent) are most optimal, i.e. as predicted by EIC. From this result it emerges that Hawkins counts as successful predictions even those cases where EIC does not make any prediction as to which construction the speaker will choose because either ordering is optimal. Personally, I find it somewhat strange to count something as correctly predicted if my theory has not made a specific prediction of a speaker's constructional choice. Put differently, the 54 cases where the NP is just one word long are counted as supporting EIC, although EIC did not make any specific prediction so there was no possibility for empirical falsification, a rather blunt way of boosting success rates. From my point of view a more adequate way to report the success rate for the above results
would be by calculating a (still convincing) adjusted rate of 80 per cent, namely 100 (i.e. 45 + 13 + 13 + 29) out of 125 (66 + 16 + 14 + 29). Third, there is absolutely nothing in Hawkins's (1994) analysis of particle placement in terms of the EIC principle that can account for why idiomatic verb phrases, indefinite determiners and abstract direct objects prefer construction0. For instance, what about the (21 + 3 + 1 =) 25 instances in Table 7:1 that Hawkins's analysis fails to predict correctly? It is astounding (and, at the same time, revealing) that no explicit proposal whatsoever is made as to the reason(s) why these instances do not conform to EIC. Obviously, some other variable(s) must play a role here, and I submit that these other variables comprise the remaining variables discussed here (and perhaps others), i.e. morphosyntactic variables other than length/complexity as well as phonological, semantic and discourse-functional variables. These variables are interrelated in a complex manner (recall the proposed relation between IDIOMATICITY, CONCRETE and PP), but it is plausible to assume that they will also influence particle placement directly to some extent. Fourth, interactions of different variables are apparently ruled out a priori as they are not considered at all. Fifth, arguing that discourse-functional variables are either mere correlates of syntactic weight and epiphenomenal or speculating that discourse is only of relevance when Hawkins's EIC does not make strong predictions (cf. Hawkins 1991: 197, 209 and Hawkins 1991: 241; this latter claim is rejected later by Hawkins himself) does not really explain anything. More specifically, Hawkins's just-mentioned empirical correlations make it impossible to decide on the real-world relation between particle placement, morphosyntactic variables and discourse-functional variables.
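The two success rates discussed for Table 7:1 can be retraced with a few lines of arithmetic. The sketch below is only an illustration of that calculation; the variable and function names are hypothetical, and the counts are those cited in the text.

```python
# Column totals of Table 7:1 and the orderings counted as optimal under EIC,
# keyed by the length of the direct object NP in words.
column_totals = {"1": 54, "2": 66, "3": 16, "4": 14, "5+": 29}
counted_as_optimal = {"1": 54, "2": 45, "3": 13, "4": 13, "5+": 29}

# NP lengths for which EIC makes no specific prediction (either order is optimal).
no_prediction = {"1"}


def success_rate(lengths):
    """Return (hits, total, proportion) over the given NP lengths."""
    hits = sum(counted_as_optimal[length] for length in lengths)
    total = sum(column_totals[length] for length in lengths)
    return hits, total, hits / total


all_lengths = list(column_totals)
testable_lengths = [length for length in all_lengths if length not in no_prediction]

reported = success_rate(all_lengths)        # the rate including one-word NPs
adjusted = success_rate(testable_lengths)   # the rate over falsifiable cases only

print(f"reported: {reported[0]}/{reported[1]} = {reported[2]:.1%}")
print(f"adjusted: {adjusted[0]}/{adjusted[1]} = {adjusted[2]:.1%}")
```

The adjusted figure simply removes from numerator and denominator those cases in which no outcome could have counted against the theory.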
As is known to every statistics beginner, the correlations observed18 could be indicative of each of the following causal relationships in the figures below: arrows symbolize causal relationships; simple solid lines stand for relationships that can, but need not, be causal; broken lines symbolize non-causal relations which might exist but need not; the thickness of any arrow/line represents the strength of the correlation.19 On the basis of Hawkins's results, one could, e.g., argue that morphosyntactic variables influence particle placement strongly, but there is not necessarily a relationship between morphosyntactic and discourse-functional variables on the one hand or discourse-functional variables and particle placement on the other; this seems to be Hawkins's interpretation and could be graphically represented as in Figure 7:3.
[Figure 7:3: diagram relating discourse-functional variables, morphosyntactic variables and particle placement]
Figure 7:3 Possible explanation of Hawkins's findings 1
However, on the basis of the very same results one could also claim that morphosyntactic variables influence particle placement and are in turn influenced by discourse-functional variables without there being any direct causal relationship between discourse-functional variables and particle placement, as represented in Figure 7:4; in this case, discourse-functional variables would correlate with particle placement indirectly only (due to their influence on morphosyntax). Lastly, one might assume that discourse-functional variables influence morphosyntactic variables, which in turn determine particle placement, and there is also some causal relation between discourse-functional variables and particle placement, as in Figure 7:5. Since all diagrams would predict that there is a strong statistical correlation between morphosyntactic variables and particle placement and a weaker correlation (or none at all) between discourse-functional variables and particle placement, there is no force to Hawkins's contention that pragmatics plays no role until these matters are cleared up. The finding that the correlation between particle placement and morphosyntactic variables is higher than the one between particle placement and discourse-functional variables need not indicate that morphosyntax is also more likely to be the cause for the effect (particle placement). This is supported by the fact that length/complexity is something that can be operationalized very easily and directly whereas a latent factor such as news value can only be inferred on the basis of manifest variables; since this is likely to be less precise than the clear measurement of syntactic variables, there will be some degree of inaccuracy that might cloud the picture considerably.
[Figure 7:4: diagram relating discourse-functional variables, morphosyntactic variables and particle placement]
Figure 7:4 Possible explanation of Hawkins's findings 2
[Figure 7:5: diagram relating discourse-functional variables, morphosyntactic variables and particle placement]
Figure 7:5 Possible explanation of Hawkins's findings 3
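That the three causal structures of Figures 7:3 to 7:5 are hard to tell apart on the basis of correlations alone can be illustrated with a small simulation. The following is a sketch under loudly stated assumptions: all three variables are treated as continuous (particle placement is in fact a binary choice), the effects are linear, the path weights are invented for illustration, and Figure 7:3 is modelled with no link at all from discourse-functional variables. None of this is taken from Hawkins's or the present data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson product-moment correlation of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(d_to_m, m_to_p, d_to_p, n=20000, seed=1):
    """Generate data from one causal structure; return (r_MP, r_DP).

    D = discourse-functional, M = morphosyntactic, P = particle placement.
    The three weights are hypothetical path strengths.
    """
    rng = random.Random(seed)
    d_vals, m_vals, p_vals = [], [], []
    for _ in range(n):
        d = rng.gauss(0, 1)
        m = d_to_m * d + rng.gauss(0, 1)
        p = m_to_p * m + d_to_p * d + rng.gauss(0, 1)
        d_vals.append(d)
        m_vals.append(m)
        p_vals.append(p)
    return pearson(m_vals, p_vals), pearson(d_vals, p_vals)


results = {
    "Figure 7:3": simulate(d_to_m=0.0, m_to_p=1.0, d_to_p=0.0),  # only M -> P
    "Figure 7:4": simulate(d_to_m=0.5, m_to_p=1.0, d_to_p=0.0),  # D -> M -> P
    "Figure 7:5": simulate(d_to_m=0.5, m_to_p=1.0, d_to_p=0.3),  # plus D -> P
}

for name, (r_mp, r_dp) in results.items():
    print(f"{name}: r(M, P) = {r_mp:.2f}, r(D, P) = {r_dp:.2f}")
```

In each scenario the correlation between the morphosyntactic variable and the outcome exceeds the one between the discourse-functional variable and the outcome, although the underlying causal role of the discourse-functional variable differs from none to direct.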
In this connection, note that there is a study that has also investigated which kinds of variables contribute to syntactic variation. On the basis of a corpus analysis of the Dative Alternation and Heavy NP Shift and an elicitation experiment for the Dative Alternation, Arnold et al. (2000) have shown that both LENGTHW (called 'heaviness' in their paper) and newness (a measure of referents' accessibility with the possible levels new, inferable and given) significantly govern syntactic variation. They concluded that neither of the two groups of variables can be reduced to the other. These results also seriously undermine Hawkins's claims and provide additional and independent support for my analysis where the contributions of all variables are analysed. In fact, the results of Arnold et al. are strikingly similar to the ones of my study although they only investigated three variables: they also found that the morphosyntactic variables account for more variance than the discourse-functional ones but, still, do not conclude from that that the former are inherently more important than the latter. Their conclusion is that 'when two factors are found to influence a particular choice or interpretation, the effect of each is usually stronger when the other factor is less constraining' so that 'the role of each factor [variable, in our terminology] depends in part on the strength of competing factors' (Arnold et al. 2000: 49-50; cf. also MacDonald 1996), a finding totally conforming to the multifactorial findings of the present study. Thus, the results of the present study seem to be very solid. So far we have been dealing with points of critique which the present study has avoided in the first place. For the next issues at hand, we need to return to the Processing Hypothesis of Chapter 4. It was argued above that the Processing Hypothesis has intentionally been phrased in such a way as to accommodate both processing effort of speakers and hearers simultaneously, cf.
Chapter 4, n. 1. In a thoroughly functionalist tradition, it was then shown how most of the variables found in the literature influence the hearer's comprehension of the speaker's utterance in a way suggesting the Processing Hypothesis: speakers try to account for their hearer's needs by choosing the 'right' construction for the particular discourse situation. This way of analysis, in terms of processing effort, is partly compatible with a vast body of functional/psycholinguistic literature, some of which has been discussed in detail. Two questions, however, arise with both Hawkins's (1994) approach and the analysis advocated here. First, many of the variables can influence both the speaker's and the hearer's processing effort and it was shown that the question whose processing effort is decisive cannot be answered on the basis of the available data. Hawkins (1994) is also somewhat equivocal on this issue: while EIC is in general justified by referring to efficient recognition of phrase structure only (i.e. hearer perspective), at times he also refers to the production effort by speakers:
I believe that words and constituents occur in the order they do so that syntactic groupings and their immediate constituents (ICs) can be recognized (and produced)
as rapidly and efficiently as possible in language performance. (Hawkins 1994: 57; my emphasis)
The ambivalence concerning this issue of course endangers neither Hawkins's approach nor the theory proposed here, but it would be preferable to know whose processing effort is truly responsible for the constructional choices. Second, if the choice of construction is truly dependent on the processing effort the construction requires in a particular discourse situation (and, given the evidence, we have every reason to believe so), then this entails that the speaker somehow (i) constantly monitors the discourse and constructs models of the hearer's representations of the discourse situation, (ii) weighs the potentially resulting processing effort of the two constructions, and (iii) cooperatively and a little egoistically chooses the one that he suspects fits the (needs of the hearer in the) discourse situation better. For instance, Hawkins (1994) has argued that speakers actually compute optimal orders as is evident from the following quote:
The speaker will have less reason for choosing the order that is completely optimal, and one can imagine him/her expending less effort or making more errors on-line when trying to calculate the most advantageous order from among a set of alternatives whose scores are very close. (Hawkins 1994: 85; my emphasis)
While this is probably correct in many cases (e.g. written discourse, planned speech, etc.), it is extremely unlikely to hold across all discourse situations. An ideal account would thus avoid placing such a computational burden on the speaker who, seemingly without effort, talks to his interlocutor(s) (cf. Wasow 1997a: 350-1). In addition, an ideal account would not have to attribute such a cooperative strategy to the speaker but rather be able to explain the choice of word order without such an assumption, especially since there is ample evidence that speakers do not take listeners' needs into consideration (cf. Brown and Dell 1987; Horton and Keysar 1996; Wasow 1997b: 94-7; Arnold et al. 2000). Thus, while the present study is already superior to Hawkins's analysis in several respects, in the chapter to follow I would like to suggest an account of particle placement that is fully compatible with the preceding analysis based on the Processing Hypothesis (compatible in the sense that it makes identical predictions), but avoids these pitfalls.
Notes
1 It is worth remembering here from Chapter 3 how seldom linguistic accounts of syntactic variation have ever been subjected to such a rigorous test. Granted, several analyses have already gone beyond introspective data - but testing one's account by actually predicting speakers' subconscious decisions in several hundred cases is of a different quality than a posteriori cross-tabulation and correlation coefficients (even if checked for significance).
2 Note especially that, in many cases, prototypes can be relatively easily identified
on the basis of examples or representativity judgements by native speakers (e.g. questions in the area of lexical semantics) - in cases of linguistic categories, however, this is not an easily available option: naive native speakers of English would face great difficulties in, say, identifying prototypical instances of the category of transitive clauses (let alone defining these instances) or elaborating on the question of whether the use of the English past tense form as a pragmatic softener is closer to the prototypical past tense meaning than its use in hypothetical conditional clauses (subjunctive). Thus, in these cases, more indirect means (e.g. perception tests), introspection or other arguments based on markedness, order of acquisition, cognitive ease, etc. are used.
3 '[T]he validity of a cue is defined in terms of its total frequency within a category and its proportional frequency in that category relative to contrasting categories. Mathematically, cue validity has been defined as a conditional probability - specifically, the frequency of a cue being associated with the category in question divided by the total frequency of that cue over all relevant categories' (Rosch and Mervis 1975: 515).
4 This is not only preferable because the database on which the conclusions are based is large. It is also preferable since the data are not skewed as the speakers providing the data cannot possibly manipulate the data coherently in one direction: they do not know what exactly will be investigated with the data. As was already mentioned above, a linguist investigating his own theory on the basis of his own data is much more likely to be biased (even if only unintentionally) than an uninformed native speaker (cf. again Labov 1975 and Schütze 1996 for illuminating summaries of the large differences between the intuitions of linguists and native speakers of English).
5 The reader might ask for the basis of this claim since I am no native speaker of English.
I can nevertheless offer strong support for my claim, namely the corpus data. The direct object in (80) is one word long and concrete. In my corpus data, I have 86 examples of VPCs with such direct object noun phrases (including (80)). Of these, 77 (89.5 per cent) occur in construction1 while only 9 (10.5 per cent) occur in construction0. Thus, in similar cases, most native speakers have decided as I argued they would. (If we were to also include the information that the direct object's referent has been mentioned before in our calculation, then the ratio of choices of construction1 rises further to 94.4 per cent.) Further support for the accuracy of this kind of analysis comes from another study. In Gries (forthcoming), the technique of discriminant analysis is applied to what is widely known as the Dative Alternation, yielding similar results and prediction accuracies. The results of that study are, however, also subjected to another test, namely acceptability judgements collected with a questionnaire study, where many experimental controls were implemented in order to obtain valid results. The test items were added to a list of several hundred filler items (experimental items with varying though balanced degrees of acceptability from other ongoing projects). The items were also ordered pseudo-randomly such that no subject judged more than one sentence from each token set (using Cowart's 1997: 48-9 terminology). Thirty-six native speakers of British English received questionnaires with 32 sentences each, where the judgement process was explained and exemplified; the examples anchored only the endpoints of the rating scale. The subjects were then asked to provide acceptability judgements ranging from -3 (strange/unnatural) to +3 (natural English/perfect). The middle of -3 and +3 was considered to be '0' and the subjects were additionally
offered the opportunity to answer 'I don't know.' On the results, a two-way ANOVA (construction × discriminant score) was computed, and the predicted two-way interaction between the two variables was significant (F(2, 173) = 17.56; p < .001) such that each construction's degree of acceptability correlated positively with its discriminant score, strongly supporting the corpus-based technique.
6 The only reason why the speaker might have chosen construction0 that I can think of is concerned with a variable briefly mentioned previously, namely priming of syntactic structures (cf. above). In this case, the NP clothing has only been mentioned once before, namely in the immediately preceding clause, which also happens to be a VPC with loosen off. Since clothing was inactive in the clause preceding the VPC in (80), there was some motivation to choose construction0 in this clause, which the speaker in fact did. Then, in the following clause (our example (80)), the speaker might have chosen construction0 again because the syntactic structure was primed. This seems especially plausible, given the fact that the clause preceding (80) and (80) itself share several linguistic characteristics: both are imperatives, both have the same TPV and both refer to the same referent. In view of these commonalities, which might have further reinforced the priming effect, this variable might have been responsible for the otherwise totally unexpected choice of construction in (80).
7 The word should is of course not to be interpreted in a prescriptive sense.
8 The question of which construction is basic/derived is largely equivalent to which construction is prototypical:
In general, markedness is a term used by linguists to describe a kind of prototype effect - an asymmetry in a category, where one member or subcategory is taken to be somehow more basic than the other (or others). Correspondingly, the unmarked member is the default value.
(Lakoff 1987: 60-1)
Thus, the following discussion, although partly phrased in terms of the dichotomies basic vs. derived and marked vs. unmarked, is ultimately concerned with matters of prototypicality.
9 According to Goldberg (1996: 69), one would also posit these two constructions since 'a construction may be posited because of something not strictly predictable about its frame semantics, its packaging of information structure, or context of use' (my emphasis).
10 It shall not go unnoticed that there are analyses of instances of syntactic variation that adopt a similar perspective: in Construction Grammar, an analysis of the Dative Alternation (He gave John a book vs. He gave a book to John) has also gone beyond postulating a transformationally motivated relation and has argued for different semantic construals of the two alternative constructions (cf. Goldberg 1992). Also, for adherents of Cognitive Grammar my assumption (that there is no category comprising both construction0 and construction1) will not be surprising, given my arguments on the constituent structure advocated for in Chapter 2, n. 12: if one believes that the constructions differ in terms of what Langacker (1997) has called semantic and phonological constituents, nothing is gained by positing a superordinate category for both word orders anyway.
11 This order of acquisition ties in with other developmental observations: since construction1 is the construction typically used for caused-motion of concrete objects by a human agent, the initial acquisition of construction1 relates to the fact that the concepts of concrete objects and their caused motion are acquired
earlier than more abstract concepts and processes such as those frequently denoted by construction0. However, as was already shown by Fischer (1971), a source both Hyams et al. (1993) and Broihier et al. (1994) are unfortunately unaware of, the developmental primacy of construction1 seems to hold mainly for literal VPCs (most of their examples are literal), i.e. those where the particle has its spatial/directional meaning - '[h]owever, this does not necessarily generalize to the idiomatic VPCs' (Fischer 1971: 144). That is, although construction1 is acquired first, we do not find it with idiomatic meanings as in He gave it up. However, the first examples of construction0 in child language after the stranded-particle stage also contain instances where the meaning of the particle is not spatial/directional, but resultative/completive, e.g. and hraser winding up my duck and Y'knaw he's beatin up the pony (Hyams et al. 1993: 8) so that the usage of the constructions during their acquisition matches my characterization of the constructions' semantics in section 4.2.
12 Cf. also Bates and MacWhinney's (1989) discussion of variable rules as one of two quantitative kinds of functional approaches (the other being Givón's discourse-functional approach).
13 The parallel becomes even more striking if we consider the work by Cedergren and Sankoff (1974): in this study, they presented one of several versions of a computer program called VARBRUL. The analysis of variation data with this program yields weightings of variables reflecting (i) the importance of the variables, and (ii) the direction of the effect of the variable (i.e. whether the variable's value/level favours or disfavours the application of the variable rule) just like the GLM. However, VARBRUL is not equivalent to the GLM in an important respect: VARBRUL calculates the weighting of the variables under the assumption that the variables' effects are independent of each other while the GLM as a whole does not necessarily require such an independence of variables. Thus, the GLM (although not necessarily discriminant analysis) is better suited for the analysis of particle placement - nevertheless, VARBRUL is still used in variationist corpus-based research (cf., e.g., Biber et al. 1999: 273; van Hout 1995: 625; Crawford, personal communication).
14 In a way, one might even argue that if such criticism of probabilistic findings was taken seriously, not many studies of word order variation would be left since many (functionally oriented) studies have produced probabilistic noncategorical statements (e.g. 'the more often a referent has been mentioned before, the more probable is construction1') and topicality and/or givenness hierarchies (cf. above) where positions of noun phrase referents on these hierarchies are associated with different likelihoods of constructional choices without categorical predictions. If probabilistic statements in linguistics were to be totally rejected, all these results would have to be dumped, a somewhat undesirable result.
15 Cf., e.g., Abney (1996) for a discussion of (the merits of) probabilistic approaches in linguistics.
16 One might object to this conclusion by claiming that whatever the results of this study, they are 'only' a model of language performance or language production. Even if the present results were 'only' indicative of performance mechanisms (as was argued by a confirmed generativist, namely Fabrice Nicol from the University of Paris-III in personal communication), the analysis would still have shown many hitherto unknown facets of particle placement. What is more, it is
quite reasonable to assume that cognitive structures are to some extent, if not substantially, reflected in actual linguistic performance - the burden of proof clearly lies with those advocating the contrary position (i.e. that linguistic performance does not reflect any underlying principles of cognitive or linguistic organization).
17 Hawkins has acknowledged that STRESS needs to be incorporated more adequately than just by mentioning in passing that 'stressed pronouns are not subject to the grammaticalization effect' (1994: 182) and now assumes that 'there is some phonological weight analogue to syntactic weight going on here' (personal communication), which is as yet still too vague to be useful.
18 I am referring to the correlations between complexity, particle placement and the 'epiphenomenal' discourse-functional variables.
19 Note that this strength of the causal relationship need not match the height of the correlation coefficients since correlation coefficients are blind to whether they measure a causal or some other relationship.
20 Note that Hawkins (1994) is exclusively based on written data; cf. Hawkins (1994: 452-3).
8 The activation of constructions1
8.1 Theoretical introduction
The preceding chapter ended by pointing at two shortcomings of the otherwise so successful processing approach. In this chapter, I shall therefore attempt to develop an alternative though largely compatible analysis of particle placement. To that end, note that I argued in section 1.3 that my approach is also related to the Competition Model of Bates and MacWhinney (1982, 1989) and their co-workers as well as interactive activation models (IAMs) of the sort proposed by Rumelhart and McClelland (1981), McClelland and Rumelhart (1982), Dell and Reich (1981), Stemberger (1982, 1985), Dell (1986) and MacKay (1987), to name some important sources (cf. also Lamb 1999 for a recent very comprehensive discussion). It is this kind of model, I submit, that will prove ultimately to be the most attractive solution to particle placement in particular and probably many cases of syntactic variation in general. The structure of this chapter will be as follows. Section 8.1.1 looks at these models on a very general level: I will briefly discuss several essential properties of one such model, the Competition Model,2 and illustrate how my previous methods and results fit nicely into this perspective. Section 8.1.2 discusses IAMs in much more detail and explains their relation to particle placement. Section 8.2 returns to the empirical results of Chapter 6 to see whether IAMs can accommodate the results similarly well. Section 8.3 relates the discussion of IAMs to the multifactorial findings of section 6.3, summarizes the findings accordingly and compares the explanatory power of the present analysis to that of Hawkins on the basis of a LISREL analysis. Section 8.4 concludes.
8.1.1 Particle placement and the Competition Model: general characteristics
The Competition Model postulates a two-level structure: one level representing meanings and intentions speakers may want to express and another level representing surface forms by means of which the meanings/intentions can be expressed. The relations between forms and functions are mostly
not one-to-one relations; rather, polysemous relations are the rule. Speech production is a process where the particular functions x, y, z (meanings/intentions) are mapped onto specific forms A, B, C (surface forms/expressive devices available in the language). These form-function mappings are as direct as possible, which results in (i) the language processor receiving mixed input of different kinds (segmental and suprasegmental phonological, morphological and lexical input as well as positional frames; cf. Bates and MacWhinney 1989: 40-1), and (ii) a 'homogeneity of processing across different data types' (Bates and MacWhinney 1989: 40). The 'currency' that enables these diverse interactive mappings is activation. Given the polysemous relations between forms and functions, it regularly happens at a specific point of time during production planning that different forms are being activated simultaneously and compete to be selected for production; in such cases of conflicting preferences, the grammar has to determine which form wins out and is produced. The relation of this model to the present question is probably all too obvious: the phenomenon of particle placement arises because the meaning the speaker intends to communicate can be expressed using (at least) two positional frames, namely construction0 and construction1. The advantage of this model over many other cognitive-functionalist approaches is that it incorporates deterministic all-or-none processes/decisions as well as context-dependent and probabilistic processes (cf. Bates and MacWhinney 1989: 13-14, 42); furthermore, its proponents have long argued for experimental and corpus-based approaches to qualitative and quantitative variation. Given these correspondences, it will be no surprise to see that many of the results discussed at length have straightforward correlates in the Competition Model.
For instance, the monofactorial results of section 6.1 correspond naturally to the concept of cue strength (and the ideally isomorphic notion of cue validity), which is
a quintessentially connectionist notion, referring to the probability or weight that the organism attaches to a given piece of information relative to some goal or meaning with which it is associated. In other words, cue strength is the weight on the connection between two units. (Bates and MacWhinney 1989: 42)
That is to say, each variable we have investigated above is a cue in the sense that its value/level is more or less strongly associated with a particular constructional choice, and the grammar has to determine which construction wins out. Put differently, we have determined weights of vertical associations between functions to be expressed and available (constructional) forms. However, in the majority of cases conflicting cues are present (cf. section 6.2). In such cases, cue strength alone is not decisive, especially since not all cues interact with each other in a linear fashion and absolute frequency information is of limited importance. Therefore, what is at issue here is the notion of the conflict validity of a cue, which is 'the number of competition situations in which that cue "wins" [...], divided by the
number of competition situations in which that cue participates' (Bates and MacWhinney 1989: 41-2). In sum, in the terminology of the Competition Model, Chapter 6 did the following: The impact of cue validity on performance [was] assessed by evaluating the overall variance accounted for by each independent variable, as well as the extent to which each variable contributes to determining the 'winner' in situations of competition and/or cooperation [...]. (Bates and MacWhinney 1989: 43)
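The definition of conflict validity quoted above translates directly into a count of wins over participations. The following is a minimal sketch with invented data: the situations, the cue names and the function name are hypothetical and are not drawn from the corpus analysis of Chapter 6. A situation counts as a competition situation here only if the cues present do not all favour the same construction, which is one possible reading of the definition.

```python
from collections import Counter

# Hypothetical situations: the construction (0 or 1) favoured by each cue
# that is present, and the construction the speaker actually produced.
situations = [
    {"cues": {"length": 0, "givenness": 1}, "produced": 0},
    {"cues": {"length": 0, "givenness": 1}, "produced": 0},
    {"cues": {"length": 1, "givenness": 0}, "produced": 0},
    {"cues": {"length": 1, "givenness": 1}, "produced": 1},  # cues agree
    {"cues": {"length": 0, "idiomaticity": 0, "givenness": 1}, "produced": 0},
]


def conflict_validity(situations):
    """Wins divided by participations, over competition situations only."""
    wins, participations = Counter(), Counter()
    for situation in situations:
        favoured = set(situation["cues"].values())
        if len(favoured) < 2:
            continue  # all cues agree, so there is no competition
        for cue, construction in situation["cues"].items():
            participations[cue] += 1
            if construction == situation["produced"]:
                wins[cue] += 1
    return {cue: wins[cue] / participations[cue] for cue in participations}


validities = conflict_validity(situations)
for cue, value in sorted(validities.items()):
    print(f"{cue}: {value:.2f}")
```

With these invented data, the length cue prevails in three of its four competition situations, whereas the givenness cue prevails in only one of four.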
But let us now look at the relation between IAMs and particle placement more precisely.3
8.1.2 Particle placement and IAMs: the details
In what follows, I will briefly discuss the main characteristics of such IAMs, relying most heavily on Stemberger (1985) and Dell (1986).4 Then, I will show how an approach on the basis of IAMs fully explains particle placement, but is embedded into a comprehensive theory avoiding the problems hinted at at the end of Chapter 7. Note that the main evidence that IAMs were originally based upon comes from speech errors and letter perception. That is to say, although I will show later that IAMs lend themselves very well to the description, explanation and prediction of particle placement, they are not motivated by cases of syntactic variation in the first place (which in turn decreases the possibility of circular reasoning). IAMs incorporate only two kinds of entities: nodes and links (between nodes). Each node in the network represents a linguistic unit (say, a word, a morpheme, a phoneme, a phonemic feature) and is connected to other nodes via links. A speaker's knowledge of his language is assumed to be organized in different hierarchical levels (semantic/conceptual, syntactic, lexical/word and phonological), and the nodes representing the linguistic units are organized analogously. Thus, each node within a particular level is connected to other nodes within the same level as well as within neighbouring levels; the connections are mainly based on the frequency of contemporaneous activation and on the sharing of features (e.g. if [+animate] is activated on the semantic/conceptual level, then all words denoting animate entities will be activated to some degree, a pattern that explains why speech errors often involve the substitution of semantically related words; cf. Levelt 1989: 218-19). Nodes are processing units that receive/transmit activation from/to their neighbouring nodes via links. In the absence of activation flow, each node exhibits a particular resting level (i.e.
a baseline activation corresponding to an a priori bias or probability of activation). Whenever a node X receives activation from a neighbouring node via the link connecting the two, the incoming activation stimulates node X so that it fires and thereby sends activation proportional to its own degree of activation to all other nodes it is connected to, increasing their activation level, i.e. exciting them. Three
things, however, are important in this connection. First, the activation sent out by X to some other node Y need not suffice to activate Y fully: very often, the degree of activation of Y has to summate given the activation of many of its neighbouring nodes. Second, the amount of activation passed from X to Y is also dependent on the strength of the link between X and Y: if X is highly activated, but passes activation to Y across a very weak (i.e. rarely used) link, then Y will receive less activation than if the link were stronger. Third, activation does not only spread across nodes within a single level of nodes (e.g. between nodes within the word level), it can also spread between adjacent levels and in both directions (e.g. upwards from the word level to the syntactic level and downwards from the word level to the phonological level).
Once a node has fired (upon having received enough activation), its activation level decays rapidly (due to self-inhibition7), going even below the resting level for a short time, before the activation rebounds (through feedback from neighbouring nodes just activated) slightly above the resting level (a phase called the hyperexcitability phase, where the node can easily be activated again); if no other activation follows, the activation returns to the resting level again. This time course of activation served to explain various psycholinguistic phenomena, for instance misspelling patterns by dysgraphics and slips of the typewriter (cf. MacKay (1987) for a more comprehensive overview). Since the model assumes a parallel activation flow across levels, representations on different levels are constructed quasi-simultaneously (while representations on lower levels depend on the construction of corresponding representations on higher levels). Finally, the activation of a node is not only a function of the previously discussed activation processes.
The system does not always work perfectly, as is evidenced by various kinds of performance errors, tip-of-the-tongue states, etc. The activation of nodes is also influenced by what Stemberger (1985: 150-1) and Berg and Schade (2000) call background noise. Following Dell et al. (1997: 805), noise consists of intrinsic noise (normally distributed within the system and resulting from random variation of nodes' resting levels) and activation noise (positively correlated with a node's degree of activation at some point of time).
To reiterate, in the model presented here, the degree of activation A of some node Y at time step t (i.e. A(Y, t)) hinges on:
• Y's resting level (depending on the frequency with which the node has been activated before, i.e. A(Y, t-x)) and Y's degree of activation A at time step t-1 (i.e. A(Y, t-1));
• the degree of activation of all k neighbouring nodes N1 to Nk of Y;
• the weights (i.e. strengths) of Y's k connections to its neighbours W1 to Wk (each depending on the frequency with which the link between Y and Wx has been transmitting activation);
• the speed with which Y's activation decays due to self-inhibition (q);
• (intrinsic and activation) noise in the system.
Dell et al. (1997: 805) represent this by means of the following function (slightly adapted to fit the above symbols), parts of which are central for the argumentation to follow:15
(86) $A(Y, t) = A(Y, t-1) \cdot (1 - q) + \sum_{i=1}^{k} \bigl(W_i \cdot A(i, t-1)\bigr) + \text{noise}$
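The function in (86) can be made concrete with a short sketch. The following Python fragment is only an illustration under stated assumptions: the function name, the argument names and all numerical values are hypothetical, and the noise term is simplified to a single zero-mean Gaussian draw rather than the separate intrinsic and activation noise described above.

```python
import random


def update_activation(prev, q, weights, neighbour_prev, noise_sd=0.0, rng=random):
    """Return A(Y, t) for one node Y, following the form of equation (86).

    prev           -- A(Y, t-1), Y's activation at the previous time step
    q              -- decay rate due to self-inhibition (0 <= q <= 1)
    weights        -- link weights W_1 .. W_k to Y's k neighbours
    neighbour_prev -- the neighbours' activations A(i, t-1), in the same order
    noise_sd       -- ASSUMPTION: noise is one zero-mean Gaussian draw
    """
    if len(weights) != len(neighbour_prev):
        raise ValueError("need one weight per neighbouring node")
    decayed = prev * (1.0 - q)  # A(Y, t-1) * (1 - q)
    spread = sum(w * a for w, a in zip(weights, neighbour_prev))  # sum of W_i * A(i, t-1)
    noise = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
    return decayed + spread + noise


# Hypothetical values: one strongly and one weakly linked neighbour, no noise.
a_t = update_activation(prev=0.5, q=0.4, weights=[0.2, 0.1], neighbour_prev=[1.0, 0.5])
print(round(a_t, 3))  # 0.55
```

With noise switched off, the result is simply the decayed previous activation (0.3) plus the weighted input from the two neighbours (0.25).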
A(Y, t), in turn, determines which elements are actually produced by the speaker. The system regularly inspects the activation levels A(Y, t) of competing nodes on each level and selects the most highly activated nodes on each level. If, e.g., the word John is to be uttered, then several phonemes will be activated simultaneously, including the correct one ([dʒ]). If everything works out as intended, then, at the point of time the system selects a phoneme for motor encoding, [dʒ] will be the most highly activated phoneme; if, on the other hand, some other phoneme is more active at the point of time of selection, a speech error will result.51
After this fairly general introduction into the entities figuring in IAMs, let us now look at details of sentence production in such a model and also cover some more details of IAMs by means of an example. Let us assume a speaker has the intention to formulate a sentence describing a state of affairs where books that had been lying on a table were taken by an individual (known to the speaker as John) into his hand and lifted up; this could be denoted by a literal TPV, namely either John picked up books or John picked books up, or by John lifted books. Nodes on the conceptual level representing the concepts John and pick are activated. In a first step, their activation has several simultaneous consequences. First, their neighbouring nodes on the conceptual level (e.g. the nodes for lift (v.) and/or raise (v.)) receive a jolt of activation (due to their sharing some semantic nodes with those representing John and pick). Second, the lexical (word) nodes for John and pick up (which we might call JohnL and pick upL) also receive activation; the latter activates in turn the individual parts of the TPV. Finally, and most importantly, the intention to denote the conceptualized scene described above (rather than, say,
question it) also activates the nodes representing several syntactic configurations of declarative clauses that fit the semantics and pragmatics of the intended utterance. As Stemberger (1982b: 318) put it (cf. also Stemberger 1985: 148-50), the problem the production system tries to solve is 'locating the phrase structure that is most appropriate given the semantics'.12 Among the phrase structures that are activated next are transitive sentences (for, say, John lifted books) and configurations with TPVs, which we might, for simplicity's sake, denote here on the basis of non-committal phrase structure configurations: [S [NP N] [VP V Prt [NP]]] (for construction0) and [S [NP N] [VP V [NP] Prt]] (for construction1). Consider Figure 8:1 for a representation of the simultaneous processes just described. The diagram is organized into four levels (a conceptual, a syntactic, a word/lexical and a phonological level) and arrows stand for excitatory activation.
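The simultaneous processes of this first step can be sketched as one round of spreading activation over a small network. This is a minimal illustration, not the model's actual implementation: the node names, the links and every weight are hypothetical, and the intention to describe the scene is simplified to links leading from the concept node for pick to the three syntactic frames.

```python
# ASSUMPTION: node names, links and weights are hypothetical and only mirror step 1.
LINKS = {
    ("concept:JOHN", "word:John"): 0.8,
    ("concept:PICK", "word:pick up"): 0.7,
    ("concept:PICK", "concept:LIFT"): 0.3,     # semantically related concept
    ("concept:PICK", "syntax:V Prt NP"): 0.5,  # particle precedes the direct object
    ("concept:PICK", "syntax:V NP Prt"): 0.5,  # direct object precedes the particle
    ("concept:PICK", "syntax:V NP"): 0.3,      # plain transitive frame (John lifted books)
}


def spread_once(activation, links):
    """One synchronous step: each node passes weight * its own activation along each link."""
    new = dict(activation)
    for (source, target), weight in links.items():
        new[target] = new.get(target, 0.0) + weight * activation.get(source, 0.0)
    return new


def most_active(activation, level):
    """Select the most highly activated node on one level (names are prefixed 'level:')."""
    candidates = {n: a for n, a in activation.items() if n.startswith(level + ":")}
    return max(candidates, key=candidates.get)


step1 = spread_once({"concept:JOHN": 1.0, "concept:PICK": 1.0}, LINKS)
print(step1["syntax:V Prt NP"], step1["syntax:V NP Prt"])
print(most_active(step1, "word"))
```

In this toy network both configurations with TPVs are equally active after the first step, so the sketch leaves the choice between them open, just as the account above does at this stage.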
Figure 8:1 Step 1 of the generation of the utterance John picked up books/books up
In a second step, again several things happen fairly contemporaneously. First, the conceptual nodes representing John and pick up receive some feedback activation from the lexical nodes JohnL and pick upL as well as pickL. Second, the activated syntactic rules representing syntactic configurations have activated the highest nodes (S for sentence), which now pass on activation to the required nodes in order to generate slots to be filled by lexical elements. Since, in this case, the transitive construction and both VPCs require a slot for a sentence-initial NP,14 the S node activates the NP node and the N node faster than the VP yet to follow (picked up books/picked books up); cf. Stemberger (1985: 150). While the noun for which the slot is eventually intended is JohnL, many other nouns will be activated by the N node, too (e.g. John's brother BillL, etc.). Third, JohnL, BillL, pickL and other nodes (such as liftL, raiseL, etc.; cf. above) pass on activation to the phonemic level, increasing the activation of the phonemes, thereby preparing pronunciation. Consider Figure 8:2 for a diagrammatic representation of these processes.
Step three witnesses, among other things, the selection of a word. When the phonemes for JohnL are activated, [dʒ], [ɒ] and [n] will provide
Figure 8:2 Step 2 of the generation of John picked up books/books up/John lifted books
feedback to the word level and will again reinforce the activation of JohnL, while also partially activating phonologically similar words like gone. At the point of time the system inspects the activation levels of all nouns currently activated (because the current syntactic node is an NP and no production of a determiner has been 'scheduled'), it selects the most highly activated node representing a noun and integrates it into the N and NP slot currently processed at the syntactic level. Put differently, 'selection entails the linkage of a word to a slot in a syntactic frame' (Dell et al. 1997: 806); cf. Figure 8:3. After this selection of the noun and its integration into the syntactic frame, the production processes at the lower levels (word/lexical level, phonological level, etc.) proceed without further delay ('[e]xecution at the syntactic level entails immediate execution at all lower levels, leading directly and immediately to motor encoding' (Stemberger 1985: 150)). Upon this, the word node provides feedback activation to the syntactic level so that the NP node self-inhibits and the next syntactic node (VP) can be activated in due course. Note also that, during all of these processes, the system already processes the upcoming items on the conceptual and syntactic levels.
The process of selecting the verb proceeds in a similar manner to that of selecting the sentential subject. Let us first look at this process in general before we tie it to our example John picked up books/John picked books up. In general, after the selection of the NP node the syntactic VP node of the verb becomes the current node and several lexical verb nodes (representing verbs sufficiently compatible with the intended meaning) compete until one of them gets selected (i.e. linked to the appropriate syntactic slot, i.e. V in the VP) and scheduled for production. In our case, however, verb selection is more complicated. First, the choice of the syntactic configuration (e.g.
a transitive sentence [John lifted books], construction0 [John picked up books] and construction1 [John picked books up]) is highly related to the choice of the verb. In the words of Dell (1986: 316):
Figure 8:3 Step 3 of the generation of the utterance John picked up books/ books up
Although it is clear that semantic and pragmatic considerations should guide the syntax, it is less clear that the activation levels of word nodes should, as well. There are, however, some experimental findings that indicate that the retrievability of a particular word affects the syntactic structure of the spoken sentence (see Bock 1982 for review). To model these effects, the activation levels of word nodes will have to be taken into account by the syntax.
In our case, if the nodes for the three possible syntactic configurations from above were partially active and the verb node for lift was activated most, then this should raise the probability of the transitive configuration being selected over the VPCs (barring, for the sake of the argument, the lift of lift up). If, however, upon a high degree of activation of pickL, pick is selected and scheduled for output, the system must also decide on a syntactic configuration of the VP. It is here that it becomes essential to examine the time course of activation precisely. Consider Figure 8:4. When VP is the current node on the syntactic level, several things happen. On the lexical/word level, activation summates in pickL until it becomes selected; upon selection, pickL sends feedback activation upwards to the V node in the VP and sends activation downwards to initiate phonological and motor encoding. Also, the upcoming concepts (e.g. book and up) start activating their lexical nodes (i.e. bookL and upL respectively). In the next step (i.e. during the summation of activation in phoneme nodes for pick), activation also summates in the nodes booksL and upL. Whichever node has accumulated enough activation by the time the activation levels are inspected gets selected. Assuming that booksL was selected, the node booksL sends out (i) activation to the phonemic level to instigate motor encoding, and (ii) feedback activation to the VP where the noun precedes the particle, i.e. the currently partially activated node of construction1. It is this feedback from the selected lexical node that determines which constructional node is activated (cf. Dell's (1986) arguments quoted previously at length). That is, if bookL has been selected, feedback activation can spread up to the N, NP and VP nodes of construction1, but not to the VP
Figure 8:4 Step 4 of the generation of the utterance John picked up books/books up
Figure 8:5 Step 5 of the generation of the utterance John picked books up
node of construction0, because the particle of construction0 does not receive activation (cf. Figure 8:5). If the system selects a morphologically complex TPV to begin the VP, then some mechanism must be incorporated into the theory in order not to 'forget to insert the particle' at a later stage. This is additional prima facie evidence to stipulate constructional nodes (cf. n. 9): if it were not constructional nodes that, in our example of construction1, 'bear in mind' that a particle is yet to follow, one would have to make up some other device serving that purpose (such as morphological integration nodes [MI nodes]; cf. also section 8.2.5 for empirical evidence strongly supporting this hypothesis). With the present approach, the feedback from the lexical level to the constructional level decides on a construction and 'remembers' the particle; cf. the arrows pointing downwards to Prt of construction1 in Figure 8:5.
After the selection of the verb and the following constituent (e.g. booksL), the system proceeds with the VP-internal structure in a way similar to that outlined above for the subject and verb. Since the VP node representing construction1 has already received feedback from the NP node, the final node of this VP to be activated and related to a lexical node is that of the particle. Again, several lexical nodes (particles) will simultaneously compete with each other, and the one having the highest degree of activation at the point of selection will win out and become selected; cf. Figure 8:6. This process is reiterated similarly for all remaining words.18
While the preceding discussion has already explained IAMs with respect to particle placement, one still needs to show that the variables' values that we have found to correlate with construction0 and construction1 also correlate with high degrees of activation of the nodes of the particle and the direct object respectively and the corresponding constructional nodes/syntactic frames.
Obviously, we are not in a position to measure the degree of activation of nodes, which is unobservable. However, if other studies provided independent evidence for the degree of activation of some constituent in some discourse situation in the direction we have observed in the data, then the IAM provides an independent non-ad-hoc explanation of particle
Figure 8:6 Step 6 of the generation of the utterance John picked books up
placement. For instance, we have seen in section 6.1.2 that previously mentioned referents of direct objects correlate with construction1. If it were possible to show that such referents are highly activated for reasons independent of the principles governing particle placement, then we would have a non-circular argument in favour of integrating this variable into the IAM analysis proposed here. In the following sections, I will show how the variables' influence on particle placement can be explained in the IAM outlined above.
8.2 The relation of variables to activation
8.2.1 Discourse-functional variables
The discourse-functional variables concerning the preceding discourse are very straightforwardly related to matters of activation.19 As to the variable LM, if the referent of the direct object has been mentioned in the preceding discourse, the node for this referent will be more active than that of a referent that has not been mentioned previously. This follows logically from properties of IAMs mentioned previously, namely the fact that the degree of activation A of a node Y at time step t, A(Y, t), is a function of Y's previous level of activation. When the node representing a particular referent has been activated during the previous discourse, its activation will, after the short phase of self-inhibition, be above resting level (hyperexcitability phase) and thus be more likely to be activated again as compared to cases where there was no previous mention of this referent. Thus, according to the activation model, we would expect to find previously mentioned referents more often in construction1. This is exactly what we have found, so our prediction about the role of LM in terms of activation is borne out.
The line of reasoning for the other functional variables concerning the preceding context is virtually identical. If the referent of the direct object has been mentioned very often before (i.e.
high values of TOPM), then it is more highly activated (and/or easier to activate again) for the reasons just given and, according to the IAM, more likely to be produced early. Again,
this prediction is supported by our corpus data. The same explanation can naturally be extended to ActPC: short distances between the last mention of the direct object's referent and the VPC (i.e. high values of ActPC) should result in a higher activation of the corresponding node since the node's activation has not yet completely fallen back to the resting level. This, in turn, leads us to expect the preference for construction1 we found.
Finally, let us turn to CohPC. In this case, no direct reference to the direct object's referent is necessary, so how do we relate this aspect of CohPC to activation? In IAMs, a semantic concept does not only activate a single word node; rather, a single semantic concept activates several conceptually related words simultaneously (cf. above). For instance, a semantic concept X activates the lexical node of X and the nodes of hyperonyms, hyponyms, homonyms of X and other semantically related words. Thus, even if a particular word node has not been activated by the corresponding semantic concept, this word node might nevertheless be active due to the excitatory activation it has received from, say, one of its hyperonyms. In that case, it would be cohesive to the preceding discourse, so construction1 would be expected; since this is supported by the above results, the activation account is borne out.
In sum, all of the discourse-functional variables concerning the preceding context investigated in the previous chapters and their relation to particle placement can be explained with reference to IAMs. Given that the time course of activation (including self-inhibition and hyperexcitability) has been motivated before independently, the IAM account of particle placement presented in section 8.1 receives independent empirical support.
As to the variables concerned with the subsequent context, they cannot be related to the activation processes occurring during the selection of the VPC since their values/levels occur temporally after the choice of construction has been made, i.e. when all activation processes determining the choice of construction have ceased. Put differently, according to the activation model proposed here, no effect of these variables is to be expected; this claim was clearly borne out by the data; cf. section 6.1.4. This last point is of particular importance as it shows that the activation model can not only explain why the relevant discourse-functional variables have the impact on the choice of construction they have, but can also be used to predict which of the variables suggested in the literature should not have an influence on the choice of construction.
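The reasoning about the distance to the last mention can be illustrated with a small sketch. It rests on assumptions that are not part of the account above: activation is taken to decay geometrically at rate q towards the resting level, the brief undershoot and rebound after firing are ignored, and the function name and all default values are hypothetical.

```python
def residual_activation(distance, peak=1.0, resting=0.1, q=0.3):
    """Activation of a referent's node `distance` time steps after its last mention.

    ASSUMPTIONS: geometric decay at rate q towards the resting level; the short
    self-inhibition undershoot and the rebound are left out; values are hypothetical.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return resting + (peak - resting) * (1.0 - q) ** distance


# The shorter the distance to the last mention, the more activation is left.
for d in (1, 5, 20):
    print(d, round(residual_activation(d), 3))
```

Under these assumptions a recently mentioned referent still lies well above its resting level, whereas after a long distance its node is practically back at rest, which is the pattern the argument for ActPC relies on.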
8.2.2 Semantic variables
The semantic variables' impact on particle placement can also be connected to activation in an attractive way that is supported by several observations independent of the current focus of interest. It was already mentioned in the introductory section on IAMs that complex forms pose a particular challenge to IAMs. The ensuing discussion starts out from the variable IDIOMATICITY, which is strongly connected to the issue of complex forms.
On the basis of speech error evidence, IAMs were argued to represent knowledge of complex forms in a way that is, at least from the point of view of parsimony, not ideal: morphologically complex forms (such as idiomatic expressions) that can be analysed as being formed on the basis of a regular rule are nevertheless listed as individual complex forms, thereby introducing an element of redundancy (cf. Stemberger 1985: 172-7). These forms are acquired as lexical units (form-meaning pairing) and given an analytic representation. For production, the conceptual node activates the lexical unit node for the morphologically complex form, which in turn jointly activates the component parts/nodes into which it can be formally analysed. There is some experimental evidence from studies on Dutch particle verbs that seems to support this approach: Schreuder et al. (1990) found that the mental lexicon of Dutch native speakers seems to have a full listing of decomposed entries for separable particle verbs, i.e. separate entries for verbs and particles (together with MI nodes linking both of them whenever they can form a separable complex verb).
But how exactly does this relate to particle placement in English? The meanings of idiomatic TPVs are not compositional: the verb and the particle do not each contribute an independent component to the overall meaning of the VPC, although an analysed representation of the phrasal verb might also be available. Therefore, both the form and the meaning of the idiomatic TPV are acquired and stored together as a unit, which in turn entails that once such a verb is needed for an utterance, it will be accessed as a whole along Stemberger's lines: the conceptual nodes activate several syntactic configurations compatible with the content to be conveyed as well as the lexical unit node representing the idiomatic phrasal verb (e.g. eke outL), which, in turn, activates the syntactic component parts (V and Prt, that is) it requires.
The factor that is responsible for the close relation between idiomatic TPVs and construction0 is that the idiomatic phrasal verb is stored as a unit and, thus, does not favour independent access of its two component parts: when the lexical unit node is activated, its component part nodes receive a stronger jolt of activation than the component parts of compositional TPVs. The justification for assuming that this jolt of activation is stronger is their unit status, i.e. the fact that the non-compositional semantics of the TPV have led to the storage of a complex form. This in turn motivates the quasi-joint selection of the two lexical items and results in strong feedback activation of the syntactic slots V and Prt respectively; as a result, construction0 is produced.
With literal phrasal verbs, the situation is somewhat different. As was already discussed in section 4.2, the meaning of literal TPVs is compositional and commonly denotes a process where the referent of the direct object undergoes a movement process, the path or end of which is denoted by the particle. In Talmy's (1985) terminology, the action denoted is decomposed into the meaning contribution of the verbal nucleus and the additional semantic feature contributed by the particle (cf. Chapter 4, n. 8). In other words, while literal TPVs may well also have a node for the
complex form (cf. the representations above), the fact that both meaning components are easily isolable licenses separate access of both verb and particle. Put differently, the particle can be accessed separately for the production of an utterance and is, as a separate lexical entry with its own semantic contribution, not so much constrained by the degree of activation of the verb with which it happens to make up a TPV. Thus, with literal and metaphorical phrasal verbs the IAM predicts that there will be no strong preference for construction0 (at least on the grounds of the variable IDIOMATICITY), which is what we found.
If the above analysis of the production of TPVs is correct, what implications would that have for comprehension? The processes would have to work the other way round: the listener hears the verb, upon which all verb nodes with this verb (simplex, phrasal, prepositional, phrasal-prepositional, etc.) are (partially) activated. There is indeed some experimental evidence for this process: in experiments with German native speakers, Hillert (1998) found that verb stems (such as geben) which can be continued as several different separable particle verbs (e.g., idiomatic: aufgeben vs. literal: weitergeben) prime both verb meanings irrespective of the immediate context, supporting the redundant representations of complex verbs assumed by Stemberger and the activation processes described previously. Thus, the analysis of the production of idiomatic TPVs receives some independent evidence from comprehension studies.
Let us look at CONCRETE. Concrete referents tend to occur in construction1 (i.e. early in the sentence), whereas abstract nouns tend to occur in construction0 (i.e. late). In order to integrate this finding of the present study into the IAM analysis, we need to show and explain this tendency on independent grounds.
To that end, recall that, in IAMs, linguistic elements are argued to occur early in sentences if their level of activation A(Y, t) is high enough to be selected early, while linguistic elements occurring late(r) are less active at early stages of production of an utterance. Since A(Y, t) is determined by A(Y, t-x) or Wx, we need to show that there is reason to assume that, with nodes of concrete referents, either A(Y, t-x) or Wx is generally higher, thereby leading to an increase of A(Y, t) at the point of time t when the system inspects the activation levels of nodes for selection. One reason is that concrete concepts are, in general at least, acquired earlier as well as manipulated and spoken about more often (entailing that their nodes and the links connecting to these nodes are activated more frequently), thereby increasing their resting levels. This increase of the resting level in turn results in their need of less activation in order to reach a level high enough to be eligible for selection. The converse holds for abstract referents: they are acquired later, spoken about less often and never manipulated, so their nodes are active less often and their resting levels are not likely to facilitate high activation levels.
There is some experimental evidence supporting this way of treating CONCRETE in IAMs. As was briefly mentioned in section 4.2, Bock (1982: 17, 20-1) summarizes a number of experimental studies showing convincingly
that concrete referents of words as well as sentences containing concrete referents are recalled more often and more reliably than abstract referents. Since, in IAMs, recall of referents corresponds to repeated activation of nodes representing these referents, these findings support the claim that the resting levels of nodes of concrete referents are in general higher, thereby facilitating activation (and early production). Since this is exactly what the IAM account of CONCRETE needed as support, the interpretation of my findings appears strongly supported on independent grounds.
Finally, let us turn to ANIMACY. In principle, ANIMACY can be related to activation along the same lines as CONCRETE. On the basis of the findings reported in Bock (1982: 17, 20-3), one might suggest that animate referents are easier to retrieve and, thus, easier to activate than inanimate referents. However, while this is inherently plausible, the empirical findings suggest otherwise: ANIMACY had only a very limited influence on particle placement (cf. Table 6:10). Nevertheless, this need not be taken as evidence against the activation model: the limited impact of ANIMACY on particle placement is probably due to the fact that ANIMACY influences phrase order only in conjunction with changes in the semantic role configuration (cf. McDonald et al. 1993: 202) and, thus, not in the case of particle placement.
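The role of the resting level in the account of CONCRETE can be shown with a toy calculation. The sketch below assumes, for illustration only, that a node becomes eligible for selection once its activation reaches a fixed threshold and that it receives the same input at every time step; the function name, the threshold and all values are hypothetical.

```python
def steps_to_threshold(resting, input_per_step, threshold=1.0, max_steps=10_000):
    """Count time steps until summed activation reaches a hypothetical selection threshold."""
    if input_per_step <= 0:
        raise ValueError("input_per_step must be positive")
    level, steps = resting, 0
    while level < threshold:
        level += input_per_step  # activation summates step by step
        steps += 1
        if steps > max_steps:
            raise RuntimeError("threshold not reached")
    return steps


# Same input per step, different resting levels (illustrative values).
print(steps_to_threshold(resting=0.5, input_per_step=0.25))  # 2
print(steps_to_threshold(resting=0.0, input_per_step=0.25))  # 4
```

With identical input, the node with the higher resting level reaches the threshold in half the number of steps, which corresponds to the claim that nodes of concrete referents need less activation to become eligible for selection.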
8.2.3 Morphosyntactic variables
Among the most powerful variables are LENGTHW, LENGTHS and COMPLEX. I see two ways in which they can be related to activation. First, these morphosyntactic variables are related to previously discussed discourse-functional variables along the lines discussed in section 4.3. Information that has been mentioned more or less often is related to activation levels in a way argued for in section 8.2.1 and is in general encoded with a smaller or larger amount of linguistic material respectively. This is also the case in the present data set: the correlations between LENGTHS, LENGTHW and COMPLEX on the one hand and LM, ActPC, TOPM and CohPC on the other are negative and highly significant without a single exception. According to this argument, there is, therefore, at least an indirect relation between length and complexity and particle placement. However, one might raise the question whether this is sufficient to motivate the strong correlations we have found, and I am in fact not completely sure it is. Consequently, there could also be another explanation in terms of activation.
Let us return to the process of producing an utterance after the selection of the verb, i.e. Figure 8:4. The concepts to be referred to in the remainder of the utterance compete for the next syntactic slot. Given the input from the conceptual system, the activation of the lexical nodes of the particle and the elements making up the direct object noun phrase summate. As the activation levels of these lexical nodes increase, so do the corresponding constructional nodes on the syntactic level (by spreading activation). But
while activation of the syntactic node of the particle can summate fast (since there is only a single lexical node for the particle), the activation level of the syntactic NP node may behave differently. If the NP is very long and complex, then all the lexical concepts belonging to the direct object NP are activated sequentially. It will, thus, take longer to activate the syntactic NP node sufficiently for selection (as more information needs to be integrated into an increasingly complex direct object NP) than it would take if the direct object were very short and simple (e.g. one word), where the lexical node can immediately increase the activation level of a single NP node. In the words of Bock, 'representations with less information will finish the retrieval process faster' (1982: 31). That is, '[i]f some subcomponent of the interfacing representation is entered into lexical processing before other subcomponents, so that its semantic and phonological processing will tend to be completed earlier, its associated syntactic productions will also be activated earlier' (Bock 1982: 30).
This ties in with some of the observations made in Chapter 6. For instance, construction0 is already preferred with direct objects consisting of more than two words. One might wonder whether the activation model would not predict that particles precede the direct objects even if the latter consist of only two words. But a closer look at the 132 cases of two-word direct object NPs shows that nearly 90 per cent of them consist of a noun preceded by a determiner. These determiners are, as function words and given their overall frequency in virtually every piece of discourse, very easily activated and, additionally, two-word NPs are obviously not very complex, so no construction and activation of syntactic constituents within the direct object NP, which might delay the overall increase in activation, are necessary.
Once, however, the NP becomes more complex such that additional modification leads to an increase of content words or even of syntactic nodes (of-constructions or embedded clauses), construction0 is strongly preferred, namely in more than 83 per cent of the analysed cases.
As to the variable DET, in an IAM I would propose that the choice of determiner is not causally, but only indirectly, related to the choice of construction. Rather, as has been proposed throughout the functional literature, the choice of the determiner is, on the whole, contingent on the degree of givenness of the referent of the ensuing noun (along the lines of my argument in section 4.3). Thus, the degree of activation of the noun is responsible for both the choice of the determiner and the choice of construction; the statistical correlation between DET and particle placement does not result from any causal influence of DET.
Finally, the variable TYPE can also be related to the notion of activation along the line of reasoning in section 4.3. Pronouns are exclusively used for given referents. It has already been argued that, therefore, their nodes have been frequently and/or recently activated in the discourse preceding the VPCs. Thus, their activation levels are already fairly high or just in the hyperexcitability phase where repeated activation is expected so they are
very likely to be selected at the point of time where the VPC is being processed. If the pronominal direct object can then be activated easily and fast, it is more likely to be selected when the system inspects the activation levels, so construction1 has a greater chance of being used. Note that the activation of pronominal referents seems to be even higher than the activation that particles of idiomatic VPCs receive from their unit node, since even with the most idiomatic constructions, construction1 is obligatory with pronominal direct objects; cf. (87).
(87) a. *He eked out it.
     b. He eked it out.
Lexical nouns, by contrast, tend to be used with brand-new or unused referents (whose activation levels are low). When it comes to selecting the construction, the activation of their nodes is thus too low, rendering these nodes unlikely to be selected early.
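As a rough illustration of the timing argument made at the start of this section, the race between the particle node and the direct object NP node can be written out as a toy simulation. Everything numeric in it is an assumption made purely for illustration (the threshold, the gain per time step, the starting activation levels), and the function names are hypothetical; nothing here is estimated from the corpus data or taken from any existing implementation of an IAM.

```python
import math

# Toy sketch of the race for the slot after the verb.
# All numbers are illustrative assumptions, not estimates from the corpus data.
THRESHOLD = 1.0   # assumed activation level at which a lexical node is available
GAIN = 0.25       # assumed activation gained per time step


def steps_to_ready(start_levels):
    """Time steps until a syntactic node is ready, if its lexical nodes
    are brought to threshold one after another (sequential activation)."""
    return sum(math.ceil(max(0.0, THRESHOLD - level) / GAIN)
               for level in start_levels)


def construction_after_verb(particle_start, object_starts):
    """Return 1 (object first) or 0 (particle first).
    Ties go to the particle, which is an arbitrary assumption of this sketch."""
    particle_steps = steps_to_ready([particle_start])
    object_steps = steps_to_ready(object_starts)
    return 1 if object_steps < particle_steps else 0


# A pronoun with a highly activated (given) referent vs. a particle.
print(construction_after_verb(0.5, [0.75]))
# A determiner plus three content words with brand-new referents.
print(construction_after_verb(0.5, [0.75, 0.0, 0.0, 0.0]))
```

Under these assumed values, the pre-activated pronoun is ready before the particle, whereas the long NP with new referents is not. The sketch deliberately leaves out spreading activation between levels, decay and noise.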
8.2.4 Phonological variables
The only phonological variable that was theoretically included, though not empirically analysed, is the stress pattern of the verb phrase. At first glance, it seems quite difficult to relate the findings observed in the literature to the IAM elaborated above: since it is commonly argued that stressed items are more activated than unstressed ones (cf. MacKay 1971; Berg 1998: 107), one would expect stressed items to be produced early because insertion rules should select these highly activated items first. Previous works show, however, that stressed items (more precisely, contrastively stressed expressions) prefer sentence-final position in general. How do we reconcile this conflict? Possible answers to this question are along the following lines. It was previously argued that, in the first stages of generating an utterance, there is a non-linguistic representation of the information to be communicated. However, the information to be communicated does not only comprise the purely referential content, but also the communicative intention. For instance, different communicative intentions (such as, e.g., different speech acts) require the activation of different syntactic rules in order to achieve the desired communicative effect. In other words, the activation of syntactic productions is not only contingent on the particular semantic information to be communicated, but also on the pragmatics of the utterance. If, therefore, the speaker intends to strongly highlight the referent of the direct object or the denotatum of the particle, then the pragmatics of this utterance also differs from the standard utterance where no such emphasis/contrast is to be communicated. Therefore, just as the pragmatic intention of asking a question activates the syntactic nodes representing questions, the pragmatic intention of contrastively highlighting some referent/denotatum also triggers associated syntactic nodes.
Since end-focus is a very general characteristic of the English language, one may
expect that the two syntactic nodes representing construction0 and construction1 are more likely to be activated when the referent of the direct object and the denotatum of the particle respectively are to be contrastively stressed. However, one should also bear in mind that cases with contrastively stressed pronouns in construction0 are extremely rare.
8.2.5 Remaining variables
On the basis of the Processing Hypothesis it was previously argued that PP has no causal influence on the choice of construction. This also holds, I submit, for the role of PP within IAMs. Given that the directional adverbial follows the VPC, I believe that no causal relation between activation and PP can be assumed in order to explain the correlation between particle placement and PP, and that the explanation for this correlation is the one proposed in Chapter 6, n. 18, namely the correlation between particle placement and IDIOMATICITY, a variable whose relation to activation is quite apparent.
The variable DISFLUENCY can be naturally translated into a theory of activation. According to the above description of activation models, a case of disfluency can be characterized as a state where the system is not able to select any one entity for production. In actual speech, then, such failures of the system often manifest themselves as hesitations or filled pauses (e.g. er or um) serving to fill the silence and hold the floor until the next word is produced. Thus, although the present data show no strong correlation, findings of that ilk (e.g. the results of Arnold and Wasow 1996) can in principle be integrated into such an account without any problems.
The translation of PART = PREP into activation models is similarly straightforward. It was previously mentioned that the selection of a node A is followed by a sudden decrease of A's activation level to zero (the so-called refractory phase; cf. above).
While the activation level may rebound quickly, during this refractory phase A is highly unlikely to be selected again because its activation level is so low. Thus, once the particle has been selected for production, its activation level and, thereby, its likelihood of immediately repeated selection decrease. That is, if evidence for the relevance of PART = PREP had been found in the data, it could be easily accommodated by an interactive theory of activation.22
Finally, let us turn to a variable that has not been investigated in the main course of the analysis, since it has never been connected to particle placement, namely structural priming. In the above characterization of IAMs, it was argued that there exists a level/network of nodes of syntactic productions (in the form of syntactic frames with slots), some of which could be activated by a matching semantic node, even though this might constitute a violation of Occam's razor (cf. n. 9). Let us assume some speaker has produced a VPC in the form of construction1. After the selection of construction1, the activation level of the corresponding node first decreases rapidly and then rebounds due to spreading activation from the less strongly activated neighbouring constructional nodes on the same level in the system.
Consequently, after the short refractory phase, the constructional node's activation is well above resting level again (hyperexcitability phase), and the node is thus a likely candidate for repeated selection once a semantic representation is activated that is compatible with the semantics of the VPC. If, therefore, the nodes of construction0 and construction1 are activated again from the higher semantic level in the not too distant subsequent discourse (since they match the semantics of the utterance to be produced), then construction1 has a higher chance of being activated as its activation is still above the base level. My findings concerning structural priming reported in section 6.4, thus, come as no surprise once an IAM is assumed, but are difficult to explain in other theories based on processing (e.g. Hawkins 1994). However, structural priming also has more general ramifications for IAMs. Note that the influence of structural priming on particle placement cannot be explained unless we assume that there is a level/network of syntactic productions: if the choice of construction was only due to the selection of a lexical node (either the particle or the first word of the direct object), then there would be no way to account for the preference for repeating a construction within an IAM. This finding, thus, constitutes additional evidence for this level.
8.3 A network of variables and weighted (causal) relations
The last sections have shown how all the variables analysed so far can be interpreted within an IAM. Some of the variables' statistical correlations with particle placement were argued to result from these variables' causal influence on particle placement; other statistical correlations of variables with particle placement were explained by arguing that these variables do not causally influence the alternation but are in turn causally correlated with those that do.
Finally, one variable (structural priming) was discussed that most other theories would find hard to integrate. I will now point out several other characteristics of the present analysis that fit nicely with an activation-based approach and are more problematic for other accounts. First, it was previously argued that the selection of syntactic structure depends, among other things, on the association strengths of the links between semantic, syntactic and lexical nodes. The multifactorial analysis used in the present study allows for a simple way to relate the weights Wx of links between nodes to our empirical findings: we can simply interpret the factor loadings of the discriminant analysis (or the predictor strengths of the CART analysis, for that matter) in terms of (i) the strengths of vertical function-form associations as postulated in the Competition Model (cf. Bates and MacWhinney 1989: 49), or (ii) even as connection weights. For instance, Table 6:34 reported that for pronominal direct objects a factor loading of .499 was obtained, leading to a preference for construction1. For an IAM this would mean that TYPE: pronoun excites the node for construction1 (or, more precisely, the node for the NP in the syntactic VP-frame of the VPC) with a strength of .499; this strongly increases the
likelihood of this NP node being selected first, thereby increasing the probability of construction1. Second, it was previously explained that nodes (the two constructional nodes as well as all other nodes) have baseline activation levels that determine the ease with which a node can be activated. Our earlier multifactorial analysis of particle placement can easily accommodate this property of nodes since both discriminant and CART analyses require the user to specify a prior probability of the outcome of the dependent variable. For our phenomenon, this means the user needs to provide the general probabilities of construction0 and construction1. In the analyses reported in section 6.3.1, I set both prior probabilities to .5; i.e. I assumed that, on the whole, both constructions are equally likely to occur. Suppose, on the other hand, a corpus analysis had shown that the two constructions are distributed such that construction0 is nine times as frequent as construction1. For a discriminant or a CART analysis, we would therefore set the prior probabilities of construction0 and construction1 to .9 and .1 respectively in order for the analysis to reflect reality. Suppose further that a construction is generally selected once its activation level exceeds a strength of 1.0. In such a situation, the node for construction0 requires little extra activation (namely .1) whereas the node for construction1 requires a lot more extra activation (namely .9). Given these prior probabilities, the factor loadings computed by the analysis weigh the variables' contribution in such a way as to maximally correspond to the utterances by native speakers. In sum, the notion of baseline activation is, so to speak, inbuilt into the multifactorial procedures used previously. Finally, let us briefly consider the error rate of my prediction of the constructions. We have seen that the multifactorial analysis resulted in a comparatively high cross-validated success rate of 83.9 per cent.
In the processing account suggested previously, I demonstrated that the wrongly predicted cases are those that are characterized by variables' values/levels that do not possess a strong preference for one construction, so that no processing advantage is immediately obvious to the speaker and no immediately obvious choice of construction follows. Of course, the question arises as to how this error rate can be explained in an IAM. As was previously mentioned, part of the activation flow in the system at time step t is a certain degree of background noise within the system. At the time when the system inspects the activation levels of both constructional nodes, the background noise can influence the choice of construction in cases where the two constructional nodes have similar activation levels. Suppose again we have two constructional nodes C0 and C1 for construction0 and construction1 respectively. These nodes have, at time t where the system inspects the activation levels in order to activate a construction, activation levels of A(C0, t) and A(C1, t) respectively. When A(C1, t) is larger than A(C0, t) (i.e. when A(C1, t) - A(C0, t) > 0), then construction1 is activated; by contrast, when A(C0, t) is larger than A(C1, t) (i.e. when A(C1, t) - A(C0, t) < 0), then construction0 is activated. This scenario is reminiscent
of my discussion in section 7.1.1: for the present line of reasoning, each utterance's discriminant score indicating the choice of construction can be understood as the absolute difference A(C1, t) - A(C0, t) for that particular utterance (cf. Figure 7:1).23 If we now assume a degree of noise equivalent to an activation level of, say, .1, then, whenever -.1 < [A(C1, t) - A(C0, t)] < .1, the activation of a construction can be attributed to background noise. For instance, consider (88).
(88) I still wouldn't put weight on.
In this case, the discriminant score (i.e. A(C1, t) - A(C0, t)) is .05, which, in the absence of noise, would result in the activation of construction1 (as predicted). However, a noise of .1 could add to and increase the activation level A(C0, t) such that, then, A(C1, t) - A(C0, t) = -.05, resulting in the activation of construction0, although the regular activation flow in the system on the basis of the above-mentioned variables should have resulted in the activation of construction1. To give an idea of the impact noise might have, consider that, in the data investigated above, there are 17 cases where discriminant scores range from -.1 to .1, i.e. where noise of a strength of .1 might change the activation of a construction. Of these 17 cases, eight constructions were wrongly predicted, and an assumed noise of .1 might be partly responsible for the erroneous predictions. On the other hand, if the discriminant scores are high (i.e. ABS(A(C1, t) - A(C0, t)) > 1), then noise has little influence on the activation levels and, therefore, out of 211 such cases only six errors occur. This does not constitute irrefutable evidence of the concept and influence of noise in the system, but it is yet another piece of evidence that (i) fits nicely into the analyses in terms of IAMs, and (ii) is difficult to account for, or less elegant to incorporate, in competing theories.
In view of the complexity of the system of interrelated variables advocated in the previous sections of this chapter and the representations of activation flow between nodes, it would also be helpful to visualize this network of variables. Figure 8:7 shows how the variables and their relations to activation can be characterized in the form of a single diagram, representing the interconnections of variables on the basis of the arguments adduced in Chapters 2, 4, 6 and 8.
Arrows symbolize causal relationships (from a cause to an effect); broken lines symbolize correlational, but not causal, relationships. The figures represent the strength of the vertical function-form relation of the Competition Model (at least partly interpretable as weights of links between nodes) as measured by standard coefficients of correlation on the basis of my data. The horizontal axis at the bottom represents a timeline along which the above activation patterns occur.24 Since the present study pursued two goals, a linguistic one and a methodological one, this diagram represents, in a way, the summary of my pursuit of the linguistic objective, namely to provide a truly multifactorial description and explanation of particle placement.
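The baseline and noise arguments made earlier in this section can be written out as a small sketch. The selection rule based on A(C1, t) - A(C0, t), the threshold of 1.0, the noise strength of .1 and the case counts are the ones used in the running text; the function names are hypothetical, and treating noise as a fixed worst-case shift rather than as a random quantity is a simplifying assumption of this sketch.

```python
# Sketch of the selection rule with baseline activation and background noise.
# Names are hypothetical; the numbers are the ones used in the running text.

THRESHOLD = 1.0   # selection threshold assumed in the text
NOISE = 0.1       # noise strength assumed in the text


def extra_activation_needed(prior, threshold=THRESHOLD):
    """Activation a node still needs if its baseline equals its prior probability."""
    return threshold - prior


def selected_construction(score):
    """score = A(C1, t) - A(C0, t): positive selects construction1,
    negative selects construction0 (a score of exactly 0 is treated as 0 here,
    which is an arbitrary choice)."""
    return 1 if score > 0 else 0


def within_noise_band(score, noise=NOISE):
    """True if noise of the given strength could reverse the sign of the score."""
    return abs(score) < noise


def error_rate(errors, cases):
    """Share of wrongly predicted constructions."""
    return errors / cases


# Example (88): score .05; noise of .1 added to the activation of C0.
score_88 = 0.05
print(selected_construction(score_88))          # without noise
print(selected_construction(score_88 - NOISE))  # with noise favouring C0
print(within_noise_band(score_88))

# Error rates for scores between -.1 and .1 and for scores above 1 in absolute value.
print(round(error_rate(8, 17), 3), round(error_rate(6, 211), 3))

# Priors of .9 and .1 with a threshold of 1.0.
print(round(extra_activation_needed(0.9), 2), round(extra_activation_needed(0.1), 2))
```

On the reported counts, nearly half of the cases with scores inside the assumed noise band are mispredicted, against roughly 3 per cent of the cases with high scores.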
Figure 8:7 A network of variables with intercorrelations/association strengths (cross-validated prediction accuracy of the two constructions: 83.9 percent)
Given this representation, it would be desirable to decide whether this model is appropriate for particle placement in English, given our data. While we have seen that in general it is virtually impossible to decide on a particular cause-effect model on the basis of correlations alone, statisticians have developed techniques that are used for testing how well causal models conceived by the researcher fit the structure of the data investigated. These techniques come under different headings, although their overall line of reasoning is quite similar: path analysis, structural equation modelling or LISREL (for Linear Structural Relationships). In view of the amount and the complexity of the calculations involved in testing the above path diagram, causal modelling is, unfortunately, a statistical application for which a larger data set needs to be available. However, in order to provide an idea of what is possible once statistical techniques of such sophistication are used more often, let us briefly test one small subpart of the above diagram, namely the one that is needed to discuss/refute Hawkins's claim of the irrelevance of discourse-functional variables. In section 7.3, I argued that, on the basis of the correlations Hawkins (1994: section 4) presents, several cause-effect relationships are conceivable, which were then illustrated by Figures 7:3 through 7:5. In order to determine which of these diagrams is most appropriate for my data, I used the Structural Equation Modelling (SEPATH) module of Statistica 99, which is an extension of Jöreskog's LISREL. For Hawkins's EIC, I used the variables COMPLEX, LENGTHW and LENGTHS; for the discourse-functional variables, I used ACTPC, TOSM and COHPC; the results are given in Table 8:1. These results are to be interpreted as follows: GFI and AGFI are historically the most widely known indices for structural equation modelling.
They are goodness-of-fit indices representing how well the model fits the data: the value represents the percentage of variance of particle placement that can be accounted for by the model. The AGFI index is the more relevant of the two since it is adjusted for the complexity of the model and, thus, much more reliable. We see that the model argued for by Hawkins does not do well compared to the other two, which do not treat discourse-functional variables as merely epiphenomenal: the latter two models account for more variance of particle placement in the sample investigated. The two more modern Gamma indices (of which, again, the adjusted index deserves most attention) support this analysis by showing that we need not restrict this claim to

Table 8:1 Structural equation modelling results
Model        GFI    AGFI   Population Gamma Index   Adjusted Population Gamma Index   RMSR
Figure 7.3   .902   .805   .911                     .821                              21.1%
Figure 7.4   .917   .821   .925                     .837                              11.7%
Figure 7.5   .948   .879   .956                     .897                              6.3%

the sample investigated but can also extend it to the larger population: again the latter two models (where discourse-functional variables are assumed to have a [direct and/or indirect] causal influence on particle placement) fit the data better than the model derived from Hawkins's claims where only EIC is causally relevant. This is especially obvious from the rightmost column in Table 8:1: the RMSR value states how much variance in the data the respective model cannot account for, and we see that the model based on Hawkins's claims leaves about three times more variance unaccounted for than the subpart of my model in Figure 8:7. In a way, this is not really surprising: if, as my data suggest, the relation between the variety of variables postulated in my analysis is as represented in Figure 8:7, then an analysis singling out a few of these variables and claiming that one of them is solely responsible is bound to fare worse than analyses having a somewhat wider scope.
Is there any evidence supporting the sub-part model of Figure 7:5 (here repeated for ease of reference as Figure 8:8) other than the results of the structural equation modelling analysis? Yes, there is. First, it is obvious that the correlations we find between discourse-functional variables concerning the preceding context and the morphosyntactic variables only support the causal relationship represented in Figure 8:8 because the converse cause-effect relation would have to work backwards in time. That is to say, given the temporal unidirectionality of discourse, the morphosyntactic complexity of constituents C1 and C2 in some utterance U cannot influence the discourse status of C1 and C2 before U was even produced. That is, morphosyntax cannot be the sole cause for everything; something must be located temporally/causally before it, even if this something is then in turn constrained by morphosyntax. Second, in some respects one would even intuitively expect such a result.
If the morphosyntactic variables are not necessarily influenced by anything else (which would follow from Hawkins's claim that all variables other than his complexity variables are purely epiphenomenal), then how would he explain that speakers sometimes use pronouns for direct object referents in VPCs and sometimes not in the first place? Well, Hawkins cannot explain this since all variables he allows for are morphosyntactic. With a cognitive-functional perspective, however, the answer is quite obvious: speakers'
[Diagram: discourse-functional variables, morphosyntactic variables and particle placement, connected by arrows]
Figure 8:8 A subpart of the proposed causal activation network
choices of, say, pronouns are determined by discourse-functional factors. Thus, the present approach can explain what lies behind Hawkins's variables since it is not constrained by viewing everything apart from morphosyntax as epiphenomenal. Of course, apart from the correlation between morphosyntax and particle placement (the arrow from the morphosyntactic variables to particle placement in Figure 8:8), Hawkins is also aware of the correlation between length and, say, givenness (the arrow between the discourse-functional and the morphosyntactic variables). The point to be made is that he reduces the correlation to the question 'which is the chicken and which is the egg?', without acknowledging that a simple monodirectional causal explanation is not the only one, let alone the most plausible one, that is licensed by his observations. Finally, the discourse-functional variables cannot be purely epiphenomenal since they improve our ability to predict the choice of construction directly (the direct arrow from the discourse-functional variables to particle placement). Let us look at just a single and extraordinarily simple example supporting this claim. The corpus data show that short direct objects (i.e. object NPs with fewer than four syllables) prefer construction1: 168 (73.36 per cent) out of 229 VPCs with short direct objects occurred in construction1. Similarly, the corpus data also show that previously mentioned referents of direct objects prefer construction1: 143 (72.96 per cent) out of 196 VPCs with previously mentioned referents of direct objects occurred in construction1. Apparently, both variables' levels predict the constructional choice equally well, contrary to what has been claimed by Hawkins. If, however, both levels are combined (i.e. we look at the distribution of constructions for short objects and given referents), then the distribution is even more extreme: 126 (85.14 per cent) out of 148 VPCs are construction1. That is to say, from Hawkins's perspective, if LENGTHS is supplemented by LM, the prediction accuracy is improved by about 12 per cent, something that can hardly be explained by assuming that LM is purely epiphenomenal.
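The proportions just cited can be recomputed directly from the raw counts. The sketch below uses only the frequencies reported above; the variable and function names are illustrative and not part of the original analysis.

```python
# Recomputing the construction1 shares from the counts given in the text.
counts = {
    "short direct object": (168, 229),
    "previously mentioned referent": (143, 196),
    "short object and previously mentioned referent": (126, 148),
}


def share(hits, total):
    """Percentage of VPCs that occur in construction1."""
    return 100 * hits / total


for label, (hits, total) in counts.items():
    print(f"{label}: {share(hits, total):.2f} per cent")

# Improvement when LENGTHS is supplemented by LM, in percentage points.
gain = share(126, 148) - share(168, 229)
print(f"gain: {gain:.1f} percentage points")
```

The difference between the combined condition and the length-only condition is the 'about 12 per cent' referred to above, expressed in percentage points.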
Rather, we infer that givenness and particle placement are also causally related, and I have proposed that the relation is due to the degree of activation of the direct object (cf. section 8.2.1). In sum, the analysis of the model in Figure 7:5/Figure 8:8, making up only a small subpart of my overall causal activation model in Figure 8:7, lends support to Hawkins's results that the morphosyntactic variables are quite powerful. However, it goes beyond them by illustrating that Hawkins's conclusions (about discourse-functional variables being merely epiphenomenal) are fundamentally mistaken. More importantly, however, the analysis of the smaller model in Figure 7:5/Figure 8:8 also lends some credibility to the larger network of causal activation connections of which it is a part. Therefore, although a methodologically optimal analysis on the basis of structural equation modelling awaits a larger data set, we have obtained prima facie evidence for the causal network proposed above.
8.4 Interim summary
We have seen that virtually all variables' effects can be explained in an interactive activation model of sentence production as proposed at the
beginning of this chapter. That is, IAMs can be fruitfully applied to the analysis of syntactic variation. It is important to note, however, that, while the Processing Hypothesis and the IAM do make virtually identical predictions, this is not meant to imply that there is by necessity an isomorphic relation between the two approaches; on the other hand, when we looked at the variables COMPLEX as well as LENGTHS and LENGTHW, the relation between the two approaches was quite apparent. In a way, from the perspective of IAMs, processing is nothing but activation flow within the system, and what has traditionally been looked at in terms of processing effort might as well be more fruitfully described as the summation of activation in nodes, where amount of processing effort does not translate into, but is compatible with, the number of nodes held active within a particular time interval. I consider it to be one of the most important findings of the present analysis that phenomena whose analyses have hitherto relied on traditional accounts in terms of processing effort can be more directly and, thus, rewardingly explained on the basis of IAMs. From the present perspective, two other important advantages of the activation-based approach are also evident once we return to the two questions initially raised in this chapter. It is now possible to predict speakers' choices of construction by clearly relating all variables to the mental processes taking place in only one of the interlocutors, namely the speaker; at the same time, the activation account also relieves us of requiring the speaker to (i) compute the processing effort associated with the two constructions, and (ii) select a construction on the basis of the results of his computation.
By going beyond simpler activation models such as the one underlying my discussion of variables in section 4.1 (based on Givón 1992b and Lambrecht 1994), it is now possible to predict speakers' choices without the two slightly questionable assumptions implied by accounts in terms of processing cost, while at the same time integrating findings that processing accounts have difficulties in explaining (e.g. structural priming).
Notes
1 This chapter has benefited greatly from many stimulating discussions with Thomas Berg (University of Hamburg), who of course might not agree with all of what follows.
2 This presentation is based on Bates and MacWhinney (1989), whose discussion has focused on comprehension, and Bates and Devescovi (1989), who are concerned with production.
3 Cf. MacWhinney (1989) on the relation between connectionism and the Competition Model.
4 The selection of these two authors is not meant to imply that their models are completely identical, but an exhaustive discussion of the different kinds of IAMs and the data motivating the different proposals is well beyond the scope of the present work, as is a comprehensive justification for the nature of the model that I will adopt.
5 Some authors have postulated activation thresholds which have to be exceeded
by incoming activation or priming before a node can fire and thereby pass on activation to its neighbouring nodes (cf. e.g. MacKay 1987); the height of such a threshold of some node is determined by the frequency with which this node has been activated previously. However, I follow Stemberger (1985: 147) and Dell (1986: 287) in omitting activation thresholds, saturation points and other non-linear influences.
6 At this point, several models make different claims. On the one hand, McClelland and Rumelhart (1982: 379) and Stemberger (1985) argue that, apart from excitatory activation, there is also inhibitory activation. More precisely, McClelland and Rumelhart claim that all neighbouring nodes within the same level receive inhibitory activation whereas all other neighbouring nodes (from adjacent levels) receive excitatory activation. On the other hand, MacKay (1987) claims that nodes do not pass on activation, but priming. In my discussion, I follow Dell (1986: 288) and assume excitatory activation only.
7 'Self-inhibition is the inhibitory process that terminates the self-sustained activation of [. . .] nodes and temporarily reduces their priming level to below normal or resting state' (MacKay 1987: 141). According to MacKay (1987: 141), the duration of this refractory phase is only 1 ms.
8 Cf. Dell (1986: 287-8) for a conceptually similar though slightly different formula.
9 This might raise the question whether a syntactic level (of constructional nodes) is required in the system. As evidence for such a level, for instance, Stemberger (1982a) argues that a speech error such as roll up it (for roll it up) might have occurred due to an overly high activation level of a syntactic rule that places particles right after their verbs. However, it is easy to see that this is not a persuasive argument: if no level of constructional nodes existed, it would be possible to ascribe the error to noise on the word level such that, at the point of word selection, the activation level of up was higher than that of it. Thus, in order not to violate Occam's razor, more conclusive evidence is required to support this kind of node. A first intuitive reason to do so is to constrain the power of activation models: were it not for some syntactic constraints that can be captured in the form of constructional nodes, activation models would regularly violate language-specific word order patterns by selecting highly activated words without consideration of syntactic structures into which these words need to be embedded. However, I will take up this issue again later and will provide some empirical evidence on the basis of the present data.
10 Obviously, a textual (and thus linear) representation of a multitude of reputed parallel processes nearly amounts to a contradiction in terms; expressions like and then, etc. are therefore to be taken with a pinch of salt. Also, I will not be able to provide a comprehensive discussion of the many potentially relevant issues such as, e.g., the time course of lexical access (cf. Dell and O'Seaghdha 1992), the degree of locality of interactions between semantic and phonological information (cf. Bock 1990; Dell and O'Seaghdha 1992) or whether production is strictly incremental or competitive (cf. Ferreira 1996).
11 In the following discussion I do not commit to either a local representation (i.e. a representation where there is one node representing John) or a distributed representation of concepts (i.e. where the concept John is represented by a distributed pattern of activation over many nodes) since this distinction does not bear on the issue. Moreover, I will leave aside details concerning past tense marking of the verb as well as lemma selection and phonological encoding (cf. Dell et al. 1999).
12 The criteria used to determine the fit between the semantic representation and alternative syntactic phrase structures are concerned with 'the structural aspect of semantic information [. . .] e.g., the grammatical relations of the noun phrases, modulations such as aspect, definiteness, etc.' (Stemberger 1982b: 320).
13 The representation of levels in terms of a hierarchy is just chosen for convenience.
14 I assume that several syntactic structures that are structurally mutually incompatible can be constructed simultaneously. There is evidence for this assumption from the analysis of syntactic blends (cf. Fay 1980 and Stemberger 1985).
15 Alternatively, we might assume that the weights of the links from S to NP are stronger, thereby facilitating early selection, but both descriptions fit our purposes. Also, the NP node may already have received some activation spreading from the 'nouny' node John.
16 Alternatively, it is argued that a set of so-called insertion rules is operative (cf. Dell 1986).
17 I follow Dell's (1986: 288) definition of current node: 'It is that item of the higher level representation that is in the process of being translated into corresponding items at the immediately lower level', which entails that, at a given point of time, there can be several current nodes, namely on different levels of the system.
18 This summary cannot do justice to all the details of such networks. While the above characterization is somewhat more detailed than many previous descriptions, I will comment on many other minor details in the sections to follow, but cf. e.g. McClelland and Rumelhart (1981: 377-85), Stemberger (1985) and Dell (1986: 287-9); for a different version of an IAM, the so-called node structure theory, cf. MacKay (1987). Note also that experimental work on English TPVs is still necessary to support the above characterization.
To my knowledge, contemporary psycholinguistic studies on the representation and processing of morphologically complex verbs have focused largely on Dutch and German verbs (both separable vs. inseparable [aufsagen vs. untersuchen] and literal vs. idiomatic [weggeben vs. aufgeben]). Thus, unfortunately, there seem to be only a few studies where such issues were investigated experimentally for English complex verbs (cf. studies by Hillert and his associates). Besides, most of the studies on particle verbs (i) address matters of comprehension rather than production, and (ii) do not relate their findings to IAMs of the sort discussed here.
19 While section 4.1 has already related the discourse-functional variables to activation, note that the present treatment of these variables within IAMs differs from the previous one. The explanation in section 4.1 was based on a fairly non-technical definition of activation and was concerned with the benefits that the given-new organization provided by the speaker has for the hearer; note also that it faces the two problems of processing accounts mentioned previously. The analysis in terms of an IAM exclusively focuses on the speaker's production processes without requiring such a conscious, cooperatively planned organization of the discourse, thereby making some of the claims in section 4.1 more explicit and avoiding these potential pitfalls.
20 Note that this effect cannot be reduced to frequency; cf. Chapter 2, n. 26.
21 In my data, the average CohPC value of pronominal and non-pronominal object nouns is 6.9 and 2.6 respectively; this difference is highly significant: t_Welch(100) = -8.98; p < .001.
22 A finding supporting a similar tendency is that a repetition of a phoneme X is usually at some distance to the first occurrence of X; cf. MacKay (1970).
23 I neglect the facts that (i) the above cut-off point for the choice of construction was -.039 rather than 0, and (ii) negative activation levels are difficult to motivate. These differences do not affect the logic of the point to be made.
24 Again, note that the correlation coefficients must not be compared directly. The black-rimmed boxes in the upper half of the diagram indicate that the variables in the box were subsumed under a single factor (using a principal components analysis) because of their large inter-correlation. Since the IAM makes the same predictions as the theory of processing effort outlined earlier, the cross-validated prediction accuracy of 83.9 per cent can also be applied to this model.
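The significance test reported in note 21 appears to be a t-test for two groups with unequal variances (Welch's test). As a minimal sketch of how such a statistic is computed, the following uses invented toy values; the function name welch_t and the two samples are illustrative assumptions and are not taken from the corpus data of this study.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test (hypothetical helper)."""
    na, nb = len(a), len(b)
    # Squared standard errors of the two sample means.
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Invented toy scores for two groups (NOT the values from the study).
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t, df = welch_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df:.2f}")
```

The degrees of freedom are generally not an integer in this test, which is why they are approximated rather than derived from the sample sizes alone.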
9
Conclusion and outlook
9.1 Summary
Having discussed all variables governing particle placement from previous research, this study showed that, because of a variety of both linguistic and methodological drawbacks, previous analyses have failed to adequately describe, comprehensively explain and successfully predict particle placement. Accordingly, this study pursued two objectives: I wanted to describe, explain and predict particle placement, and I wanted to show that the previously used instruments of analysis need to be supported by a more realistic, data-oriented and multifactorial methodology. Chapter 4 argued that the processing effort of a particular utterance derives from phonological, morphosyntactic, semantic and discourse-functional variables. In the light of this, I introduced and discussed the Processing Hypothesis, according to which the choice of word order serves to facilitate processing. Generally speaking, construction₀ is preferred for VPCs requiring a lot of processing effort, whereas construction₁ is preferred for VPCs requiring only little processing effort. Then, for each variable postulated in the literature and included in the Processing Hypothesis, I illustrated how its relation to processing effort theoretically supports the Processing Hypothesis; for each variable previously mentioned and not included in the Processing Hypothesis, I showed on what grounds this variable was excluded. After a fairly detailed description of methods in Chapter 5, Chapter 6 then provided a variety of results. Section 6.1 discussed a large number of monofactorial results in order to subject the claims of the mostly non-empirical literature to empirical tests and provide a valid description of the individual influence/cue strength of every single variable's value/level.
On the basis of these monofactorial results, it could be shown that (i) all of the variables I included in the Processing Hypothesis indeed contribute to particle placement in the predicted direction, and (ii) nearly all the variables I did not include in the Processing Hypothesis do not contribute to particle placement. Moreover, it was shown that previous analyses have failed to notice interrelations between variables (recall CONCRETE and ANIMACY or
DET: none and TYPE: pronoun) as well as interactions of variables with REGISTER. Section 6.2 was concerned with the conflict validity of variables' values/levels: oppositions of variables were investigated in order to assess the relative strengths of variables and values/levels in cases of conflicting preferences. These comparisons of variables' values/levels do not introduce theoretically revolutionary findings; they do, however, question an assumption underlying most previous analyses, namely that variables' values are uniformly strong, by demonstrating that different values/levels have considerably different effects on particle placement. In section 6.3 the multifactorial approach, which, I had argued, provides the only rewarding approach to syntactic variation, was discussed. I showed that the relation between the variables and the choice of construction is very strong. Following that, I demonstrated how a discriminant analysis identifies the variables that are most important for speakers' decisions in favour of a construction; again, the Processing Hypothesis was strongly supported by the data. On the basis of these results, it was further illustrated how we can correctly predict speakers' decisions for a construction in any given discourse situation in 84 per cent of all cases. A distribution-free technique (CART) was shown to yield both mathematically and conceptually similar results. Section 6.4 was devoted to items classified/predicted erroneously. It illustrated that the misclassified sentences are exactly those for which there is no clear preference so that even native speakers would face difficulties in expressing a clear preference for one of the constructions, let alone reliably predicting what other speakers will do in several hundred cases. However, it was shown that additional variables (e.g. structural priming) might come into play and could ultimately still improve the prediction accuracy.
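A prediction accuracy of the kind summarized here can be estimated by leave-one-out cross-validation, in which each case is predicted from a model fitted to all remaining cases. The following minimal sketch illustrates only that logic. It is not the discriminant analysis used in this study: it substitutes a simple nearest-centroid rule, and the feature vectors, labels and function names are invented for illustration.

```python
def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(u, v))


def predict(train, case):
    """Assign `case` to the class whose centroid is nearest (stand-in rule)."""
    by_class = {}
    for features, label in train:
        by_class.setdefault(label, []).append(features)
    return min(by_class, key=lambda c: sq_dist(centroid(by_class[c]), case))


def loo_accuracy(data):
    """Leave-one-out accuracy: predict each case from all the others."""
    hits = 0
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += predict(train, features) == label
    return hits / len(data)


# Invented cases: (object length in words, times of preceding mention),
# paired with the construction chosen (0 or 1). NOT corpus data.
toy = [
    ([6, 0], 0), ([5, 1], 0), ([7, 0], 0), ([4, 0], 0),
    ([1, 3], 1), ([1, 2], 1), ([2, 4], 1), ([2, 2], 1),
]
print(f"leave-one-out accuracy: {loo_accuracy(toy):.2f}")
```

Because the held-out case never contributes to the centroids it is compared with, the resulting figure is a less optimistic estimate than the accuracy obtained by classifying the same cases the model was fitted to.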
Chapter 7 dealt with some wider-ranging implications of the present study. Section 7.1 showed how the methodology makes it possible to determine objectively the prototypical instances of each construction in the corpus and the distance of particular instantiations of the constructions to their respective prototype. Moreover, it was shown that, contrary to several decades of purely syntactically determined linguistic argumentation, there is little evidence for assuming there is one prototypical VPC: on the one hand, both constructions receive equally valid argumentative support; on the other hand, the two constructions exhibit more differences than commonalities and can be considered information-structure constructions in the sense that the two different word orders serve different communicative functions. Given these different functional purposes, there is neither a motivation nor a need to posit a single category (with a single prototype). Furthermore, this section illustrated that particle placement can be accounted for more appropriately when gradience of categories is allowed for. Section 7.2 defended probabilistic approaches in linguistics and related the findings of this study to other approaches towards variation data in linguistics in general, namely sociolinguistic analyses using variable rules,
recent variation studies and contemporary corpus-linguistic accounts such as Leech et al. (1994). Finally, section 7.3 discussed how the present approach to particle placement contributes to the controversy between discourse-functional versus syntactic approaches to syntactic variation. Several fairly recent analyses were discussed, but the main focus was on elaborating the differences between the present approach and the superficially similar processing approach advocated by Hawkins (1994). A number of shortcomings of Hawkins's work were discussed, and I argued that Hawkins makes correct observations but draws several unwarranted conclusions. Chapter 8 connected the findings of the present study to interactive activation models. For particle placement, both the Processing Hypothesis and the IAM approach make identical predictions, supporting the activation-based account. I described the activation processes underlying the choice of construction in some detail and presented a graphical model of all weighted vertical function-form correlations as well as all horizontal intercorrelations. To my knowledge, this is the first activation-based description of an instance of syntactic variation that is entirely based on corpus data and able to incorporate as well as weigh all relevant variables. The most rewarding properties of the present approach are, I believe, the range of the empirical data coverage, the potential to explain even those cases where refinement is needed or desirable and the possibility to bring together syntactic and discourse-functional parameters.
9.2 Outlook: implications and extensions
The scope of a study devoted to particle placement alone might seem overly narrow, but I hope to have shown that this is definitely not the case. Several sections have revealed how the present techniques and results have a potential going far beyond the single syntactic phenomenon investigated here (prototypes in linguistic theory, the role of register for syntactic analyses and interactive activation models, to name just three). This section contains some conclusions and suggestions as to what the findings of the present study mean when situated in a larger context. First, let us address the issue of parsimony and its role in linguistic analyses. The present study shows that the variables influencing particle placement are qualitatively different and highly interrelated; moreover, functional variables are, contrary to some analyses, very important since they motivate the values/levels of the syntactic variables. Thus, there is no motivation to investigate syntactic variation exclusively in syntactic terms: an analysis going beyond that can be much more informative. What is more, I tend to conclude that analyses attempting to discover whether discourse-functional or morphosyntactic variables are statistically more important are in a way completely off track - given the large degree of interrelations between all variables, it simply does not make sense to ask such a question. Rather, the question to ask in future research is how and why these variables are
interrelated, and I hope to have shown that analyses along the lines demonstrated above (i.e. the modelling of causal activation-based relationships) yield insightful results. Put differently, while it may even be right to point out that morphosyntax is very often a statistically sufficient predictor of constituent ordering, this statistical result on its own is not necessarily a cognitively sufficient result, for the following reason. Even though a scientific theory aims at being parsimonious (by not postulating more predictors than necessary; cf. Occam's razor), a cognitively realistic theory of syntactic variation might have to include more variables than are statistically necessary to describe or explain the data, for the simple reason that these variables do in fact exist and exhibit well-documented relations to the subject of interest. For instance, Hawkins acknowledges that discourse-functional variables affect the retrievability of referents and morphosyntactic variables (cf. Hawkins 1994: 238). Therefore, they must be included in any cognitively realistic theory of processing effort even if their statistical value (crucially depending on their operationalization anyway) is limited at times. This also holds for other variables that Chen, Hawkins and others have not discussed. Take, for instance, IDIOMATICITY: we know idiomaticity exists, we know it affects syntactic characteristics of many other constructions, we know it correlates strongly with particle placement and we know it strongly affects retrieval/activation of lexical and syntactic nodes. Thus, we have to consider it simply because the resulting analysis will be more comprehensive even if IDIOMATICITY does not add substantially to the predictive power (given its intercorrelations with syntactic determinants).
I believe these arguments underscore the importance of investigating interrelations of many predictors as opposed to trying to sum up only as few predictors as possible for the sake of parsimony. Since the activation models used in contemporary psycholinguistic work are, given their interactive nature, exactly the sort of model that is capable of handling such complex interactions of many variables at the same time, I believe they lend themselves to forming the basis of more detailed and reliable analyses of syntactic variation, i.e. analyses that are, unlike the majority of cases, multifactorial, corpus-based and subjected to the test of truly predicting speakers' choices. And if such analyses are developed, I believe multifactorial work of the sort outlined above is, in fact, indispensable: no researcher will ever be able to account for such complex phenomena intuitively and reliably at the same time. This brings me to the next point. Apart from the linguistic findings, I also hope to have shown several advantages of advanced statistical analyses for (cognitively oriented) linguistic research. Nearly all behavioural sciences utilize complex statistical tools to describe, explain and predict human behaviour - in linguistics (more specifically, syntactic research), however, strongly statistics-based research is not frequently used outside the areas of corpus linguistics, psycholinguistics or quantitative linguistics. This study shows how various statistical procedures can be combined in order to provide insights of, with respect to particle placement, a previously unknown
level of detail while still addressing and accounting for a wide scope of phenomena that are of interest to linguists hitherto using only traditional techniques. Using statistical techniques, competing analyses can be compared such that (i) analyses with a lower predictive power for natural data are rejected in favour of analyses with a higher predictive power (cf. discriminant analysis) and/or (ii) causal models with a lower degree of explanatory power are rejected in favour of analyses with a higher explanatory power (cf. structural equation modelling). Note that even if the particular statistical models proposed here are not embraced, the analysis still has provided valuable results by constraining the range of theories that might be proposed alternatively: if any other theory concerning particle placement does not achieve a similar prediction accuracy and, at the same time, weighs the variables involved in a considerably different way, it is very likely not to be correct. For instance, whatever grammatical theory is applied to particle placement, if it does not incorporate discourse-functional, morphosyntactic variables and structural priming while assigning much less importance to the following context, it will definitely not be able to yield any mentionable prediction accuracy and should, thus, not be considered a viable alternative. Finally, I shall briefly address the issue of further work using the methods employed here. Extensions are possible along several parameters. The simplest extension would be to study other cases of syntactic variation in English from a synchronic perspective. While English generally exhibits a fairly rigid word order, there are a number of cases where alternative truth-conditionally equivalent constructions are possible; the standard examples next to particle placement are perhaps the Dative Alternation (double-object construction vs. prepositional dative), the genitive (s-genitive vs.
of-genitive) and a number of other, frequently stylistically marked, alternations (cf. Rohdenburg 1996) awaiting multifactorial research. For instance, literally hundreds of references and my own investigation suggest that the Dative Alternation and Preposition Stranding are also highly complex phenomena with numerous determinants from many different levels of linguistic analysis, but it also becomes obvious that only analyses accounting for this complexity can yield results going beyond traditional research. Similarly, the choice of analytic vs. synthetic comparatives has not so far been fully explained since quite a large variety of variables influence the selection of a particular form. Note also in this respect the frequently neglected role of the register for syntactic phenomena: we have seen that particle placement to some degree correlates with REGISTER, and first results of a corpus of 301 cases of preposed vs. pied-piped prepositions (i.e. Which coalfield were most of these on? vs. On which coalfield were most of these?) also show that the register has a highly significant influence on Preposition Stranding.¹ An additional important area that has only occasionally been investigated is the degree to which individual TPVs have preferences or dispositions for particular constructions. Some first approaches to this issue are Browman (1986: 327), Wasow (1997a: 97-102) and Stallings et al. (1998: 395-7, 411-12); for a more detailed treatment, cf. Gries and Stefanowitsch (submitted).
Finally, another essential focus of future research should be the more detailed investigation of causal relationships between sets of variables or factors determining subconscious speaker decisions along the lines of, but elaborating upon, my analysis in terms of IAMs in section 8.3; by investigating these and other syntactic phenomena, it might some day be possible to devise a fully elaborated activation-based syntactic theory, which, to my knowledge, has not been developed so far. There is, of course, nothing in the approach pursued here that restricts the scope of the techniques discussed to English data only.² Similar phenomena can be investigated cross-linguistically with respect to the degree of similarity of the underlying variables/factors in order to identify potentially universal patterns of usage. This is a research program that should be of particular interest to cognitive and functional linguists since it addresses the question of how linguistic universals may derive from universal cognitive/activation principles rather than theory-driven formal principles. On a diachronic basis, it may also prove interesting to apply the proposed methods of analysis to the study of the development of (preferences of) constructions. Now that the number of diachronic corpora is increasing, it is possible to determine which factors are decisive in the changes of (uses of) constructions over time by precisely identifying influences/weights of contemporaneously changing variables. Similar suggestions pertain to the analysis of acquisition data by determining prototypes of constructions and seeing how they change while more knowledge is acquired. For instance, the acquisition of the ditransitive construction and its prepositional paraphrases has long been discussed in the literature (cf. Fischer 1971, Bowerman 1983, Mazurkewich and White 1984, Gropen et al. 1989, to name but a few).
Many of these accounts concentrated on the questions of (i) how children learn which construction is appropriate in which circumstances, and (ii) how to explain why so few overgeneralization errors occur. On the basis of the utterances made and heard by a child, one could, at several distinct stages of the development of the language, determine what the prototype of the child's ditransitive construction looks like and which variables are likely to cause extensions from this prototype in order to determine how the learning process takes place. Finally, a meta-theoretical comment is necessary. My own strategy has been to start out from linguistic analyses and, ultimately, develop a psycholinguistic explanation for particle placement. On a more general level, however, it is necessary to investigate the extent to which grammatical theories and psycholinguistic findings and theories match. Put differently, studies like the present one might serve to bridge the gap between linguistic theories and psycholinguistic findings/models. Let me briefly mention two examples. First, the finding that structural priming is relevant to particle placement and can be explained within an IAM also supports grammatical theories recognizing constructions as grammatical entities. From the present perspective, Construction Grammar (cf. e.g. Goldberg 1995) seems to be a promising
theory, and I believe it would be interesting and worthwhile to determine to what degree the two approaches can map onto each other or are at least conceptually compatible. For example, the assumptions of Construction Grammar that (i) there is no strict division between lexicon and syntax, and that (ii) all linguistic entities are form-meaning pairings are paralleled by the assumptions of IAMs that (i) linguistic units of every level are represented in terms of (lexical and constructional) nodes without major qualitative differences, and (ii) both lexical nodes and constructional nodes are related to nodes on the conceptual level. Investigations of such questions may serve to bring about a unification of approaches that have hitherto developed more or less separately. Second, particle placement is a case of variation where, according to most analyses, speakers decide on the arrangement of phrasal elements within a VP. An additional interesting area of research would therefore be to apply the theoretical framework of IAMs and the corpus-based and statistical techniques advocated here to cases where speakers order non-phrasal constituents within a single phrase; one such case in point is prenominal adjective order. An analysis along the lines proposed here would perhaps enable us to flesh out some 'intraphrasal details' of IAMs with respect to orderings of lexical constituents to see whether the same regularities apply. Thus, I dare conclude that the potential for rewarding applications and extensions is enormous. I am convinced that analyses that combine cognitively real notions with corpus-based and statistics-based methodology can shed even more light on what, from my point of view, lies at the heart of cognitively oriented linguistics and psycholinguistics, namely the intricate interplay of linguistic and cognitive aspects of language or, put differently, the cognitively plausible investigation of linguistically relevant aspects of human cognition.
Notes
1 In this respect, I hope that the publication of Biber et al.'s (1999) corpus-based grammar of spoken and written English marks the advent of a more careful analysis of these and similar phenomena. For a first analysis of Preposition Stranding along the lines advocated here, cf. Gries (2002).
2 In one respect, at least, English is perhaps a good starting-point: English is the language for which the most modern and well-designed corpora are easily available.
10 Appendices
10.1 List of variables
(Variable: value/level for construction₀; value/level for construction₁)
stress pattern of the verb phrase: stressed direct object (construction₀); stressed particle (construction₁)
phonetic shape of the verb: verb has no initial stress (construction₁)
NP type of the direct object: (semi-)pronominal (construction₁)
determiner of the direct object: definite (construction₀); indefinite/none (construction₁)
length of the direct object: long (construction₀)
complexity of the direct object: complex (construction₀)
meaning of the verb phrase (1): idiomatic (construction₀)
meaning of the verb phrase (2): habitual (construction₀)
semantic modification of the particle: yes (construction₁)
animacy of the direct object's referent: inanimate (construction₀); animate (construction₁)
concreteness of the direct object's referent: abstract (construction₀); concrete (construction₁)
cognitive entrenchment of the direct object's referent: low (construction₀); high (construction₁)
focus of the verb phrase: direct object (construction₀); particle (construction₁)
news value of the direct object's referent: high (construction₀); low (construction₁)
distance to last mention of the direct object's referent: long (construction₀); short (construction₁)
times of preceding mention of the direct object's referent: low (construction₀); high (construction₁)
cohesiveness of the direct object's referent to the preceding discourse: low (construction₀); high (construction₁)
distance to next mention of the direct object's referent: short (construction₀); long (construction₁)
times of subsequent mention of the direct object's referent: high (construction₀); low (construction₁)
cohesiveness of the direct object's referent to the subsequent discourse: high (construction₀); low (construction₁)
following directional PP: yes (construction₁)
register: written (construction₀); oral (construction₁)
frequency of the direct object (head noun): infrequent (construction₀); frequent (construction₁)
particle = preposition of following PP: yes (construction₀)
production difficulty/number of disfluencies in the utterance: high (construction₀); low (construction₁)
10.2 Register-dependent interaction plots
Figure 10.1 Interaction plot: construction × REGISTER × COMPLEX
Figure 10.2 Interaction plot: construction × REGISTER × LENGTHW
Figure 10.3 Interaction plot: construction × REGISTER × LENGTHS
Figure 10.4 Interaction plot: construction × REGISTER × TYPE
Figure 10.5 Interaction plot: construction × REGISTER × DET
Figure 10.6 Interaction plot: construction × REGISTER × IDIOMATICITY
Figure 10.7 Interaction plot: construction × REGISTER × CONCRETE
Figure 10.8 Interaction plot: construction × REGISTER × ANIMACY
Figure 10.9 Interaction plot: construction × REGISTER × LM
Figure 10.10 Interaction plot: construction × REGISTER × ACTPC
Figure 10.11 Interaction plot: construction × REGISTER × TOPM
Figure 10.12 Interaction plot: construction × REGISTER × COHPC
Figure 10.13 Interaction plot: construction × REGISTER × NM
Figure 10.14 Interaction plot: construction × REGISTER × CLUSSC
Figure 10.15 Interaction plot: construction × REGISTER × TOSM
Figure 10.16 Interaction plot: construction × REGISTER × COHSC
Figure 10.17 Interaction plot: construction × REGISTER × OM
Figure 10.18 Interaction plot: construction × REGISTER × PP
10.3 List of TPVs¹
ace out act out add in add on add up answer back ante up argue down argue out ask in ask out auction off average out back up bag up bail out bail up balance out balance up bale out ball up bandage up bandy about bandy around bang down bang out bang up bark out base on base upon bash in bash out bash up bat around bat out batten down batter down bawl out beam down beam up bear away bear off bear out bear up
beat back beat down beat out beat up bed out beef up bellow out belt out belt up bid up bind over bitch up bite back black out blank off blank out blast out block in block off block out block up blot out blow away blow down blow off blow out blow over blow up bluff out blurt out board up bog down boil away boil down boil off boil out boil up bollix up bolt down boot out boss about boss around botch up bottle up
bounce around bowl out bowl over box in box up brave out brazen out break down break in break off break out break up breathe in breathe out brew up brick up brighten up bring about bring along bring around bring back bring down bring forward bring in bring off bring on bring out bring round bring together bring up broaden out brush aside brush away brush down brush off brush out brush up buck up bugger about bugger up build in build up bulk out bum out
bump off bump up bunch up bundle off bundle up bung up buoy up burn down burn off burn out burn up bust up butter up buttress up buy back buy in buy off buy out buy up call away call back call down call forth call in call off call out call over call up calm down cancel out carry away carry forward carry off carry on carry out carry over carry through cart off carve out carve up cash in cast aside cast down cast off
cast out cast up catch out catch up cave in cement up chain down chain up chalk up change around change round charge up chase away chase down chase off chase up chat up check off check out check over cheer on cheer up chew out chew over chew up chip in chivvy along chivvy up choke back choke down choke off choke up chop down chop up chuck away chuck in chuck out chuck up churn out churn up clap out claw back clean down clean out clean up clear away
clear out clear up clock up clog up close down close off close out close up clue in clue up cobble together cock up collect up colour in colour up comb out cone off conjure up connect up consign over contract out cook up cool down cool off coop up copy down copy out cordon off cough up count down count in count off count out count up cover over cover up crack off crack up crank out cream off crease up cross off cross out cross up crowd out crumple up
cry out cut back cut down cut in cut off cut out cut up dam up damp down dash off deal out deck out deliver over deliver up dig in dig out dig over dig up dish out dish up divide off divide up divvy out divvy up do down do in do off do out do over do up dob in dole out doll up dope out dope up drag down drag in drag out drag up draw down draw in draw off draw out draw up dream up dredge up
dress down dress up drink in drink up drive away drive back drive off drive out drop in drop off drum out dry off dry up duff up dumb down dust down dust off ease off ease out eat away eat off eat out eat up edge out edit out egg on eke out elbow out empty out even out even up explain away eye up face down factor in factor out fade in fade out fake out fan out fancy up farm out fasten up fathom out fatten up feed up
feel out feel up fence in fence off fend off ferret out fetch up fiddle away fight back fight down fight off fight out figure out figure up file away fill in fill out fill up filter out find out finish off finish up fire off fire up firm up fish out fit out fit up fix up flag down flash around flash out flatten out flesh out fling off flip off flood out fluff out fluff up flush out fly in fob off fog up fold away fold in fold up
follow out follow through follow up force back force out fork out fork over fork up foul up freak out free up freeze out freshen up frig up frighten away frighten off fritter away frizzle up frown down fry up fuck over fuck up gas up gather in gather up gear up get across get back get down get in get off get out get over get together get up ginger up give away give back give in give off give out give up glass in gloss over goad on gobble down
gobble up goof up gouge out grass on grass over grass up grind away grind down grind out grind up gross out grub out grub up gulp down gum up gun down gussy up hack off hammer in hammer out hand around hand back hand down hand in hand on hand out hand over hand round hang out hang up hash out hash over hash up haul in haul off haul up have in have off have on hawk about hawk around hawk round head off head up heap up hear out
heat up heave up hedge about hedge around hedge in hedge round help out hem in hide away hike up hire out hiss down hit back hit up hitch up hive off hoard away hoke up hold back hold down hold off hold out hold over hold up hollow out honk up hook up hose down hound out howl down hunt down hunt out hunt up hurry up hush up hype up ice down idle away ink in ink out invalid out invite along invite around invite back invite out invite over
iron out jack in jack up jazz up jerk around join up jolly along jot down juice up jumble up keep away keep back keep down keep in keep off keep on keep out keep up key in kick about kick around kick back kick down kick in kick off kick out kick over kick up kill off kiss off kit out knit together knock about knock around knock back knock down knock off knock out knock over knock together knock up lace up ladle out lap up lash down laugh down
laugh off lay aside lay away lay by lay down lay in lay off lay on lay out lay up lead off lead on leave aside leave back leave behind leave down leave off leave out lend out let down let in let off let out level off level out lick off lift up light up lighten up line up link up liquor up live down live out liven up load down load up loan out lock away lock in lock out lock up look out look over look up loosen off
loosen up lop off louse up lug back lump together magic away mail out make off make out make over make up map out mark down mark off mark out mark up marry off marry up match up max out measure off measure out measure up mellow out melt down mess over mess up mete out miss out mix up mock up mop up move down move in move on move out move up mow down muck out muck up muddle up mug up mull over muscle out muss up muster in
muster out muster up nail down nail up narrow down nose out notch up note down nut out offer up open out open up order in pace out pack away pack down pack in pack off pack out pack up pad out palm off parcel out parcel up pare down partial out partition off partner off partner up pass around pass away pass down pass off pass on pass out pass over pass round pass up patch together patch up pay back pay down pay in pay off pay out pay up
peel off peg down peg out pen in pen up pencil in pension off pep up perk up phase in phase out phone up pick off pick out pick over pick up piece together pile on pile up pin back pin down pin up piss away piss off plan out plant out play back play down play out play up plonk down plop down plot out plough back plough up plow back plow up plug up plumb in plump up plunk down point out point up polish off pony up portion out
post off pot up pour out power up prick out prick up print off print out prop up psych out psych up puff out puff up puke up pull apart pull away pull back pull down pull in pull off pull out pull over pull together pull up pump in pump out pump up punch in punch out punch up push aside push away push back push off push out push over push through push up put about put across put around put aside put away put back put down put forth
put forward put in put off put on put out put over put round put together put up puzzle out quiet down rack up raffle off rain down rake in rake off rake up rap out ratchet up ration out rattle off reach down reach out read back read off read out read over read through read up reason out reckon in reckon up reel in reel off reel out rein back rein in render down rent out report back report out rev up ride out rig out rig up ring back
ring down ring out ring up rinse out rip apart rip off rip up roll back roll out roll over roll up root out root up rope in rope off rough in rough out rough up round off round out round up rout out rub down rub in rub off rub out rub up ruck up ruffle up rule off rule out run down run in run off run out run over run up rush out rush through rustle up saddle up salt away sand down save up scale back scale down
scale up scare away scare off scare up scarf down scarf up scoop out scoop up scope out score off score out scout out scout up scrape together scream out screen in screen off screen out screw down screw over screw up scribble down scrounge up scrub out scrunch up seal in seal off seal up search out section off see off see out seek out sell off sell on sell out sell up send away send back send down send in send off send on send out send up separate off
separate out serve out serve up set apart set aside set away set back set down set off set out set up sew down sew up shade in shake down shake off shake out shake up share out sharpen up shell out shin up ship off ship out shoot down shoot off shoot up shore up shout down shout out show in show off show out show up shrug off shuck off shut away shut down shut off shut out shut up sick up sift out sign away sign in sign off
sign on sign out sign over sign up silt up sing out single out siphon away siphon off sit down sit out size up sketch in sketch out slag off slam down slap down slap on sleep off slice off slice up slick up slim down slip in slip off slip on slip out slot in slough off slow down slow up sluice down sluice out smarten up smash down smash in smash up smell out smoke out smooth away smooth down smooth out smooth over snap up snarl up sniff out
snuff out soak up sober up sock away sock in soften up sort out sound out soup up space out spark off speed up spell out spew up spice up spiel off spiff up spill out spin off spin out spit out spit up split off split up sponge down spoon out spread out spruce up spur on square away square off squash in squeeze in squeeze out squinch up squirrel away stack up stake out stall off stammer out stamp out stand up stare down stare out start off start up
starve out stash away stave off steam up step down step up stick down stick out stick up stiffen up stink out stink up stir in stir up stitch up stoke up stop up store away store up stow away straighten out straighten up strap in strap up stretch out strike back strike down strike out strike up string along string out string together string up strip away strip down strip off stub out stuff up stump up suck in suck off suck up sum up summarize up summon up suss out
swallow up swear in sweat out sweep aside sweep away sweep out sweep up swill down switch off switch on swot up tack on tag on take along take apart take around take aside take away take back take down take in take off take on take out take up talk down talk out talk over talk through talk up tamp down tangle up tap out tape up tart up tear apart tear away tear down tear off tear up tease out tee off tee up telegraph in tell apart tell off
test out thaw out thin down thin out think out think over think through think up thrash out throttle back throttle down throw away throw back throw down throw in throw off throw on throw out throw over throw together throw up thunder out tick off tidy away tidy out tidy up tie back tie down tie up tighten up tip off tip over tip up tire out tog out tone down tone up tool up top off top up toss about toss around toss back toss down toss off tot up
total up tote up touch up tough out toughen up tout around tout round trace out track down trade in trade off train up tread down trigger off trim away trim off trip up trot out true up truss up try on try out tuck away tuck in tuck up tucker out tune down tune off tune out tune up turf out turn around turn away turn back turn down turn in turn off turn on turn out turn over turn round turn up type out type up urge on use up
vamp up vote down vote in vote out wait out wake up walk off walk up wall in wall off wall up ward off warm over warm up warn away warn off wash away wash down wash off
wash out wash up water down wave aside wave down wave off wave on wear away wear down wear in wear off wear out weed out weigh out weigh up wheel out while away whip out whip up
whittle away whittle down win around win over win round wind down wind on wind up winkle out wipe down wipe off wipe out wipe up wire up wish away wolf down work in work off work out
work over work through work up wrap up wring out write back write down write in write off write out write up x out yank off yank out yank up yell out yield up zip up
Note 1 This alphabetical list is organized on purely formal/structural grounds, so that TPVs with more than one meaning (such as throw up) are not listed twice.
References
Aarts, B. (1989) Verb-preposition constructions and small clauses in English. Journal of Linguistics 25: 277-90.
Aarts, B. (1992) Small Clauses in English: The Non-Verbal Types. Berlin, New York: Mouton de Gruyter.
Abney, S. (1996) Statistical methods and linguistics. In J. Klavans and P. Resnik (eds), The Balancing Act. Cambridge, MA: The MIT Press, pp. 1-26.
Altenberg, B. (1982) The Genitive Construction v. the of-Construction: A Study of Syntactic Variation in Seventeenth Century English. Lund: CWK Gleerup.
Anderson, J. R. (1996) Kognitive Psychologie, 2nd ed. Heidelberg, Berlin, Oxford: Spektrum.
Armstrong, S. L., L. R. Gleitman and H. Gleitman (1983) What some concepts might not be. Cognition 13: 263-308.
Arnold, J. E. and T. Wasow (1996) Production Constraints on Particle Movement and Dative Alternation. Poster presented at the CUNY Conference on Human Sentence Processing.
Arnold, J. E. et al. (2000) Heaviness vs. newness: the effects of structural complexity and discourse status on constituent ordering. Language 76: 28-55.
Baker, M. (1988) Incorporation: A Theory of Grammatical Function Changing. Chicago: University of Chicago Press.
Bates, E. and A. Devescovi (1989) Crosslinguistic studies of sentence production. In B. MacWhinney and E. Bates (eds), The Crosslinguistic Study of Sentence Processing. Cambridge: Cambridge University Press, pp. 225-53.
Bates, E. and B. MacWhinney (1982) Functionalist approaches to grammar. In E. Wanner and L. R. Gleitman (eds), Language Acquisition: The State of the Art. Cambridge: Cambridge University Press, pp. 173-218.
Bates, E. and B. MacWhinney (1989) Functionalism and the competition model. In B. MacWhinney and E. Bates (eds), The Crosslinguistic Study of Sentence Processing. Cambridge: Cambridge University Press, pp. 3-73.
Beals, K. et al. (eds) (1994) Papers from the Thirtieth Regional Meeting of the Chicago Linguistic Society, Vol. 2: The Parasession on Variation in Linguistic Theory. Chicago: Chicago Linguistic Society.
Behaghel, O. (1930) Von deutscher Wortstellung. Zeitschrift für Deutschkunde 44: 81-9.
Behaghel, O. (1932) Deutsche Syntax: Eine Geschichtliche Darstellung, Vol. IV: Wortstellung, Periodenbau. Heidelberg: Carl Winters.
Berg, T. (1998) Linguistic Structure and Change: An Explanation from Language Processing. Oxford: Oxford University Press.
Berg, T. and U. Schade (2000) A local connectionist account of consonant harmony in child language. Cognitive Science 24: 123-49.
Biber, D., S. Johansson, G. Leech, S. Conrad and E. Finegan (1999) Longman Grammar of Spoken and Written English. Harlow, Essex: Pearson Education.
Biber, D. (1988) Variation across Speech and Writing. Cambridge: Cambridge University Press.
Biber, D. (1995) Dimensions of Register Variation. Cambridge: Cambridge University Press.
Bickerton, D. (1971) Inherent variability and variable rules. Foundations of Language 7: 457-92.
Birbaumer, N. and R. F. Schmidt (1991) Biologische Psychologie. 2nd rev. ed. Berlin: Springer.
Birner, B. J. and G. Ward (1998) Information Status and Noncanonical Word Order in English. Amsterdam, Philadelphia: John Benjamins.
Bock, J. K. (1977) The effect of pragmatic presupposition on syntactic structure in question answering. Journal of Verbal Learning and Verbal Behaviour 16: 723-34.
Bock, J. K. (1982) Towards a cognitive psychology of syntax: information processing contributions to sentence formulation. Psychological Review 89: 1-47.
Bock, J. K. (1986) Syntactic persistence in language production. Cognitive Psychology 18: 355-87.
Bock, J. K. (1990) Structure in language: creating form in talk. American Psychologist 45: 1221-36.
Bolinger, D. (1971) The Phrasal Verb in English. Cambridge, MA: Harvard University Press.
Bolkestein, A. M. and R. Risselada (1987) The pragmatic motivation of syntactic and semantic perspective. In J. Verschueren and M. Bertucelli-Papi (eds), The Pragmatic Perspective: Selected Papers from the 1985 International Pragmatics Conference. Amsterdam, Philadelphia: John Benjamins, pp. 497-512.
Bolkestein, A. M. (1985) Cohesiveness and syntactic variation: quantitative vs. qualitative grammar. In A. M. Bolkestein, C. de Groot and J. L. Mackenzie (eds), Syntax and Pragmatics in Functional Grammar. Dordrecht: Foris, pp. 1-15.
Bortz, J., G. A. Lienert and K. Boehnke (1990) Verteilungsfreie Methoden in der Biostatistik. Berlin, Heidelberg, New York: Springer.
Bortz, J. (1999) Statistik für Sozialwissenschaftler. 5th ed. Berlin: Springer.
Bortz, J. and N. Doring (1995) Forschungsmethoden und Evaluation. 2nd comp. revised and updated edition. Berlin, Heidelberg, New York: Springer.
Bowerman, M. (1983) How do children avoid constructing an overly general grammar in the absence of feedback about what is not a sentence? Papers and Reports on Child Language Development 22: 23-35.
Breiman, L., J. H. Friedman, R. A. Olshen and C. J. Stone (1984) Classification and Regression Trees. Monterey, CA: Wadsworth and Brooks/Cole Advanced Books and Software.
Broihier, K. et al. (1994) The Acquisition of the Germanic VPC. Paper delivered at the Eighteenth Boston University Conference on Language Development.
Browman, C. P. (1986) The hunting of the quark: the particle in English. Language and Speech 29: 311-34.
Brown, P. M. and G. S. Dell (1987) Adapting production to comprehension: the explicit mention of instruments. Cognitive Psychology 19: 441-72.
Burnard, L. (ed.) (1995) Users Reference Guide for the British National Corpus (Version 1.0). Oxford University Computing Services.
Carlson, G. N. and M. K. Tanenhaus (1988) Thematic roles and language comprehension. In W. Wilkins (ed.) Syntax and Semantics 21: Thematic Relations. New York: Academic Press, pp. 263-88.
Carstensen, B. (1964) Zur Struktur des englischen Wortverbandes. Die Neueren Sprachen 13: 305-28.
Cedergren, H. and D. Sankoff (1974) Variable rules: performance as a statistical reflection of competence. Language 50: 333-55.
Chafe, W. (1994) Discourse, Consciousness, and Time: The Flow and Displacement of Conscious Experience in Speaking and Writing. Chicago: The University of Chicago Press.
Charlton, R. (1990) English Verb-Particle Colligations. Doctoral Dissertation, Universität des Saarlandes at Saarbrücken.
Chen, P. (1986) Discourse and particle movement in English. Studies in Language 10: 79-95.
Chomsky, N. A. (1957) Syntactic Structures. The Hague: Mouton.
Chomsky, N. A. (1961) On the notion 'rule of grammar'. In R. Jakobson (ed.) Structure of Language and Its Mathematics. Providence, RI: American Mathematical Society, pp. 6-24.
Church, K. W. and P. Hanks (1990) Word association norms, mutual information, and lexicography. Computational Linguistics 16: 22-9.
Clark, H. H. (1977) Bridging. In P. Johnson-Laird and P. C. Wason (eds) Thinking: Readings in Cognitive Science. Cambridge: Cambridge University Press, pp. 411-20.
Clifford, J. (1990) Grammar and Activation of Referents in Working Memory: The Case of Particle Movement. Technical Report, University of Colorado at Boulder.
Clifton, C. Jr., S. Speer and S. P. Abney (1991) Parsing arguments: phrase structure and argument structure as determinants of initial parsing decisions. Journal of Memory and Language 30: 251-71.
Cohen, J. (1983) Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Mahwah, NJ: Lawrence Erlbaum Associates.
Collins, C. and H. Thrainsson (1996) VP-internal structure and object shift in Icelandic. Linguistic Inquiry 27: 391-444.
Cowart, W. (1997) Experimental Syntax: Applying Objective Methods to Sentence Judgments. Thousand Oaks, London, New Delhi: Sage Publications.
Cowie, A. P. and R. Mackin (1993) Oxford Dictionary of Phrasal Verbs. Oxford: Oxford University Press.
Croft, W. (1995) Autonomy and functionalist linguistics. Language 71: 490-532.
Curme, G. O. (1931) Syntax. Boston.
Curme, G. O. (1935) A Grammar of the English Language: Parts of Speech and Accidence. Boston.
Danes, F. (1966) A three-level approach to syntax. In F. Danes et al. (eds) Travaux Linguistiques de Prague. University of Alabama Press, pp. 225-40.
Deane, P. D. (1987) English possessives, topicality, and the Silverstein Hierarchy. In J. Aske et al. (eds) Proceedings of the Thirteenth Annual Meeting of the Berkeley Linguistics Society. Berkeley: Berkeley Linguistics Society, pp. 64-76.
Deane, P. D. (1992) Grammar in Mind and Brain: Explorations in Cognitive Syntax. Berlin, New York: Mouton de Gruyter.
Dehe, N. (2000) On particle verbs in English: more evidence from information structure. In N. M. Antrim, G. Goddall, M. Schulte-Nafeh and V. Samiian (eds) Proceedings of the 28th Western Conference on Linguistics. Fresno, CA: California State University, pp. 92-105.
Dell, G. S. (1986) A spreading activation theory of retrieval in sentence production. Psychological Review 93: 283-321.
Dell, G. S. and P. A. Reich (1981) Stages in sentence production: an analysis of speech error data. Journal of Verbal Learning and Verbal Behaviour 20: 611-29.
Dell, G. S. et al. (1997) Lexical access in aphasic and nonaphasic speakers. Psychological Review 104: 801-38.
Dell, G. S. and P. G. O'Seaghdha (1991) Stages of lexical access in language production. Cognition 42: 287-314.
Dell, G. S., F. Chang and Z. M. Griffin (1999) Connectionist models of language production: lexical access and grammatical encoding. Cognitive Science 23: 517-42.
Den Dikken, M. (1995) Particles: On the Syntax of Verb-Particle, Triadic, and Causative Constructions. Oxford: Oxford University Press.
Deutschbein, M. (1917) System der neuenglischen Syntax. Cöthen: Verlag von Otto Schulze.
Dirven, R. and G. Radden (1977) Semantische Syntax des Englischen. Wiesbaden: Athenaion.
Emonds, J. (1972) Evidence that indirect object movement is a structure preserving rule. Foundations of Language 5: 546-61.
Erades, P. A. (1961) Points of modern English syntax. English Studies 42: 56-60.
Fairclough, N. L. (1965) Some English Phrasal Types: Studies in the Collocation of Lexical Items with Prepositions and Adverbs in a Corpus of Spoken and Written Present-Day English. MA Thesis, University College London.
Fasold, R. (1990) The Sociolinguistics of Language. Oxford: Basil Blackwell.
Fass, D. (1991) Met*: a method for discriminating metonymy and metaphor by computer. Computational Linguistics 17: 49-90.
Fay, D. (1980) Transformational errors. In V. A. Fromkin (ed.) Errors in Linguistic Performance. New York: Academic Press, pp. 441-68.
Ferreira, V. (1996) Is it better to give than to donate? Syntactic flexibility in language production. Journal of Memory and Language 35: 724-55.
Fijn van Draat, P. (1921) The place of the adverb: a study in rhythm. Neophilologus 6: 56-88.
Fischer, S. D. (1971) 'The Acquisition of Verb-Particle and Dative Constructions'. Doctoral dissertation, Cambridge, MA: Massachusetts Institute of Technology.
Francis, W. N. (1958) The Structure of American English. New York: Ronald Press.
Fraser, B. (1965) 'An Examination of the Verb-Particle Construction in English'. Doctoral dissertation, Cambridge, MA: Massachusetts Institute of Technology.
Fraser, B. (1966) Some remarks on the Verb-Particle Construction in English. In F. P. Dinneen (ed.) Problems in Semantics, History of Linguistics, Linguistics and English. Washington, DC: Georgetown University Press, pp. 45-61.
Fraser, B. (1974) The phrasal verb in English. By Dwight Bolinger. Language 50: 568-75.
Fraser, B. (1976) The Verb-Particle Combination in English. New York: Academic Press.
Frazier, L. (1979) On Comprehending Sentences: Syntactic Parsing Strategies. Bloomington, IN: Indiana University Linguistics Club.
Frazier, L. (1985) Syntactic complexity. In D. Dowty, L. Karttunen and A. Zwicky (eds) Natural Language Parsing: Psychological, Computational, and Theoretical Perspectives. Cambridge: Cambridge University Press, pp. 129-89.
Gibbs, R. W. Jr. (1994) The Poetics of Mind: Figurative Thought, Language, and Understanding. Cambridge: Cambridge University Press.
Givon, T. (1979) On Understanding Grammar. New York: Academic Press.
Givon, T. (ed.) (1983) Topic Continuity in Discourse: A Quantitative Cross Language Study. Amsterdam, Philadelphia: John Benjamins.
Givon, T. (1988) The pragmatics of word order: predictability, importance, and attention. In M. Hammond, E. A. Moravcsik and J. Wirth (eds) Studies in Syntactic Typology. Amsterdam, Philadelphia: John Benjamins, pp. 243-84.
Givon, T. (1992a) On interpreting text-distributional correlations: some methodological issues. In D. L. Payne (ed.) Pragmatics of Word Order Flexibility. Amsterdam, Philadelphia: John Benjamins, pp. 305-20.
Givon, T. (1992b) The grammar of referential coherence as mental processing instructions. Linguistics 30: 5-55.
Goldberg, A. (1992) The inherent semantics of argument structure. Cognitive Linguistics 3: 37-74.
Goldberg, A. (1995) Constructions: A Construction Grammar Approach to Argument Structure. Chicago: Chicago University Press.
Goldberg, A. (1996) Construction grammar. In E. K. Brown and J. Miller (eds) Concise Encyclopedia of Syntactic Theories. Oxford: Pergamon Press.
Goldman-Eisler, F. (1968) Psycholinguistics: Experiments in Spontaneous Speech. New York: Academic Press.
Gries, St. Th. (1999) Particle movement: a cognitive and functional approach. Cognitive Linguistics 10: 105-46.
Gries, St. Th. (2002) Preposition stranding in English: predicting speakers' behaviour. In V. Samiian (ed.) Proceedings of the 12th Western Conference on Linguistics 2000. Fresno, CA: California State University, pp. 230-41.
Gries, St. Th. (forthcoming) Towards a corpus-based identification of prototypical instances of constructions. International SCOIA Journal.
Gries, St. Th. and A. Stefanowitsch (submitted) Extending collostructional analysis: a corpus-based perspective on 'alternations'. University of Southern Denmark and University of Bremen.
Gropen, J., S. Pinker, M. Hollander, R. Goldberg and R. Wilson (1989) The learnability and acquisition of the dative alternation in English. Language 65: 203-57.
Guy, G. (1980) Variation in the group and the individual: the case of final stop deletion. In W. Labov (ed.) Locating Language in Time and Space. New York: Academic Press, pp. 1-36.
Guy, G. (1994) The phonology of variation. In K. Beals et al. (eds) Papers from the Thirtieth Regional Meeting of the Chicago Linguistic Society, Vol. 2: The Parasession on Variation in Linguistic Theory. Chicago: Chicago Linguistics Society, pp. 133-49.
Haiman, J. (1992) Iconicity. In W. Bright (ed. in chief) International Encyclopedia of Linguistics. Vol. 2. New York, Oxford: Oxford University Press, pp. 191-5.
Halliday, M. A. K. (1985) An Introduction to Functional Grammar. London: Edward Arnold.
Hawkins, J. A. (1991) Syntactic weight versus information structure in word order variation. In J. Jacobs (ed.) Informationsstruktur und Grammatik. Opladen: Westdeutscher Verlag, pp. 196-220.
Hawkins, J. A. (1994) A Performance Theory of Order and Constituency. Cambridge: Cambridge University Press.
Hawkins, J. A. (2000) The relative order of prepositional phrases in English: going beyond manner-place-time. Language Variation and Change 11: 231-66.
Heliel, M. H. (1994) Verb-particle combinations in English and Arabic. In R. de Beaugrande, S. A. Talal and M. H. Heliel (eds) Language, Discourse and Translation in the West and Middle East. Amsterdam/Philadelphia: John Benjamins, pp. 141-52.
Hillert, D. (1998) Verb processing in German and English: ambiguity, discontinuous forms, and thematic complexity. In D. Hillert (ed.) Syntax and Semantics 31: A Crosslinguistic Perspective. New York: Academic Press, pp. 247-63.
Hiltunen, R. (1983) The Decline of the Prefixes and Beginnings of the English Phrasal Verb: The Evidence from Some Old and Early Middle English Texts. Turku.
Horton, W. S. and B. Keysar (1996) When do speakers take into account common ground? Cognition 59: 91-117.
Hudson, R. (1997) Inherent variability and linguistic theory. Cognitive Linguistics 8: 73-108.
Hunter, P. J. (1981) Verb-Particle Position in English. Unpublished M.A. Thesis, University of Alberta.
Hyams, N., J. Schaeffer and K. B. Johnson (1993) On the Acquisition of VPCs. Manuscript, University of California at Los Angeles and University of Amherst.
Johnson, K. B. (1991) Object positions. Natural Language and Linguistic Theory 9: 577-636.
Johnson, K. B. (1992) Scope and the Binding Theory: comments on Zubizarreta. In J. A. Stowell and E. Wehrli (eds) Syntax and the Lexicon: Syntax and Semantics 26. New York, San Diego, London: Academic Press, pp. 259-75.
Kay, P. (1995) Construction grammar. In J. Verschueren, J.-O. Ostman and J. Blommaert (eds) Handbook of Pragmatics: Manual. Amsterdam, Philadelphia: John Benjamins, pp. 171-7.
Kayne, R. (1985) Principles of particle constructions. In J. Gueron, H.-G. Obenauer and J.-Y. Pollock (eds) Grammatical Representation. Dordrecht: Foris, pp. 101-40.
Kennedy, A. G. (1920) The Modern English Verb-Adverb Combination. Stanford: Stanford University Publication.
Kennedy, G. (1998) An Introduction to Corpus Linguistics. London: Longman.
Kiffer, T. (1965) A Diachronic and Synchronic Analysis and Description of English Phrasal Verbs. Doctoral Dissertation, Pennsylvania State University.
Kimball, J. (1973) Seven principles of surface structure parsing in natural language. Cognition 2: 15-49.
Kroch, A. S. (1994) Morphosyntactic variation. In K. Beals et al. (eds) Papers from the Thirtieth Regional Meeting of the Chicago Linguistic Society, Vol. 2: The Parasession on Variation in Linguistic Theory. Chicago: Chicago Linguistics Society, pp. 180-201.
Kruisinga, E. and P. A. Erades (1953) An English Grammar. Vol. I. Groningen: P. Noordhoff.
Labov, W. (1969) Contraction, deletion and inherent variability of the English copula. Language 45: 715-62.
Labov, W. (1975) Empirical foundations of linguistic theory. In R. Austerlitz (ed.) The Scope of American Linguistics. Lisse: The Peter de Ridder Press, pp. 77-133.
Lakoff, G. (1972) Hedges: a study in meaning criteria and the logic of fuzzy concepts. In P. Peranteau, J. N. Levi and G. C. Phares (eds) Papers from the Eighth Regional Meeting of the Chicago Linguistic Society. Chicago: Chicago Linguistics Society, pp. 183-228.
Lakoff, G. (1987) Women, Fire, and Dangerous Things. Chicago: University of Chicago Press.
Lakoff, G. (1991) Cognitive vs. generative linguistics: how commitments influence results. Language and Communication 11: 53-62.
Lakoff, G. and M. Johnson (1980) Metaphors We Live By. Chicago: University of Chicago Press.
Lamb, S. (1999) Pathways of the Brain. Amsterdam, Philadelphia: John Benjamins.
Lambrecht, K. (1994) Information Structure and Sentence Form: Topic, Focus, and the Mental Representations of Discourse Referents. Cambridge: Cambridge University Press.
Langacker, R. W. (1987) Foundations of Cognitive Grammar: Theoretical Prerequisites. Stanford: Stanford University Press.
Langacker, R. W. (1997) Constituency, dependency, and conceptual grouping. Cognitive Linguistics 8: 1-32.
Leech, G., B. Francis and X. Xu (1994) The use of computer corpora in the textual demonstrability of gradience in linguistic categories. In C. Fuchs and B. Victorri (eds) Continuity in Linguistic Semantics. Amsterdam, Philadelphia: John Benjamins, pp. 57-76.
Legum, S. (1968) The verb-particle construction in English, basic or derived? In B. J. Darden, C.-J. Bailey and A. Davison (eds) Papers from the Fourth Regional Meeting of the Chicago Linguistic Society. Chicago: Chicago Linguistics Society, pp. 50-62.
Levelt, W. J. M. (1989) Speaking: From Thinking to Articulation. Cambridge, MA: The MIT Press.
Levin, H., I. Silverman and B. Ford (1967) Hesitations in children's speech during explanation and description. Journal of Verbal Learning and Verbal Behaviour 6: 560-4.
Lindner, S. (1981) A Lexico-Semantic Analysis of English Verb-Particle Constructions with up and out. Doctoral Dissertation, University of California at San Diego.
Live, A. (1965) The discontinuous verb in English. Word 21: 428-51.
Lyons, J. (1981) Language and Linguistics: An Introduction. Cambridge: Cambridge University Press.
McClelland, J. L. and D. E. Rumelhart (1982) An interactive activation model of context effects in letter perception: Part 2. Psychological Review 89: 60-94.
McDonald, J. L., J. K. Bock and M. H. Kelly (1993) Word and world order: semantic, phonological, and metrical determinants of serial position. Cognitive Psychology 25: 188-230.
McDonald, J. and B. MacWhinney (1989) Maximum likelihood models for sentence processing. In B. MacWhinney and E. Bates (eds) The Crosslinguistic Study of Sentence Processing. Cambridge: Cambridge University Press, pp. 397-421.
MacDonald, M. C. (1996) Representation and activation in syntactic processing. In T. Inui and J. L. McClelland (eds) Attention and Performance XVI: Information Integration in Perception and Communication. Cambridge, MA: The MIT Press, pp. 433-56.
McEnery, T. and A. Wilson (1997) Corpus Linguistics. Edinburgh: Edinburgh University Press.
MacKay, D. G. (1970) Phoneme repetition in the structure of languages. Language and Speech 13: 199-213.
MacKay, D. G. (1971) Stress pre-entry in motor systems. American Journal of Psychology 84: 35-51.
MacKay, D. G. (1987) The Organization of Perception and Action. Berlin, Heidelberg, New York: Springer.
MacWhinney, B. (1989) Competition and connectionism. In B. MacWhinney and E. Bates (eds) The Crosslinguistic Study of Sentence Processing. Cambridge: Cambridge University Press, pp. 422-57.
Martin, P. (1990) The Phrasal Verb: Diachronic Development in British and American English. Doctoral Dissertation, Columbia University.
Master Metaphor List. 02.12.1999.
Mattaire, M. (1972/1712) The English Grammar. London.
Mazurkewich, I. and L. White (1984) The acquisition of the dative alternation: unlearning overgeneralizations. Cognition 16: 261-83.
Meroney, H. M. (1943) Old English upp, uppe, uppon and upon. Doctoral dissertation, University of Chicago.
Mitchell, T. F. (1958) Syntagmatic relations in linguistic analysis. Transactions of the Philological Society 56: 101-18.
Morgan, P. S. (1997) Figuring out figure out: metaphor and the semantics of the English verb-particle construction. Cognitive Linguistics 8: 327-57.
Moss, H. E., S. McCormick and L. K. Tyler (1997) The time-course of activation of semantic activation during spoken word recognition. Language and Cognitive Processes 12: 695-731.
Neeleman, A. (1994) Complex Predicates. Doctoral dissertation, Utrecht University.
Nelson, F. W. and H. Kucera (1982) Frequency Analysis of English Usage: Lexicon and Grammar. Boston: Houghton Mifflin.
Nisbett, R. E. and L. Ross (1980) Human Inference: Strategies and Shortcomings of Social Judgement. Englewood Cliffs, NJ: Prentice Hall.
Nisbett, R. E. and T. D. Wilson (1977) Telling more than we can know. Psychological Review 84: 231-59.
Nuyts, J. (1995) Functionalism vs. Formalism. In J. Verschueren, J.-O. Östman and J. Blommaert (eds) Handbook of Pragmatics: Manual. Amsterdam, Philadelphia: John Benjamins, pp. 293-300.
O'Dowd, E. M. (1994) Prepositions and Particles in English: A Discourse-Based, Unifying Account. Doctoral dissertation, University of Colorado at Boulder.
O'Dowd, E. M. (1998) Prepositions and Particle: A Discourse-Functional Account. Oxford: Oxford University Press.
Oakes, M. (1998) Statistics for Corpus Linguistics. Edinburgh: Edinburgh University Press.
Östman, J.-O. and T. Virtanen (1997) Theme, Comment, and Newness: Revisiting the Oh-So-Often Revisited. Paper presented at the International Cognitive Linguistics Conference, Amsterdam.
Palmer, F. R. (1988) The English Verb. 3rd ed. London: Longman.
Peters, J. (1999) Discourse Factors Influencing the Ordering of Constituents in the Verb Particle Construction. Unpublished MA thesis, University of Alberta.
Peters, J. (2001) Given vs. new information influencing constituent ordering in the verb-particle construction. In R. Brend, A. K. Melby and A. Lommel (eds) LACUS Forum XXVII: Speaking and Comprehending. Fullerton, CA: LACUS, pp. 133-40.
Potter, S. (1965) English phrasal verbs. Philologica Pragensia 8: 285-9.
Prince, E. F. (1981) Toward a taxonomy of given-new information. In P. Cole (ed.) Radical Pragmatics. New York: Academic Press, pp. 223-56.
Prince, E. F. (1991) On functional explanation in linguistics and the origins of language. Language and Communication 11: 79-82.
Quirk, R., S. Greenbaum, G. Leech and J. Svartvik (1985) A Comprehensive Grammar of the English Language. London, New York: Longman.
Radford, A. (1988) Transformational Grammar: A First Course. Cambridge: Cambridge University Press.
Roberts, M. H. (1936) The antiquity of the Germanic verb-adverb locution. Journal of English and German Philology 35: 466-81.
Rohdenburg, G. (1996) Cognitive complexity and increased grammatical explicitness in English. Cognitive Linguistics 7: 149-82.
Rohdenburg, G. and B. Mondorf (eds) (2003) Determinants of Grammatical Variation in English. Berlin, New York: Mouton de Gruyter.
Rohrbacher, B. (1994) English main verbs move never. Penn Review of Linguistics 18: 145-59.
Rosch, E. and C. B. Mervis (1975) Family resemblances: studies in the internal structure of categories. Cognitive Psychology 7: 573-605.
Ross, J. R. (1986) Infinite Syntax. Norwood: Ablex.
Roth, G. (1997) Das Gehirn und seine Wirklichkeit: Kognitive Neurobiologie und ihre philosophischen Konsequenzen. Frankfurt am Main: Suhrkamp.
Rumelhart, D. E. and J. L. McClelland (1981) An interactive activation model of context effects in letter perception: Part 1. Psychological Review 88: 375-407.
Schlüter, J. (2003) Phonological determinants of grammatical variation in English: Chomsky's worst possible case. In G. Rohdenburg and B. Mondorf (eds) Determinants of Grammatical Variation in English. Berlin, New York: Mouton de Gruyter.
Schreuder, R. et al. (1990) Lexical processing, morphological complexity and reading. In D. A. Balota, G. B. Flores D'Arcais and K. Rayner (eds) Comprehension Processes in Reading. Hillsdale, NJ: Lawrence Erlbaum, pp. 125-41.
Schutze, C. T. (1996) The Empirical Base of Linguistics: Grammaticality Judgements and Linguistic Methodology. Chicago: University of Chicago Press.
Selkirk, E. O. (1984) Phonology and Syntax: The Relation between Sound and Structure. Cambridge, MA: MIT Press.
Shluktenko, J. A. (1955) Über die sogenannten zusammengesetzten Verben vom Typ stand up in der englischen Sprache der Gegenwart. Sowjetwissenschaft 8: 223-35.
Siewierska, A. (1988) Word Order Rules. London: Croom Helm.
Siewierska, A. (1993) Syntactic weight vs information structure and word order variation in Polish. Journal of Linguistics 29: 233-65.
Stallings, L. M., M. C. MacDonald and P. O'Seaghdha (1998) Phrasal ordering constraints in sentence production: phrase length and verb disposition in Heavy-NP Shift. Journal of Memory and Language 39: 392-417.
StatSoft, Inc. (1999) Electronic Statistics Textbook. Tulsa, OK: StatSoft.
Stemberger, J. P. (1982a) The nature of segments in the lexicon: evidence from speech errors. Lingua 56: 235-59.
Stemberger, J. P. (1982b) Syntactic errors in speech. Journal of Psycholinguistic Research 11: 313-45.
Stemberger, J. P. (1985) An interactive activation model of language production. In A. Ellis (ed.) Progress in the Psychology of Language. London: Routledge, pp. 143-86.
Stowell, T. A. (1981) Origins of Phrase Structure. Doctoral dissertation, Massachusetts Institute of Technology at Cambridge, MA.
Stucky, S. (1987) Configurational variation in English: a study of extraposition and related matters. In G. J. Huck and A. E. Ojeda (eds) Discontinuous Constituency: Syntax and Semantics 20. New York: Academic Press, pp. 377-405.
Sweet, H. A. (1892) A New English Grammar. Oxford: Clarendon Press.
Taha, A. K. (1960) The structure of two-word verbs in English. Language Learning 10: 115-22.
Taha, A. K. (1964) The structure of two-word verbs in English. In H. B. Allen (ed.) Readings in Applied English Linguistics. New York: Meredith, pp. 130-6.
Talmy, L. (1985) Lexicalization patterns: semantic structure in lexical forms. In T. Shopen (ed.) Language Typology and Syntactic Description 3: Grammatical Categories and the Lexicon. Cambridge: Cambridge University Press, pp. 57-149.
Taylor, J. R. (1995) Linguistic Categorisation: Prototypes in Linguistic Theory. 2nd ed. Oxford: Clarendon Press.
Tomlin, R. S. (1986) Basic Word Order: Functional Principles. London, Sydney: Croom Helm.
Tyler, L. K. and W. D. Marslen-Wilson (1977) The on-line effects of semantic context and syntactic processing. Journal of Verbal Learning and Verbal Behaviour 16: 683-92.
Urban, S. and A. D. Friederici (1999a) Zur Bereitstellung von Verb-Argument-Strukturen im Sprachverstehensprozeß: Evidenzen aus ereigniskorrelierten Hirnpotential-Untersuchungen. In I. Wachsmuth and B. Jung (eds) Proceedings der 4. Fachtagung der Gesellschaft für Kognitionswissenschaften Bielefeld (28.9.-1.10.99). Sankt Augustin: Infix-Verlag.
Urban, S. and A. D. Friederici (1999b) Zur Verarbeitung komplexer Verben: Zwei EKP-Studien mit separierbaren komplexen Verben. In E. Schröger, A. Mecklinger and A. Widmann (eds) Experimentelle Psychologie. Beiträge zur 41. Tagung experimentell arbeitender Psychologen, Leipzig (28.3.-1.4.1999). Lengerich: Pabst Science Publishers.
Van Dongen, W. A., Sr. (1919) He puts on his hat and He puts his hat on. Neophilologus 4: 322-53.
Van Hout, R. (1995) Statistics. In J. Verschueren, J.-O. Östman and J. Blommaert (eds) Handbook of Pragmatics: Manual. Amsterdam, Philadelphia: John Benjamins, pp. 624-7.
Vennemann, T. (1988) Preference Laws for Syllable Structure and the Explanation of Sound Change: With Special Reference to German, Germanic, Italian, and Latin. Berlin, New York: Mouton de Gruyter.
Vestergaard, T. (1974) Review of Bolinger (1971). English Studies 55: 303-8.
Von Schon, C. V. (1977) The Origin of Phrasal Verbs in English. Doctoral Dissertation, State University of New York at Stony Brook.
Wardaugh, R. (1986) An Introduction to Sociolinguistics. Oxford: Basil Blackwell.
Wasow, Th. (1997a) End-weight from the speaker's perspective. Journal of Psycholinguistic Research 26: 347-61.
Wasow, Th. (1997b) Remarks on grammatical weight. Language Variation and Change 9: 81-105.
Werner, J. (1997) Lineare Statistik: Das Allgemeine Lineare Modell. Weinheim: Psychologie Verlags Union.
Western, A. (1906) Some remarks on the use of English adverbs. Englische Studien 6: 75-100.
Winer, B. J., D. R. Brown and K. M. Michels (1991) Statistical Principles in Experimental Design. 3rd ed. New York: McGraw-Hill.
Yeagle, R. (1983) 'The Syntax and Semantics of English VPCs with off: A Space Grammar Analysis'. Unpublished MA thesis, Southern Illinois University at Carbondale.
Zöfel, P. (1992) Statistik in der Praxis. 3rd rev. and extended edition. Stuttgart, Jena: Gustav Fischer.
Dictionaries/works used for the list of TPVs
Benson, M., E. Benson and R. Ilson (1997) BBI Dictionary of English Word Combinations. Revised edition. Amsterdam, Philadelphia: John Benjamins.
Cambridge International Dictionary of Phrasal Verbs (1997) Cambridge: Cambridge University Press.
Collins Cobuild E-Dict CD-ROM (1999) London: Harper Collins.
Courtney, R. (1983) Longman Dictionary of Phrasal Verbs. London: Longman.
Cowie, A. P. and R. Mackin (1975) Oxford Dictionary of Current Idiomatic English. Volume 1: Verbs with Prepositions and Particles. Oxford: Oxford University Press.
Cowie, A. P. and R. Mackin (1993) Oxford Dictionary of Phrasal Verbs. Oxford: Oxford University Press.
Rudzka-Ostyn, B. (1999) English Phrasal Verbs: A Cognitive Approach. Manuscript.1
Note
1 I thank Paul Ostyn for generously providing this manuscript for my study.
Subject index
acquisition 142, 154-5 n.11, 190
activation
  discourse-functional 48-52, 56-7, 74, 89-95, 99
  noise 160-1, 175-6
  psycholinguistic 7, 9, 62 n.2, 130 n.34, 144-5, 158-81, 187-9
  resting level 159-60, 166-7, 169, 174, 175
attention 5-6, 48, 63 n.6
CART 109, 115-18, 174-5, 186
categorization 5-6, 132-43, 186
Competition Model 7, 143, 157-9, 174, 176
conflict validity 158-9, 186
construal 6, 24, 26
Construction Grammar 6, 53, 139-40, 154 n.10, 190
cue validity 133, 153 n.3, 158
Dative Alternation 1, 47, 120, 151, 153-4 n.5, 189
diachrony 3-5, 21, 142, 190
discriminant analysis 108-18, 134-9, 174-5, 186, 189
EIC 37 n.7, 59-60, 65 n.14, 146-52, 178-80
focus
  end-focus 5, 24-5, 52-3, 55-6, 58, 59-60
  end-weight, see end-focus
  semantic focus, see variables: semantic focus
General Linear Model (GLM) 108-9, 113
given, see variables: ActPC/Dtlm, LM, TOPM
identifiability 49-51, 56-7
importance, see variables: ClusSC/Dtnm, TOSM
inhibition 160, 182 n.6-7
LISREL, see structural equation modelling
metaphor 16, 63 n.8, 72, 77 n.9-10, 129
new, see variables: ActPC/Dtlm, LM, TOPM
preposition vs. adverb 2, 17
Preposition Stranding 43 n.34, 189, 191 n.1
priming 119-20, 131 n.37, 154 n.6, 173-4, 181, 186
Processing Hypothesis 8, 48-52, 55, 59, 60-1, 62 n.1, 82, 85, 87-90, 92, 99, 111, 113-15, 120, 124-5, 130 n.33, 131 n.38, 132, 185-7
prototypes/prototypicality 6, 8-9, 132-43, 152-3 n.2-3, 186, 190
rhythmic alternation 40 n.17
structural equation modelling 157, 178-80, 187-9
syllable structure 120-1
topicality, see variables: ActPC/Dtlm, TOPM
variable rules 143-6
variables
  all 23, 61-2, 98-101, 103, 106, 110-11, 114-17, 177
  ActPC/Dtlm 19-20, 28-9, 39 n.13, 51, 57, 58, 65 n.13, 66 n.17, 72-4, 77 n.12, 90, 99-100, 119, 129 n.24, 131 n.36, 167, 170
  ANIMACY 31, 55-6, 71, 88-9, 170, 185
  ClusSC/Dtnm 21, 26, 77 n.12, 93, 99, 167
  cognitive entrenchment 16-17, 29-31, 33, 41 n.22, 41 n.26, 88, 95
  ConPC 27-28, 40 n.20, 51, 72-4, 91-2, 99-100, 119, 167, 170, 183 n.21
  ConSC 27-28, 40 n.20, 72-4, 94, 99, 167
  COMPLEX 14-15, 19, 32-3, 37 n.8, 48, 57-8, 69-71, 80-3, 98, 100, 101, 104-5, 156 n.18, 170-1, 178-80
  CONCRETE 31, 53, 71, 88-9, 98, 100, 128-9 n.18, 169-70, 185
  DET 14, 37 n.5, 56-7, 69-70, 86-7, 100, 101, 102, 104, 119, 171, 186
  DISFLUENCY 22, 60-1, 75, 96-7, 99, 119, 123, 173
  habitual meaning of the verb phrase 16, 24, 33, 77-8 n.13, 130 n.28
  IDIOMATICITY 15-16, 37 n.9, 38 n.10, 52-6, 58, 63-4, 72, 87-8, 100-1, 119, 128-9 n.18, 131 n.36, 154-5 n.11, 167-9, 188
  length of the direct object 14, 19, 37 n.6-7, 57-9, 66 n.17, 70-1, 83-5, 98, 100, 101, 119, 128-9 n.18, 19, 131 n.36, 151, 170-1, 178-80
  LENGTHS, see length of the direct object
  LENGTHW, see length of the direct object
  LM 19-20, 41 n.21, 51, 72-4, 89-90, 98, 104-5, 129 n.22, 166, 170
  news value of the direct object 18-20, 57, 72-4, 150
  NM 26, 72-4, 92, 167
  OM 27, 72-4, 94-5, 128 n.18
  PART-PREP 21-2, 60, 65 n.16, 75, 96, 173
  phonetic shape of the verb 13, 22-4, 33-4, 39-40 n.16, 77-8 n.13, 130 n.28
  PP 21, 58-9, 65 n.14, 15, 74-5, 95-6, 101, 128-9 n.18, 130 n.27, 173
  presence of a directional adverbial, see PP
  production and planning effects, see DISFLUENCY
  REGISTER 36, 69, 97, 100-1, 130 n.33, 185-6, 189
  semantic focus of the verb phrase 16-18, 24-5, 38 n.12, 52, 77 n.11
  semantic modification of the particle 16, 55, 72, 130 n.28
  stress 12-13, 17, 24-5, 38 n.11, 58, 78 n.13, 123, 130 n.28, 147, 156 n.17, 172-3
  TOPM 19-20, 26-9, 39 n.13, 51, 57, 58, 65 n.13, 66 n.17, 72-4, 91, 95, 99-100, 119, 128 n.17, 129 n.22, 131 n.36, 155 n.14, 166-7, 170
  TOSM 21, 39 n.14, 26-7, 40 n.18, 19, 72-4, 93-4, 95, 99, 128 n.18, 167
  TYPE 13, 19, 31, 56-7, 69-70, 85-6, 98, 100, 102, 105, 107, 129 n.24, 130 n.30, 131 n.36, 171-2, 174, 186
Author index*
Arnold, J. E. 22, 39 n.15, 43 n.31, 60-1, 75, 173
Arnold, J. E. et al. 43 n.31, 47, 151
Bates, E. 7, 9, 47, 155 n.12, 157-9, 171, 181 n.2
Behaghel, O. 38 n.12, 59
Berg, T. 160, 172, 181 n.1
Bock, J. K. 18, 40 n.19, 53, 55, 77 n.7, 120, 164, 169-70, 171, 182 n.10
Bolinger, D. 9 n.4, 10 n.4, 12, 16-18, 19-20, 24-5, 38 n.10, 52, 64 n.10, 72, 87-8, 129 n.24
Bolkestein, A. M. 1, 27-8, 62 n.4, 74, 92
Browman, C. P. 43 n.31, 47, 56, 120, 155 n.13, 189
Chen, P. 14-15, 20-1, 26-8, 35, 39 n.14, 40 n.19, 42 n.28, 76 n.3, 5, 84, 86, 90-4, 99, 127 n.10, 188
Cowie, A. P. 10 n.4, 15, 72
Deane, P. D. 16-17, 28-30, 41 n.26, 50
Dell, G. S. 7, 9, 157, 159, 161, 163-4, 182 n.5-6, 8, 10, 11, 16-8, 183 n.18
Erades, P. A. 9 n.4, 12, 13, 18
Francis, B. 43 n.31, 47, 145-6, 187
Fraser, B. 9 n.4, 13-16, 21, 22-4, 31, 33-4, 37 n.4, 5, 40 n.17, 43 n.31, 55, 75 n.1, 82, 86-7
Givon, T. 8-9, 10 n.8, 14, 20-1, 26, 39 n.14, 40 n.19, 41 n.23, 42 n.28, 44, 48, 50, 57, 76 n.6, 81, 92, 125-6 n.3, 146, 155 n.12, 181
Goldberg, A. 6, 10 n.4, 53, 154 n.9-10, 190
Gries, St. Th. 14, 16-17, 20, 26-8, 29-31, 33, 35, 56, 76 n.3, 88, 131 n.39, 191 n.2
Hawkins, J. A. 7-9, 14, 35, 54, 56, 59, 64 n.12, 65 n.14, 76 n.3, 5, 82, 84, 132, 139, 142, 146-52, 156 n.17, 20, 157, 174, 178-80, 187-8
Johnson, M. 16, 63 n.8, 72, 77 n.10
Kennedy, G. 9 n.4, 22, 43 n.33, 75 n.1
Kruisinga, E. 9 n.4, 12, 13, 18
Lakoff, G. 6, 16, 31, 63 n.8, 72, 77 n.10, 133, 154 n.8
Lambrecht, K. 1, 47, 48-50, 62 n.3, 65 n.13, 139, 181
Langacker, R. W. 6, 38-9 n.12, 63 n.8, 64 n.10, 154 n.10
Leech, G. 43 n.31, 47, 145-6, 187
MacKay, D. G. 157, 160, 172, 182 n.5-6, 183 n.18, 184 n.22
Mackin, R. 10 n.1, 15, 72
McClelland, J. L. 9, 157, 182 n.6, 183 n.18
MacWhinney, B. 7, 9, 47, 155 n.12, 159, 174, 181 n.2
Mondorf, B. 1, 120
O'Dowd 9 n.3, 42 n.27, 75 n.1
Peters 18-19, 26, 28-9, 40-1 n.21, 128 n.16
Prince 27, 42 n.27, 50, 62 n.3
Quirk et al. 9 n.4, 10 n.4, 13, 25, 57
Risselada, R. 1, 27-8, 62 n.4, 74, 92
Rohdenburg, G. 1, 120
Rumelhart, D. E. 9, 157, 182 n.6, 183 n.18
Siewierska, A. 41 n.23, 43 n.31, 129 n.20, 146-7
Stemberger, J. P. 9, 157, 159-63, 168-9, 182 n.5-6, 9, 183 n.12, 14, 183 n.18
Van Dongen, W. A., Sr 12, 13, 35, 43 n.33, 65 n.15
Wasow, T. 15, 22, 39 n.15, 43 n.31, 60-1, 75, 124, 152, 173
Xu, X. 43 n.31, 47, 145-6, 187

* Only the authors whose work is most significantly connected to the core of the present study and who are mentioned several times throughout the book have been included here.