Rethinking Grammaticalization: New perspectives (Typological Studies in Language)

Rethinking Grammaticalization Typological Studies in Language (TSL) A companion series to the journal Studies in Lang...

Author: Mara Jose Lopez-Couso | Elena Seoane

60 downloads 1815 Views 3MB Size Report

This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!

Report copyright / DMCA form

DOWNLOAD PDF

Rethinking Grammaticalization

Typological Studies in Language (TSL) A companion series to the journal Studies in Language. Volumes in this series are functionally and typologically oriented, covering specific topics in language by collecting together data from a wide variety of languages and language typologies.

General Editor Michael Noonan

University of Wisconsin-Milwaukee

Assistant Editors Spike Gildea

University of Oregon

Suzanne Kemmer Rice University

Editorial Board Wallace Chafe

Matthew S. Dryer

Paul J. Hopper

Ronald W. Langacker

Doris L. Payne

Sandra A. Thompson

Bernard Comrie

John Haiman

Andrej A. Kibrik

Charles N. Li

Frans Plank

Dan I. Slobin

R.M.W. Dixon

Jerrold M. Sadock

Edith Moravcsik

Andrew Pawley

Bernd Heine

Santa Barbara San Diego

Leipzig / Santa Barbara Santa Barbara Melbourne Canberra

Buffalo

Oregon St Paul

Konstanz Chicago Köln

Volume 76 Rethinking Grammaticalization. New perspectives Edited by María José López-Couso and Elena Seoane

Pittsburgh

Santa Barbara Moscow

Berkeley

Milwaukee

Rethinking Grammaticalization New perspectives

Edited by

María José López-Couso Elena Seoane University of Santiago de Compostela

In collaboration with

Teresa Fanego

John Benjamins Publishing Company Amsterdam / Philadelphia

8

TM

The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ansi z39.48-1984.

Library of Congress Cataloging-in-Publication Data Rethinking grammaticalization : new perspectives / edited by María José López-Couso and Elena Seoane. p. cm. (Typological Studies in Language, issn 0167-7373 ; v. 76) Includes bibliographical references and index. 1. Grammar, Comparative and general--Grammaticalization. I. López-Couso, María José. II. Seoane, Elena. P299.G73R48 2008 415--dc22 2008009379 isbn 978 90 272 2988 5 (Hb; alk. paper) – isbn 978 90 272 2991 5 (set : alk. paper)

© 2008 – John Benjamins B.V. No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher. John Benjamins Publishing Co. · P.O. Box 36224 · 1020 me Amsterdam · The Netherlands John Benjamins North America · P.O. Box 27519 · Philadelphia pa 19118-0519 · usa

Table of contents Address list Preface Introduction: New perspectives on grammaticalization María José López-Couso and Elena Seoane

vii ix 1

Grammaticalization and the areal factor: The perspective of East and mainland Southeast Asian languages Walter Bisang

15

On the grammaticalization of ‘come’ and ‘go’ into markers of textual connectivity Philippe Bourdin

37

Grammaticalization, typology and semantics: Expanding the agenda Zygmunt Frajzyngier

61

Mismatch: Grammar distortion and grammaticalization Livio Gaeta

103

Areal convergence in grammaticalization processes Anna Giacalone Ramat

129

The grammaticalization of nominalizers in Japanese and Korean: A contrastive study Kaoru Horie

169

On the frills of grammaticalization Tania Kuteva

189

Nominalizations in Bodic languages Michael Noonan

219

On the rise and fall of Korean nominalizers Seongha Rhee

239

The grammaticalization of clausal nominalizers in Burmese Andrew Simpson

265

vi

Rethinking Grammaticalization: New Perspectives

The grammaticalization cline of cardinal numerals and numeral systems Ferdinand von Mengden

289

The development of nominalizers in East Asian and Tibeto-Burman languages Foong Ha Yap and Stephen Matthews

309

Index of names index of languages Index of subjects

343 348 350

Address list Walter Bisang Department of English and Linguistics Johannes Gutenberg University Mainz Jakob-Welder-Weg 18 D-55099 Mainz, Germany e-mail: [email protected] Philippe Bourdin Glendon College York University York Hall 264 2275 Bayview Avenue Toronto, Ontario M4N 3M6, Canada e-mail: [email protected] Zygmunt Frajzyngier Department of Linguistics Box 295 University of Colorado Boulder, CO 80309-0295 USA e-mail: [email protected] Livio Gaeta Dipartimento di Filologia Moderna Salvatore Battaglia Università di Napoli Federico II Via Porta di Massa 1 I-80133 Napoli, Italy e-mail: [email protected] Anna Giacalone Ramat Dipartimento di Linguistica Teorica e Applicata

Facoltà di Lettere e Filosofia Università di Pavia Corso Strada Nuova, 65 I-27100 Pavia, Italy e-mail: [email protected] Kaoru Horie Center for the Advancement of Higher Education Division of Liberal Education Programs Tohoku University Kawauchi 41, Aoba-ku, Sendai-shi 980-8576, Japan e-mail: [email protected] Tania Kuteva Anglistik III - Department of English Language and Linguistics Heinrich-Heine-Universität Düsseldorf Universitätsstrasse 1 D-40225 Düsseldorf, Germany e-mail: [email protected] María José López-Couso Department of English University of Santiago de Compostela Avda. de Castelao s/n E-15782 Santiago de Compostela, Spain e-mail: [email protected]

viii Rethinking Grammaticalization: New Perspectives

Stephen Matthews Department of Linguistics MB124, Main Building The University of Hong Kong Pokfulam, Hong Kong e-mail: [email protected] Michael Noonan Department of English University of Wisconsin Milwaukee, WI 53217 USA e-mail: [email protected] Seongha Rhee English Linguistics Department Hankuk University of Foreign Studies 270 Imun-dong, Dongdaemun-gu Seoul 130-791, Republic of Korea e-mail: [email protected] Elena Seoane Department of English University of Santiago de Compostela Avda. de Castelao s/n E-15782 Santiago de Compostela, Spain e-mail: [email protected] Andrew Simpson Department of Linguistics 3601 Watt Way Grace Ford Salvatori Hall 301 University of Southern California Los Angeles, CA90089-1693 USA e-mail: [email protected] Ferdinand von Mengden Institut für Anglistik und Amerikanistik Universität Hamburg Von-Melle-Park 6

D-20146 Hamburg, Germany e-mail: [email protected] Foong Ha Yap 303A Fung King Hey Building, Main Campus Department of Linguistics and Modern Languages The Chinese University of Hong Kong Shatin, New Territories, Hong Kong e-mail: [email protected]

Preface This book and its companion volume (Theoretical and empirical issues in grammaticalization, TSL 77) offer a representative sample of the contributions presented at the Third International Conference New Reflections on Grammaticalization (NRG3), held at the University of Santiago de Compostela, 17–20 July 2005. This was the third in a series of conferences which began in 1999 with the symposium organized by Ilse Wischer at the University of Potsdam, followed by a second conference organized by Olga Fischer, Muriel Norde and Harry Perridon at the University of Amsterdam in 2002. The great success of these meetings and the growing interest in grammaticalization phenomena over the past two decades prompted us to invite our colleagues in the field to follow the Pilgrims’ Way to Santiago de Compostela in 2005 to share some further reflections on grammaticalization. Attendance at the conference was close to 200, and the papers delivered (after a rigorous selection procedure by the NRG3 scientific committee) came to 120, proving that interest in grammaticalization remains unabated. The distinguished panel of plenary speakers featured William Croft (then of the University of Manchester, now at the University of New Mexico), Zygmunt Frajzyngier (University of Colorado), Anna Giacalone Ramat (University of Pavia), Paul Kiparsky (Stanford University) and Tania Kuteva (University of Dusseldorf). Out of the events organized to run concurrently with the main programme, special mention should be made of the five very productive conference workshops, whose titles are strongly suggestive of the wide range of interesting topics covered there: Reconstructing grammaticalization and the definition of grammar, Grammaticalization and language contact, English intensifiers and grammaticalization, Prototypes and grammaticalization: Grammaticalization as prototype? and The grammaticalization of nominalizers: East Asian perspectives. A great many people have assisted us in organizing the conference and editing this book and its companion. First and foremost, though, we owe a special debt of gratitude to the leader of our research group, Variation, Linguistic Change and Grammaticalization, Professor Teresa Fanego, whose invaluable guidance and support was unfailing throughout the whole process. Special thanks are also due to the two other core members of the group, Belén Méndez-Naya and Paloma Núñez-Pertejo; without their hard work, neither the conference nor these volumes would have been possible. To all three of you, our most heartfelt thanks. We would like to express our gratitude to all the participants for helping to make the conference such a success and to all the authors for their cooperation

x

Rethinking Grammaticalization: New Perspectives

in the edition of the two volumes. We are also greatly indebted to the following colleagues who acted as anonymous external reviewers, for their insightful suggestions for improvements: Anders Ahlqvist, Minoji Akimoto, Henning Andersen, Umberto Ansaldo, John Ole Askedal, Joan Beal, Balthasar Bickel, Walter Bisang, David Bradley, Laurel J. Brinton, Concepción Company Company, Bert Cornillie, Sonia Cristofaro, William Croft, Hubert Cuyckens, Östen Dahl, Nicole Delbecque, Hans den Besten, Holger Diessel, Gabriele Diewald, Nina R. Dobrushina, Teresa Fanego, Gisella Ferraresi, Zygmunt Frajzyngier, Victor A. Friedman, Anna Giacalone Ramat, Tom Güldemann, Florian Haas, Jan Heegard, Paulette M. Hopple, Daniel Jacob, Brian D. Joseph, Johannes Kabatek, Torodd Kinn, Maria Koptjevskaja-Tamm, Bernd Kortmann, Tania Kuteva, Ursula Lenker, Torsten Leuschner, Diana M. Lewis, Belén Méndez-Naya, Marianne Mithun, Michael Noonan, Muriel Norde, Hella Olbertz, Harry Perridon, Paolo Ramat, Matti Rissanen, Anette Rosenbach, Roula Svorou, Johan van der Auwera, Victoria Vázquez-Rozas and Thora Vinther. Special thanks are due to all the student helpers (over twenty of them!) who took care of every little detail before and during the conference. The help of our highly efficient editorial assistants, Pilar Castillo, Lidia Gómez and Iria Fernández, and that of several becarios de colaboración have been crucial in regard to various aspects of the edition of the volumes. Thanks are also due to the editorial staff of John Benjamins, in particular Kees Vaes, and the general editor of the TSL series, Michael Noonan, for their cooperation. Finally, we gratefully acknowledge the financial support of the Autonomous Government of Galicia (grants nos. PGIDIT05PXIC20401PN and 2006/XA-133) and the Spanish Ministry of Education and Science and the European Regional Development Fund (grants nos. HUM2004-00940 and HUM2007-60706). Santiago de Compostela, September 2007 María José López-Couso Elena Seoane

chapter 0

Introduction New perspectives on grammaticalization María José López-Couso and Elena Seoane University of Santiago de Compostela

Almost one hundred years after the publication of Meillet’s ground-breaking article “L’évolution des formes grammaticales” (1912), grammaticalization has become the object of considerable renewed interest, both in terms of its relevance to general linguistic theory and in its application to particular case studies in various languages and language families. The many publications on the topic which have appeared over the past few years testify to this revived interest, among them the collections of papers in Traugott and Heine (1991), Giacalone Ramat and Hopper (1998), Fischer, Rosenbach and Stein (2000), Bisang, Himmelmann and Wiemer (2004), as well as the seminal works of Bybee, Perkins and Pagliuca (1994), Lehmann (1995) and Hopper and Traugott (2003). That grammaticalization is still a hot issue today, after the turn of the century, is also confirmed by the wideranging, stimulating and inspiring discussion at recent international conferences on the topic: the three meetings of the series New Reflections on Grammaticalization (Potsdam 1999, Amsterdam 2002 and Santiago de Compostela 2005) and FITIGRA — From Ideational to Interpersonal: Perspectives from Grammaticalization (Leuven 2005).1 One reason for the attention that grammaticalization has received recently is that some of the basic tenets of grammaticalization theory, such as the principle of unidirectionality and its status as a distinct grammatical phenomenon, have been seriously called into question (cf. Newmeyer’s (1998) deconstruction of grammaticalization or the critical contributions in the special issue of Language Sciences (2001) edited by Lyle Campbell). Moreover, it has increasingly been acknowledged that the traditional characterizations of grammaticalization have a limited range of applicability and a need has been felt for a more comprehensive definition of the term. Several of the contributions in this volume and its 1. A selection of the papers presented at the NRG1 and NRG2 conferences has been edited by Wischer and Diewald (2002) and Fischer, Norde and Perridon (2004) respectively. The publications derived from the FITIGRA conference will appear in the John Benjamins series Typological Studies in Language (Cuyckens, Davidse and Verstraete, forthcoming) as well as in Topics in English Linguistics by Mouton de Gruyter (Cuyckens, Davidse and Vandelanotte, forthcoming).

2

María José López-Couso and Elena Seoane

companion (Theoretical and empirical issues in grammaticalization, also edited by Seoane and López-Couso, TSL 77) share these and other related concerns. From the rich programme of the NRG3 conference, the twelve contributions included in this volume were carefully selected to reflect the state of current research in grammaticalization and suggest possible directions for future investigations in the field. Combining theoretical discussions with the analysis of individual test cases, the selected papers focus on such central issues as the need for a broader notion of grammaticalization (Frajzyngier, Bisang), the areal perspective in grammaticalization (Bisang, Giacalone Ramat and the five papers devoted to nominalizers in East Asian and Tibeto-Burman languages), the relevance of contact-induced change to grammaticalization (Gaeta, Giacalone Ramat, Kuteva) and the distorting effects of grammaticalization on grammar (Gaeta, Kuteva). Other appealing topics discussed in the book include the development of markers of textual connectivity from a crosslinguistic perspective (Bourdin) and the emergence of cardinal numerals and numeral systems (von Mengden). In his paper “Grammaticalization, typology and semantics: Expanding the agenda,” Zygmunt Frajzyngier makes a strong case for broadening the scope of the term grammaticalization. In the wider scenario proposed by the author, grammaticalization subsumes the traditional definition of the term as the development of grammatical morphemes from lexical items (and from already grammatical to more grammatical forms) as one of the means of building grammatical systems, but also extends to cover the emergence of other devices used to code various formal, semantic or pragmatic functional domains. The expanded agenda of grammaticalization advocated by Frajzyngier includes, among others, the following coding means: phonological devices (e.g. tone, pauses or vowel reduction and retention), linear order, the repetition of lexical items or phrases and the grammaticalization of lexical categories and subcategories (e.g. locative predicators). Examples taken from the grammars of some Semitic, Chadic and Omotic languages illustrate the functions, mechanisms and motivations of the grammaticalization of these devices, which have largely been neglected in the relevant literature so far. Within this expanded model of grammaticalization, the paper also raises a number of interesting theoretical questions. Which are the conditions governing the choice of coding means in grammaticalization? Which are the consequences of grammaticalization for grammatical systems and for language change? What are the motivations for grammaticalization? In his answer to the last of these questions, Frajzyngier discusses three basic principles driving grammaticalization: (i) the principle of indirect means, whereby indirect rather than direct means of expression are used in functional domains involving interpersonal relations, (ii) the initial state of the language regarding the coding of some functional domains and the availability of coding means and, especially, (iii) the principle of function-

Introduction

al transparency, according to which the role of every form in the utterance must be transparent to the hearer, which is extensively tested on the coding of various grammatical relations in four Chadic languages (Wandala, Hona, Hdi and Mutsuvan). All in all, the broadening of the scope of grammaticalization proposed by Frajzyngier in this paper no doubt opens up new research challenges for grammaticalization studies in the twenty-first century. One particularly promising domain for future studies is areal grammaticalization, which is addressed by several of the essays in this volume in various ways and to varying degrees. Walter Bisang’s contribution (“Grammaticalization and the areal factor: The perspective of East and mainland Southeast Asian languages”), for example, stresses the need to integrate the areal perspective in grammaticalization and to develop models of grammaticalization which allow for the type of phenomena characteristic of the languages of East and mainland Southeast Asia. The paper discusses a number of properties of grammaticalization which seem to be specific from that area: the lack of obligatory categories, the pervasiveness of pragmatics in grammaticalization processes, the lack of morphologically integrated paradigms, the existence of rigid word-order patterns and the reduced coevolution of form and meaning. Three case studies are used to illustrate these properties: (i) the grammaticalization of the multifunctional Khmer verb ba:n ‘come to have,’ which shows how a wide range of inferences can be triggered by one and the same marker, (ii) Thai classifiers, another instance of multiple inference depending on the constructional context, and (iii) the rigid word-order patterns involving tense-aspect-modality markers, coverbs and directional verbs in Thai and Khmer. The author then proceeds to examine the aforementioned properties and their structural consequences from the perspective of the standard approaches to grammaticalization as found in Heine et al. (1991), Bybee et al. (1994), Lehmann (1995) and Hopper and Traugott (2003). His conclusion is that none of these models is fully adequate to account for the grammaticalization processes which take place in East and mainland Southeast Asian languages. Thus, for example, Lehmann’s (1995) framework takes the coevolution of form and meaning for granted and most of his criteria for measuring the autonomy of a linguistic sign do not apply to the grammaticalized items in the languages of this area. Similarly, the leading role of pragmatics and the distinctive relevance of constructions in the processes of grammaticalization discussed by Bisang do not easily fit into Heine et al.’s (1991) approach. Bybee et al. (1994) do indeed acknowledge the relevance of inference and integrate constructions in their model; however, their approach cannot deal with the relative unimportance of phonetic reduction and the limited coevolution of form and meaning found in East and mainland Southeast Asian languages. Hopper and Traugott’s (2003) model, on the other hand, accounts for grammaticalization in the languages at issue more satisfyingly, but only in connec-

3

4

María José López-Couso and Elena Seoane

tion with the early stages of the process, so that their approach is not completely satisfactory either. The examples discussed by Bisang in this article therefore prove that the languages of East and mainland Southeast Asia show a specific areal type of grammaticalization which has so far gone unnoticed by the leading theorybuilding researchers in the field. In view of this, the author draws our attention to the potential existence of similar neglected areal grammaticalization types in other language areas which are still to be discovered. Areality in East Asian languages is also the concern of five other articles in the volume, which all focus on nominalization phenomena, a topic which has attracted growing interest in recent years. These five contributions (by Kaoru Horie, Michael Noonan, Seongha Rhee, Andrew Simpson, and Foong Ha Yap and Stephen Matthews) constitute a selection of the papers presented at one of the very successful workshops organized during the conference, entitled The grammaticalization of nominalizers: East Asian perspectives.2 The workshop, convened by Foong Ha Yap, represented a concerted attempt to trace the development of nominalizers from multiple angles, including diachronic, synchronic and crosslinguistic approaches. Through detailed analyses of East Asian and Tibeto-Burman languages, the workshop presented areal evidence of recurring pathways and language-specific variations in the development of nominalizers across different language families and tried to clarify the close links between different types of nominalization constructions. Some of the major themes discussed in the different sessions included the question of how and from which sources nominalizers emerge, the relationship between adnominals/nominalizers and the tense-aspectand-modality systems, and the rise and fall of old and new nominalizers manifested in languages like Chinese, Japanese and Korean. The contribution by Foong Ha Yap and Stephen Matthews (“The development of nominalizers in East Asian and Tibeto-Burman languages”) derives from the overview paper of the workshop and serves as a lead into the four other articles included in the volume. The first part of the paper reviews earlier research on the development of nominalizers in various East Asian and Tibeto-Burman languages, including Classical and Lhasa Tibetan, Chantyal, Gurung, Lahu, Japanese, Okinawan, Korean and a number of Chinese dialects (Chaozhou, Cantonese and Mandarin). The authors draw out similarities and differences between grammaticalization pathways of nominalizer morphemes, clarify the close links between different types of nominalization phenomena and summarize areas of current thinking and future research. The emphasis is on the etymological sources of nominalizers in the selected languages and on crosslinguistically robust grammat2. Four other fruitful workshops were organized at NRG3. For the topics discussed, cf. the preface to this volume.

Introduction

icalization trajectories (e.g. from lexical source to pronominal to nominalizer or from nominalizer to other functions, such as relativizer, complementizer or stance marker). Yap and Matthews then explore in greater detail several issues observed in the first section of data which seem to require further investigation, such as the frequent interaction of nominalization with tense/aspect and with stance functions, the close link between nominalization and copula usage (for example, in adnominal constructions or in cleft structures) and the analysis of forgotten or residual nominalizers which we may be able to identify by reading backward from areal historical patterns. Kaoru Horie’s article, “The grammaticalization of nominalizers in Japanese and Korean: A contrastive study,” draws our attention to the complex nominalization system of Japanese and the way it compares with that of Korean, a language exhibiting remarkable structural similarities to Japanese. The paper describes how, in view of their functional extension and semantic generalization, Japanese overt sentential nominalizers (e.g. no, mono, koto, tokoro, among others) show differential degrees of grammaticalization and how, either independently or in combination with other forms such as case particles or copulas, they display a wide variety of grammatical uses: complementizer, cleft-construction marker, marker of internally-headed relative clauses, conjunction, marker of modal and aspectual constructions and sentence-final particle. Some of these grammaticalization pathways are also attested in other languages of East Asia, which points to the areal nature of versatile nominalization (cf. Noonan 1997). Interestingly, however, the grammaticalization of nominalizers in Japanese appears to exhibit a greater type variation and token instantiation than those found in other closely related languages. Thus, Horie’s detailed contrastive analysis of the grammaticalization patterns of overt sentential nominalizers in Japanese and Korean reveals that Japanese no and tokoro, for instance, show a higher degree of multifunctionality and attest a more advanced stage of grammaticalization than their Korean counterparts kes and tey. The paper further suggests that the contrast between Japanese and Korean nominalizers is not an isolated phenomenon, but is also observable in other grammatical domains. An example of this parallel is seen in the development of periphrastic aspectual constructions, whose grammaticalization is shown by Horie to have proceeded further in Japanese than in Korean. Nominalization in Korean is also the topic addressed in Seongha Rhee’s paper “On the rise and fall of Korean nominalizers,” though from a different perspective. The article provides a succinct summary of the changes which have affected the complex system of Korean nominalizers from Old Korean times to the present day. In this diachronic survey, the author explores interrelated issues in the historical development of a variety of nominalizers (e.g. -m, -ki, -ti and its variant -ci or -kes), among them the following: their emergence from various sources (mostly

5

6

María José López-Couso and Elena Seoane

nouns denoting ‘thing’ or ‘place’), the semantic changes they have gone through in their development from their known sources or the range of uses they display at the different stages in the history of Korean. Of particular interest is the detailed discussion of the competition between several nominalizing morphemes for the same functional space and the subsequent specialization which some of these forms have undergone over time or, otherwise, their acquisition of new functions in related domains. Rhee shows how in Modern Korean -kes stands out as the most common and also the most versatile and most abstract nominalizer, while other nominalizing morphemes have specialized for specific functions. Examples of specialization are found in -ki and -ci, both deriving from the same lexical source, which are now constrained to occur in affirmative and negative constructions respectively; specialization is also attested, for instance, in the different levels of illocutionary force shown by -m, -ki, -ci and -kes when used as sentential end-markers. In other cases, the functional overlap in the nominalization domain is avoided by the disappearance of the original nominalizing function on the part of some of the competing forms and the development of new uses. This is precisely the path followed by the former nominalizers -n and -l, which developed into adnominalizers from Middle Korean onwards. The paper by Michael Noonan, “Nominalizations in Bodic languages,” offers a useful survey of grammaticalization patterns among nominalizers in the Bodic group of Sino-Tibetan languages. As amply documented in Noonan (1997), nominalization phenomena are also pervasive in the languages of this region, where constructions headed by nominalizations are used far beyond the range of their core functions. A clear example of these extended uses of nominalizations is found in Chantyal, a Tamangic language spoken in Nepal, where nominalizations can serve up to ten different functions. As the author shows, the functional extension of nominalizations seen in Bodic languages is also attested in a number of neighbouring language families in the Indo-Altaic speech area, which suggests that the phenomenon reflects areal, rather than genetic, tendencies. Special attention is paid in the paper to a prominent feature of the grammar of many Bodic languages, namely the nominalization-relativization syncretism, i.e. the formal identity between the morphological marker signalling nominalizations and that marking relative clauses. The paper also addresses other issues related to the history of nominalizations in these languages, such as various additional uses of nominalizations (e.g. agent and patient nominals and the use of nominalized clauses as main clauses), the potential sources of nominalizing suffixes (typically generic nouns), the addition of the genitive to nominalizers when used adnominally in many languages of the Tamangic group or the innovations in nominalizations. Concerning the construction with the genitive, Noonan suggests that it may have been acquired through areal diffusion from the Tibetan Complex, rather than being

Introduction

reconstructed to the common ancestor of the Tibetan Complex and Tamangic, as maintained in Noonan (1997) and DeLancey (2005). As to the innovations in the system of nominalizations, the author discusses three different types of changes: (i) the elaboration of the categories coded by nominalizers, by means of the addition of specialized nominalizing suffixes or of the introduction of tense-aspect and person-number distinctions between nominalizers; (ii) the extension of the nominalization-relativization syncretism into novel functions, such as the generalization of the Chantyal nominalizer -wa as a general marker of attribution; and (iii) the loss of the nominalization-relativization syncretism, attested in Western Himalayish languages. The last contribution which furthers the study of nominalization phenomena in Asian languages is Andrew Simpson’s analysis of the grammaticalization of nominalizers in Burmese, a language with a particularly rich system of nominalizing elements. The paper opens with an overview of the morphosyntactic properties of nominalizers, their varied sources and the array of tests which can be used for the identification of nominalized constructions, including both lexical projection tests and traditional distributional tests (e.g. the ability to occur in subject position or to be coordinated with other nominals). The author then moves on to examine nominalization in two complementary varieties of Burmese, Colloquial and Literary Burmese, and apply the test criteria for nominalization to the data, focussing on the grammaticalization of sentential nominalizers. After a detailed analysis of the common and divergent morphosyntactic features of the Literary Burmese forms thii and mii and their Colloquial Burmese equivalents te and me, Simpson concludes that grammaticalization has progressed a significant stage further in the colloquial language than in its literary counterpart. He argues that te and me, once part of the nominalizer paradigm of Colloquial Burmese, no longer create nominal categories and are not to be considered nominalizers synchronically. Once their nominal specifications are lost, these former nominalizers are reanalysed as grammatical elements of a rather different type, instantiating verbrelated categories, such as the representation of realis/irrealis distinctions. Simpson’s comparison of the synchronic patterning observed in Colloquial and Literary Burmese is not only interesting in itself, but also provides deeper insight into the reanalysis of already grammaticalized morphemes and into a general theory of the grammaticalization of nominalization strategies. The relevance of language contact for grammaticalization studies has recently attracted considerable attention among linguists and has been recognized as being more important than previously thought. Canonical grammaticalization processes have commonly been attributed to universal tendencies and internally motivated changes (cf. Hopper and Traugott 2003, among many others); a number of papers in this volume try to demonstrate how the universal properties of gram-

7

8

María José López-Couso and Elena Seoane

maticalization can be legitimately reconciled with the necessary specificity of historical contact. One of these is Anna Giacalone Ramat’s “Areal convergence in grammaticalization processes.” In particular, her article offers a detailed discussion of four well-known developments within the European linguistic area, or Standard Average European (Haspelmath 2001), a well-established concept but one which is still in need of deeper insights into the mechanisms responsible for the areal distribution of its features and the role of contact. The author convincingly argues that language contact was involved in the genesis of four linguistic changes, as follows. The first is the development of periphrastic ‘have’-perfects in Romance, Germanic and other European languages, which she describes as the result of a process of areal diffusion which took place in two stages: firstly, Late Latin acts as the centre of innovation and the vehicle for the spread of ‘have’-perfects in Western Europe; secondly, periphrastic perfects are used as past tenses in some languages such as French, (Northern) Italian and German, a process that is well attested crosslinguistically but for which, in Giacalone Ramat’s view, an areal diffusion hypothesis should not be discarded, since the area where it occurs is limited and made up of contiguous sub-areas of intense and long-standing crosscultural contacts. The second development studied by the author is the extension of the definite article area in Europe, where some Slavic languages (especially spoken Czech, Sorbian and Slovenian) depart from the traditional Slavic pattern and are undergoing changes in their demonstrative systems leading to the emergence of definite articles, probably under the influence of German. The third change concerns the renewal of the relative pronoun paradigm in European languages, where the spread of inflecting relative pronouns in Romance and Germanic languages, the author argues, is the result of contact-induced grammaticalization and not a direct development from Latin. Finally, the fourth change analysed by Giacalone Ramat is the development of reflexive markers towards passive and impersonal constructions, for which three grammaticalization areas can be recognized; firstly, the area of core Standard Average European languages (Romance languages and German) with predominantly one clitic form for the reflexive and related functions; secondly, the Baltic Sea area, in which postfixes have emerged as the result of contact-induced change; and, thirdly, languages outside the Standard Average European core area (Hungarian, Turkish, Albanian, Greek, Celtic languages and Maltese), which exhibit distinct forms for the reflexive pronouns and middle/passive markers. In the cases examined in this paper contact therefore reveals itself as a factor which triggers and accelerates grammaticalization (cf. Bisang et al. 2004) and which does not contravene widely accepted grammaticalization principles like unidirectionality. Contact-induced change in grammaticalization is also the central topic of Tania Kuteva’s paper, “On the frills of grammaticalization.” Contrary to the trad-

Introduction

itional assumption that language contact brings about simplification, Kuteva shows how situations of intense language contact can be responsible for elaborateness or redundancy of marking (“frills”). The paper thus adds a significant new perspective to existing work on linguistic accretion, such as the residual quirks theory, according to which accretion results from the overlap of historical residues and newly emerged marking; the hermit crab theory, which concerns the coexistence of an old form, subject to heavy phonological attrition, and a phonologically longer, novel form; or Haiman’s (forthcoming) decorative imagery theory, whereby linguistic bulking may happen for purely aesthetic reasons. Illustrative examples from a wide range of languages and from different grammatical domains (e.g. relative clauses, comparison of inequality or definiteness marking, among others) provide Kuteva with the material to demonstrate how in situations of language contact a replica language may show linguistic bulking for an existing grammatical category through the influence of the model language. Thus, in answering the question why linguistic material sometimes accumulates, the author integrates cognitive processes of grammaticalization with contact-induced change. As in some of her earlier work (e.g. Heine and Kuteva 2005), Kuteva maintains that languageinternal change and contact-induced change are not mutually exclusive. She contends that contact-induced grammaticalization always involves language-internal grammaticalization and that the factors at work in the latter can also be seen operating in the former. As a matter of fact, the two explanations offered by Kuteva to account for the phenomenon of accretion in contact-induced grammaticalization make use of the notion of overlap recognized in language-internal grammaticalization processes. On the one hand, overlap is manifested as a diachronic stage of the grammaticalization development. In other words, the accretion which results from contact-induced grammaticalization is characteristic of a transitional period showing coexistence of marking from both the model and the replica language; in this sense, linguistic accretion in contact-induced grammaticalization could be seen simply as a sub-type of the notion of residual quirks mentioned above. On the other hand, overlap also manifests itself in contact-induced grammaticalization as a synchronic buffer zone in the geographical distribution of competing grammaticalization patterns. In those areas where contact is most intense and two grammaticalizing processes meet, the areal overlap may lead to a merger of the two patterns, that is, to chunking of linguistic material. The potentially distorting effects of grammaticalization on grammar are also discussed in Livio Gaeta’s contribution. His paper (“Mismatch: Grammar distortion and grammaticalization”) shows that grammaticalization can provide fruitful insights regarding both the way in which language mismatches are produced and constrained and the theoretical mechanisms which can best capture these constraints. The concept of mismatch, mostly used by geneticists along with the

9

10

María José López-Couso and Elena Seoane

related concept of repair, has been used to explain synchronic form-meaning discrepancies in language by Francis and Michaelis (2003). In his paper, Gaeta applies it diachronically and investigates its distorting impact on grammar by focussing on synchronic mismatches between form and content which result from the expansion of processes of grammaticalization. Gaeta recognizes two main types of mismatch, namely content mismatches and complexity mismatches, and, drawing on a wealth of examples from various languages, he concentrates on the different ways in which grammaticalization may lead to them. On the one hand, a given grammaticalization process and its generalization may result in the emergence of new non-canonical constructions which are incompatible with the general pattern of a language; this is the case, for example, of the progressive periphrasis in Basque. On the other hand, mismatches may be due to an unchanged residue of grammaticalization, so that the process does not affect all the items that should, in principle, be susceptible to its workings, what Gaeta calls inertial resistance; this he exemplifies by means of a “classical example of lexical inertia,” namely the strong/ weak verb inflection in Germanic languages. Grammaticalization processes may also leave an unchanged residue because there may be a constraint in the grammar of the language which prevents the process from affecting certain items; the example here is the Infinitivus Pro Participio (IPP)-effect in German and Dutch, i.e. cases in which an infinitive appears in constructions where one would have expected a past participle (cf. Gaeta 2005; Schmid 2005). Finally, mismatches can also result from language contact, either by hindering the potential grammaticalization of endogenous constructions due to ready-made borrowed structures, as in Khmer, where a productive and functional morphological system does not develop because of contact with Thai and Vietnamese, or by exploiting new grammatical options of a foreign origin, as in Vietnamese, which uses morphological right-heading due to contact with Chinese (cf. Bisang 2001). The two remaining articles in the volume, those by Bourdin and von Mengden, offer crosslinguistic evidence for two interesting grammaticalization clines. Philippe Bourdin’s contribution (“On the grammaticalization of ‘come’ and ‘go’ into markers of textual connectivity”) gives an overview of crosslinguistically attested cases of the grammaticalization of motion verbs meaning ‘come’ and ‘go’ into markers of textual coherence and clause linkage. This, he argues, is another instance of the metaphorical mapping of space in time, which also underlies the grammaticalization of these verbs into future markers and markers of progressive aspect (cf. Bourdin 2000, 2004). His data illustrate the very well-known fact that grammaticalization is not a unitary concept but is characterized by a cluster of formal and semantic properties, and, consequently, linguistic phenomena are not amenable to fast and hard criteria but need to be analysed in terms of a grammaticalization cline (cf. Fischer et al. 2004). In the semantic dimension of the

Introduction

grammaticalization cline he presents, from directional deictics to markers of textual connectivity, the relevant markers have sometimes retained varying shades of their original semantics (as the ‘go’-derived sequential je in Izi), while in other languages they have evolved further into exponents of past tense (as is the case with the past tense auxiliary in Old Catalan, derived from lexical anar ‘go’). The range of morphosyntactic mutations involved is just as wide: some markers continue to behave like full-fledged verbs (e.g. in Neo-Aramaic), while others have turned into affixes (as in Supyire); some have become auxiliaries or have travelled the serialization route (as in Mari) and yet others have fossilized (as in Bagiro). Besides deicticity, Bourdin discusses two further factors which may have played a role in triggering the grammaticalization process at issue. Firstly, the iteration scenario, which involves fairly substantial crosslinguistic evidence that genetically and geographical unrelated languages make iterated use of their markers of directional deixis to indicate textual cohesion. Secondly, the futurity scenario, which involves the disputed link between grammaticalization as a future marker and grammaticalization as a sequential. From the level of textual connectivity, we shift to a fairly neglected area in linguistic research with Ferdinand von Mengden’s paper, namely the grammaticalization of cardinal numerals and numeral systems. After a detailed definition of these two concepts, von Mengden goes on to explain the emergence of numeracy in languages, taking into account that, phylogenetically, the first instance of numeracy is not a spoken number but gestures, which often involve body part counting and expressions. Later, these body part lexical items tend to become used as numerically specific quantifiers, in a process which qualifies as grammaticalization. This is the initial step in the grammaticalization cline postulated by von Mengden in his paper, namely cardinal numerals are likely to develop out of body part nouns, the potential target being any non-complex numeral (e.g. the development of a word for ‘hand’ to the numeral ‘5’; cf. Heine and Kuteva 2002: 166). The second step in the cline involves the formation of numeral systems. The author provides evidence showing that, once linguistic number expressions have become autonomous, they are organized into combinations based on arithmetical operations, which form complex number expressions serialized and integrated into numeral systems. From this he generalizes the second step in this cline: within numeral systems, cardinal numerals may develop into affixes which can be interpreted as grammatical markers inasmuch as they signal the relations among the constituents of complex numerals. During this development, the relevant expression may lose syntactic autonomy and become an extensionally modifying constituent of a noun phrase. For example, English -teen in fourteen and -ty in fourty are bound morphemes which are syntactically less autonomous than ten, the simple numeral from which they derive. Von Mengden connects these two developments

11

12

María José López-Couso and Elena Seoane

with grammaticalization as is generally understood among linguists: they represent a crosslinguistic pathway from a body part noun (a referential expression) to a numeral and further to a functional affix encoding internal arithmetic relations within complex numerals. We hope that the articles in this volume will open new directions for future research in the field of grammaticalization and related phenomena, and will encourage readers to continue the dialogue started by Meillet nearly a century ago.

References Bisang, Walter. 2001. Areality, grammaticalization and language typology. On the explanatory power of functional criteria and the status of Universal Grammar. In Walter Bisang (ed.), Language typology and universals, 175–223. Berlin: Akademie-Verlag. Bisang, Walter, Nikolaus P. Himmelmann & Björn Wiemer (eds.). 2004. What makes grammaticalization? A look from its fringes and its components. Berlin: Mouton de Gruyter. Bourdin, Philippe. 2000. À propos des futurs ventifs et itifs: Remarques sur un paradoxe. Verbum 22(3): 293–311. Bourdin, Philippe. 2004. Espace, temporalité, personne: Parcours en linguistique anglaise, française et générale. Thèse de Doctorat d’État, Université Paris XIII. Bybee, Joan L., Revere Perkins & William Pagliuca. 1994. The evolution of grammar: Tense, aspect and modality in the languages of the world. Chicago IL: University of Chicago Press. Campbell, Lyle (ed.). 2001. Grammaticalization. A critical assessment. (Special issue of Language Sciences 23(2–3)). Cuyckens, Hubert, Kristin Davidse & Lieven Vandelanotte (eds.). Forthcoming. Subjectification, intersubjectification, and grammaticalization [Topics in English Linguistics]. Berlin: Mouton de Gruyter. Cuyckens, Hubert, Kristin Davidse & Jean-Christophe Verstraete (eds.). Forthcoming. Grammaticalization and grammar [Typological Studies in Language]. Amsterdam: John Benjamins. DeLancey, Scott. 2005. Relativization and nominalization in Bodic. In Tibeto-Burman linguistics: Proceedings of the 28th Annual Meeting of the Berkeley Linguistics Society, 5–72. Berkeley CA: Berkeley Linguistics Society. Fischer, Olga, Muriel Norde & Harry Perridon (eds.). 2004. Up and down the cline: The nature of grammaticalization [Typological Studies in Language 59]. Amsterdam: John Benjamins. Fischer, Olga, Anette Rosenbach & Dieter Stein (eds.). 2000. Pathways of change. Grammaticalization in English. Amsterdam: John Benjamins. Francis, Elaine J. & Laura A. Michaelis. 2003. Mismatch: A crucible for linguistic theory. In Elaine J. Francis & Laura A. Michaelis (eds.), Mismatch: Form-function incongruity and the architecture of grammar, 1–27. Stanford CA: CSLI. Gaeta, Livio. 2005. Ersatzinfinitiv im Deutschen: Diachrone Überlegungen zu einem synchronen Rätsel. In Claudio Di Meola, Antonie Hornung & Lorenza Rega (eds.), Perspektiven Eins. Akten der ersten Tagung deutscher Sprachwissenschaft in Italien, 149–165. Roma: Istituto Italiano di Studi Germanici.

Introduction

Giacalone Ramat, Anna & Paul J. Hopper (eds.). 1998. The limits of grammaticalization. Amsterdam: John Benjamins. Haiman, John. Forthcoming. The esthetic impulse of grammar. Oxford: OUP. Haspelmath, Martin. 2001. The European linguistic area: Standard Average European. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals, Vol. II, 1492–1510. Berlin: Mouton de Gruyter. Heine, Bernd, Ulrike Claudi & Friederike Hünnemeyer. 1991. Grammaticalization: A conceptual framework. Chicago IL: University of Chicago Press. Heine, Bernd & Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge: CUP. Heine, Bernd & Tania Kuteva. 2005. Language contact and grammatical change. Cambridge: CUP. Hopper, Paul J. & Elizabeth Closs Traugott. 2003. Grammaticalization. 2nd edn. Cambridge: CUP. Lehmann, Christian. 1995. Thoughts on grammaticalization: A programmatic sketch. 2nd edn. München: Lincom. Meillet, Antoine. 1912. L’évolution des formes grammaticales. Scientia (Rivista di Scienze) 12/26 (6): 384–400. Newmeyer, Frederick J. 1998. Language form and language function. Cambridge MA: MIT Press. Noonan, Michael. 1997. Versatile nominalizations. In Joan Bybee, John Haiman & Sandra A. Thompson (eds.), Essays on language function and language type. In honor of T. Givón, 373–394. Amsterdam: John Benjamins. Schmid, Tanja. 2005. Infinitival syntax: Infinitivus pro participio as a repair strategy. Amsterdam: John Benjamins. Seoane, Elena & Maria José López-Couso (eds.). 2008. Theoretical and empirical issues in grammaticalization [Typological Studies in Language 77]. Amsterdam: John Benjamins. Traugott, Elizabeth Closs & Bernd Heine (eds.). 1991. Approaches to grammaticalization. 2 vols. Amsterdam: John Benjamins. Wischer, Ilse & Gabriele Diewald (eds.). 2002. New reflections on grammaticalization [Typological Studies in Language 49]. Amsterdam: John Benjamins.

13

chapter 1

Grammaticalization and the areal factor The perspective of East and mainland Southeast Asian languages Walter Bisang University of Mainz Grammaticalization in East and mainland Southeast Asian languages is characterized by a set of properties which seem to be typical of that area. The aim of this paper is to present these properties and to examine them from the perspective of several prominent approaches to grammaticalization: Heine, Claudi and Hünnemeyer (1991), Bybee, Perkins and Pagliuca (1994), Lehmann (1995) and Hopper and Traugott (2003). The specific properties of grammaticalization under discussion will be the lack of obligatory categories, hence the high relevance of pragmatics, the existence of rigid syntactic patterns, and the limited coevolution of form and meaning. The data on East and mainland Southeast Asia show that it seems not only necessary to assume areally relevant patterns of grammaticalization, but also to develop models which allow for more structural variation as far as processes of grammaticalization are concerned.

1. Introduction: Properties of grammaticalization in East and mainland Southeast Asia In a large number of languages from East and mainland Southeast Asia, grammaticalization is characterized by the following three features: a. Lack of obligatory categories and predominance of pragmatic inference even in the case of highly abstract grammatical concepts such as tense or definiteness. b. Existence of rigid syntactic patterns (word-order patterns). c. No or limited coevolution of form and meaning. The definition of obligatoriness adopted in this paper is that of Lehmann (1995): a category is obligatory if the speaker is forced to specify that category by selecting a marker which belongs to it. In Lehmann’s (1995: 139) own words: By this [i.e. transparadigmatic variability] we mean the freedom of the language user with regard to the paradigm as a whole. The paradigm represents a certain category, and its members, the subcategories (or values) of that category. There may then be a certain freedom in either specifying the category by using one of its subcategories, or

16

Walter Bisang

leaving the whole category unspecified. To the extent that the latter option becomes constrained and finally impossible, the category becomes obligatory. We shall therefore use the term ‘obligatoriness’ [. . .].

The lack of obligatoriness is particularly remarkable in cases where the concept inferred is an abstract grammatical one expressed by obligatory categories in IndoEuropean languages. While these functions are conventionalized in Indo-European, they are the product of pragmatic inference in many markers of East and mainland Southeast Asian languages. This is corroborated by the fact that, in a number of cases, one and the same marker may express different grammatical concepts in different situations or in different constructions (cf. the functions of Khmer ba:n ‘come to have’ in Section 2.1 below or the functions of classifiers in Thai discussed in Section 2.2). As a consequence, the synchronic representation of the relation between the different functions of a grammatical marker is not that of a cline or path of grammaticalization with one function being cognitively related to its succeeding function (e.g. “object > space > time > quality” in Heine et al. 1991: 65), but rather that of an initial source concept which simultaneously radiates into different directions (cf. Section 2.1).1 Even if one looks at the diachronic development of these markers, it does not seem to follow a neat cline or pathway.2 Part of this diachronic picture may be due not only to pragmatics, but also to the highly complex contact situation in East and mainland Southeast Asia (cf. Enfield 2003 on ‘come to have;’ cf. also Section 4.1 below). In spite of their lack of obligatoriness, grammatical markers follow very rigid word-order patterns in East and mainland Southeast Asian languages and they belong to relatively closed classes which must take certain positions within constructions. Rigid word order and closed-class membership clearly indicate a high degree of grammaticalization (cf. Lehmann 1995 on paradigmaticity and syntactic variability; cf. also Section 3.1). Thus, there is a class of markers with an undoubtedly high degree of grammaticalization. These markers clearly express grammatical functions, though they are not obligatory, and the grammatical concepts which they express are the result of inference (cf. the examples in Sections 2.1 and 2.2 below). As pointed out by Ansaldo and Lim (2004) for Sinitic languages, grammaticalization in East and mainland Southeast Asian languages is not primarily expressed at the level of morphological reduction, but rather by phonetic erosion, in terms of 1. Other authors arguing against the existence of a cline in grammaticalization in East and mainland Southeast Asian languages are Ansaldo (1999) on ‘surpass’ comparatives and Enfield (2003) on ‘come to have’ verbs. 2. Even if the clines discussed by authors like Bybee et al. (1994) are claimed to reflect diachronic stages of development, one should keep in mind that they are primarily derived from typological data based on synchronic material.

Grammaticalization and the areal factor

duration and vowel quality. These authors argue very plausibly that this is due to the discreteness of syllable boundaries and to phonotactic restraints. In my view, these phonological properties together with the broad functional spectrum of one and the same marker and the lack of obligatoriness prevent the development of morphologically integrated paradigms as we know them from Indo-European languages (cf. Section 4.2). The consequence of these properties is that the degree of erosion attested in East and mainland Southeast Asia is not strong enough in most cases for a parallel development of meaning and form. This can be seen synchronically by looking at the data and, as a consequence, must also be true for the diachronic development of these markers. In this paper I would like to show that the three above-mentioned properties of grammaticalization cannot be fully accounted for by any of the standard approaches to grammaticalization described in Section 3. If this is true, and if these properties are areal (cf. Bisang 1996), that is, if they are widely attested in several language families spoken in East and mainland Southeast Asia, we may ask the more general question of whether the existing approaches to grammaticalization are somehow areally biased. Could it be that certain properties have never been analysed or have never found their way into typologically-oriented questionnaires because they do not exist in familiar Indo-European languages? From such a perspective, we might assume that there exist different types of grammaticalization in which different strategies are of different importance. As will be shown in Section 4.2, the languages to be discussed in this paper are characterized by the predominance of inference. The current paper is structured as follows. Section 2 presents three case studies of grammaticalization which show the high importance of inference in East and mainland Southeast Asian languages. Section 3 assesses the extent to which Heine et al. (1991), Bybee et al. (1994), Lehmann (1995) and Hopper and Traugott (2003) succeed in accounting for the processes of grammaticalization attested in these languages, and in fact proposes that none of these approaches works fully. Section 4 then briefly addresses the issue of areality and the question of why the emergence of morphological paradigms is unlikely in East and mainland Southeast Asia. Finally, Section 5 provides a brief account of how and why grammaticalization remains interesting in the aftermath of Newmeyer’s (1998) deconstruction, through the integration of an areal perspective.

2. Examples This section illustrates the properties of grammaticalization in East and mainland Southeast Asia. The wide range of inferences which can be triggered by a single

17

18

Walter Bisang

marker is discussed in Section 2.1 on the verb ‘come to have.’ Section 2.2 deals with Thai classifiers, which represent another instance of multiple inference. In this case, however, the different functions are associated with different constructions, which shows the relevance of constructions for grammaticalization. Finally, Section 2.3 very briefly illustrates the rigid word-order patterns which determine the preverbal and postverbal positions of three different types of grammatical markers.

2.1 Inference of several different grammatical functions: The case of Khmer baːn ‘come to have’ The lack of obligatory categories and the wide range of inferences associated with highly grammaticalized items is illustrated in this section by the verb baːn ‘come to have’ in Khmer. As shown by Enfield (2003; cf. also Bisang 1992, 1996), verbs with this meaning are extremely widespread across the languages of East and mainland Southeast Asia. Since the data on ‘come to have’ verbs are extensively discussed by Enfield (2003), and since I have examined the data on Khmer more extensively elsewhere (Bisang 2004: 118–121), in what follows only a summary of the most relevant information will be given. Like most ‘come to have’ verbs in mainland Southeast Asia, Khmer baːn occurs preverbally as well as postverbally or clause-finally. Since looking at the preverbal position is sufficient to illustrate the multifunctionality of baːn, I will not discuss here the postverbal or clause-final position. The basic meaning of ‘come to have’ verbs is described by Enfield (2003: 38–39) as follows: The predication made by the expression ‘come to have’ involves two important components, namely an event of something coming into one’s sphere of possession, and a subsequent and consequent state of possessing that thing. [...] ‘Come to have’ treats the recipient (x) as the ‘primary figure’ in the event (i.e. it is encoded as grammatical subject), and there is no necessary reference to the source of transfer, nor is it specified whether the thing moves to the recipient or the recipient moves to the thing.

In example (1), Khmer bːn ‘come to have’ is used as a full verb. (1) Khmer Chnm nìh y`ːŋ bːn sroːv crən ns. year this we come.to.have rice a.lot very ‘This year we came to have a lot of rice.’

If a ‘come to have’ verb occurs as a grammaticalized item in the preverbal position, its interpretation depends on a number of presuppositions. Thus, when the event denoted by the main verb is supposed to be desired, the ‘come to have’ verb expresses ability or permission. With other presuppositions, ‘come to have’ triggers

Grammaticalization and the areal factor

the inference of obligation, past and truth or factuality. The possible inferences of ‘come to have’ in East and mainland Southeast Asian languages are summarized as follows (revised version of Bisang 2004: 119): a. The event E is [+desired]: +> modal interpretation: ‘can’ (potential meaning: abilitative or permissive). b. The event E is [–desired]: +> modal interpretation: ‘must, have to’ (obligation). c. In order for X to come to have E, E must have taken place: +> past (E) (particularly if E is negated). d. In order for X to come to have E, E must be true: +> truth, factuality. In Khmer, the verb bːn can trigger the interpretations of ability and permission (a), of past tense (c) and of truth or factuality (d), while inferences in terms of obligation (b) are marginal. Thus, a constructed example like (2) can be interpreted as follows:

(2) Khmer khɲom bːn t`ːu phsː(r). I come.to.have go market a. Ability or permission: ‘I was able to go to the market.’/‘I was allowed to go to the market.’ b. Past: ‘I went to the market.’ d. Truth or factuality: ‘I do go/I really go to the market.’

The different interpretations of ‘come to have’ listed above can even be combined to a certain extent. Thus, example (2) can also be translated as ‘He did go to the market,’ thus combining interpretations (c) past and (d) truth or factuality. The next example, also discussed in Bisang (2004: 120), is from a novel, and it contains two instances of bːn. In its first occurrence (bːn1), past tense interpret` ŋ. ation is excluded because bːn1 is immediately preceded by the future marker nɯ Since it is known from the context that the event marked by bːn1 is desired, the abilitative reading is the most likely interpretation. This also applies to the second occurrence of the ‘come to have’ verb (bːn2), although here a past interpretation or a factual reading cannot be excluded.

(3) Khmer cùːən-kːl nìːəy kɔː nɯ ` k-sɔŋkhɯ ` m thː bːn1 cùːəp borɔs sometimes she then think-hope quot come.to.have meet man cùəːp dòːc bɔmnɔːŋ. nùh tìət, te-kɔː pụ̀ m bːn2 that again but neg come.to.have meet as wish ‘Sometimes she hoped to be able to meet that man again, but she was not able to/did not meet him as she wished.’

19

20

Walter Bisang

To conclude this section, I would like to point out some basic properties of grammaticalization which can be illustrated by Khmer bːn. Firstly, as in other languages of East and mainland Southeast Asia, there is no obligatory tense-aspectmodality marking in Khmer. Thus, an example such as (3) is also grammatical (but pragmatically less adequate) if the categories marked by bːn are omitted. Secondly, the interpretation of bːn depends on context. In principle, bːn can express all the functions listed above in (a) to (d), with function (b) being rare. Given their inferential character, the different interpretations of bːn should not be treated in terms of polysemy or semantic change.3 In addition, the various functions of bːn are not necessarily related to each other by a cline or a pathway. Rather they seem to be inferred from the same source concept of ‘come to have’ in different contexts with different presuppositions. Although more research is certainly needed here, the diachronic development of bːn does not seem to follow a strict cline either. Finally, bːn basically keeps its phonetic substance even though it expresses grammatical functions and occurs in a position for markers expressing grammatical categories.

2.2 Inference of different grammatical functions depending on the constructional context: The case of classifiers in Thai The languages of East and mainland Southeast Asia are characterized by their transnumerality, that is, their lack of obligatory number marking. Thus, a word like Thai còtmˇj only refers to the concept of ‘letter’ as such without any reference to number. According to Greenberg (1974), there is a typological correlation between transnumerality and the existence of classifiers: “[n]umeral classifier languages generally do not have compulsory expression of nominal plurality, but at most facultative expression” (Greenberg 1974: 25). This correlation applies to Chinese, Japanese, Thai, Hmong and a large number of other languages of East and mainland Southeast Asia. Thus, Thai còtmˇj ‘letter’ needs to co-occur with a classifier in the context of counting, as shown in (4).

(4) Thai a. còtmˇj sˇm chb`b letter three clf ‘three letters’ b. *còtmˇj sˇm letter three

3. One anonymous referee suggested treating the different functions of baːn in terms of polysemy. However, what is being dealt with here is pragmatic inference, rather than semantics.

Grammaticalization and the areal factor

Crosslinguistically, classifiers minimally occur with numerals. In a number of East and mainland Southeast Asian languages, they have additional functions which are all based on the function of classification, i.e. a cognitive activity whereby items of the world and their lexical representations are assigned to certain classes on the basis of certain properties or criteria. The most relevant criteria of classification are the following: material (human, animate, inanimate, abstract), social status, physical properties (shape, consistency, size), functionality (man-made) and location (buildings, places, etc.) (cf. also Denny 1976; Allan 1977; Aikhenvald 2000; Grinevald 2000). The process of classification can be used to profile conceptual boundaries of concepts. Due to this function, classification is the basis of the two main roles involved with numeral classifiers, namely identification and individuation. On the one hand, classification helps in identifying a certain sensory perception by using its conceptual boundaries to highlight that perception against other sensory perceptions. On the other hand, it can establish a sensory perception as an individual item by actualizing those salient inherent properties which constitute it as a conceptual unit. Apart from their function of individuation in the context of counting, Thai classifiers can assume other functions related to the identificational role of classifiers. In such cases, the classifier is always optional, while it is compulsory in combination with numerals. As opposed to the various functions of Khmer b:n ‘come to have,’ those of Thai classifiers depend on the context in which they are used. Table 1 indicates the basic cognitive functions of the classifier, its concrete grammatical functions and the constructions with which they are associated. The individuating function of the classifier is illustrated by example (4a) above, [N NUM CLF]. The singulative interpretation of the classifier is limited to the demonstrative construction. If there is no classifier, as in the case of [N DEM], the noun can be either singular or plural, that is, it remains transnumeral (5a). If the

Table 1. Functions of the Thai classifier in different constructions Basic function

Grammatical function

Construction

Individuation

Countability

Numerative construction: [N NUM CLF]

Identification

Singulative

Demonstrative construction: [N DEM] vs. [N CLF DEM]

Reference

Adjective construction: [N ADJ] vs. [N CLF ADJ] (definiteness/specificity: the object is accessible to the hearer)

Contrastive focus

Adjective and demonstrative constructions (presupposed items are in contrast)

21

22

Walter Bisang

classifier is present, by contrast, the construction has the structure [N CLF DEM] and a singular interpretation can be inferred (5b).

(5) Thai a. rót níi car this ‘this car’/‘these cars’ b. rót khn níi car clf this ‘this car’

In combination with stative verbs, which will be called adjectives here,4 the presence of the classifier triggers referential interpretation in terms of definiteness or specificity. In the adjective construction with no classifier [N ADJ], the noun is referentially neutral, i.e. it can be definite, specific, indefinite and so on depending on the context (6a). If the classifier is present, on the other hand, the construction has the structure [N CLF ADJ] and the noun is interpreted as definite or specific (6b). In an example like (7), with the superlative, the classifier is obligatory because this construction presupposes a specific noun in the subject position.

(6) Thai a. rót sɪˇi dɛɛŋ car red ‘a red car’/‘the red car’/‘red cars’/‘the red cars’ b. rót khn sɪˇi dɛɛŋ car clf red ‘the red car’/‘the red cars’

(7) Thai rót khn sɪˇi dɛɛŋ phɛɛŋ thîi sùt. car clf red expensive most ‘The red car is the most expensive.’

Finally, the classifier can be used to express contrastive focus both in the demonstrative construction and in the adjective construction. This is the case if the items to be discussed are presupposed, that is, they have already been introduced and the classifier has already been used. The general rule in Thai (and in other main4. Like many other East and Southeast Asian languages, Thai makes no distinction between adjectives and verbs, that is, both categories share the same syntactic properties. It is thus an “adjectival-verb” language in terms of Schachter (1985).

Grammaticalization and the areal factor

land Southeast Asian languages; cf. Bisang 1993 on Hmong) is that once a noun has been marked as definite or specific, it will remain unmarked as long as it is used in the same discourse paragraph. If this rule is broken, i.e. if a noun which has already been introduced as definite or specific is marked by the classifier again, the hearer can infer that the adjective (8) or the demonstrative (9) must be understood contrastively.5 Thus, the following example can be used in a context in which a small car and a big car are presupposed. Speaker A says that B likes the big car. B can now correct this wrong presupposition by uttering either (8a) or (8b).

(8) Thai a. Mˆj chˆj, chɔˆɔp rót khn lék mˆk kw`. neg true like car clf small more ‘This is not true, I prefer the small car.’ b. chɔˆɔp rót khn lék, mˆj chɔˆɔp rót khn j`j. like car clf small neg like car clf big ‘I like the small car, I do not like the big car.’

The same can be shown for demonstratives. If there are two cars, of which one is closer to the speech-act participants than the other, a speaker can react as follows to the wrong presupposition concerning the deictic position of the car s/he prefers:

(9) Thai a. Mˆj chˆj, chɔˆɔp rót khn níi mˆk kw`. neg true like car clf this more ‘This is not true, I prefer this car [not that one].’ b. chɔˆɔp rót khn níi, mˆj chɔˆɔp rót khn nán. like car clf this neg like car clf that ‘I like this car, I do not like that car.’

Finally, I would like to call attention to two fundamental properties of grammaticalization which Thai classifiers share with the tense-aspect-modality marker derived from ‘come to have’ verbs (cf. Section 2.1). Firstly, there is no gradual development from one function to the other. Depending on the syntactic context, classifiers can express various functions which are all related to classification and its two main roles. Secondly, in spite of their grammatical function and their occurrence in a particular syntactic position, classifiers show, maximally, a small loss of phonetic substance related to tonality. However, there exists also a difference between ‘come to have’ verbs and Thai classifiers: while the various interpretations

5. For a consistent account of the use of classifiers in on-going text, see Becker (2005).

23

24

Walter Bisang

of ‘come to have’ verbs are not associated with different constructions, the different readings of Thai classifiers are clearly related to different constructions.

2.3 The existence of rigid syntactic patterns The existence of rigid word-order patterns will be illustrated by a brief look at the positions of tense-aspect-modality markers, coverbs and directional verbs relative to the main verb in the clause in East and mainland Southeast Asian languages. Each of these markers are products of grammaticalization; thus, tense-aspect-modality markers are verbs like Khmer b:n ‘come to have’ (cf. Section 2.1 above), verbs which can be used for expressing functions of tense, aspect and/or modality. In turn, coverbs are verbs which can take the function of adpositions, while directional verbs indicate the direction of the event denoted by the main verb. Each of these markers only occurs in specific positions; in the case of Thai and Khmer, the positions for these markers are distributed as follows:6 tense-aspect-modality + main verb + directional verb + coverb + tense-aspect-modality. This word-order pattern is illustrated in (10) and (11) below, where the grammaticalized items are printed in bold. As can be seen, there is a preverbal tense-aspect-modality marker in Khmer and a postverbal tense-aspect-modality marker in Thai. The position for directional verbs can maximally take two verbs in Thai and three verbs in Khmer, and the position for coverbs always follows the position for directional verbs in both languages. (10) Khmer Kɔ`ət bːn yɔ`ːk ʔ yvn coh cëɲ mɔ`ːk he tam take luggage move.down-Vd move.out-Vd come-Vd ʔoy khɲom. give-cov I ‘He took the luggage down and out for me.’ (11) Thai Kháw ʔw krpˇw loŋ m hˆy phǒm lɛ´ɛw. he take luggage move.down-Vd come-Vd give-cov I tam ‘He took the luggage down for me.’ The order of the grammaticalized items relative to each other in (10) and (11) is absolutely rigid, and even the order of the individual directional verbs cannot be changed. The only exception is Khmer b:n, which can also occur in the clausefinal tense-aspect-modality position, in accordance with the word-order pattern given above. 6. For a much more detailed account of the distributional properties of these markers in Chinese, Hmong, Vietnamese, Thai and Khmer, see Bisang (1992, 1996).

Grammaticalization and the areal factor

3. Approaches to grammaticalization in the light of data from East and mainland Southeast Asia 3.1 Lehmann (1995) Lehmann (1995) defines grammaticalization in terms of the autonomy of the linguistic sign, whereby the reduction of autonomy raises the degree of grammaticalization. Autonomy is determined by the three parameters of weight, cohesion and variability, each of which must be analysed from the paradigmatic and the syntagmatic perspectives. This yields the six well-known criteria for measuring grammaticalization, as illustrated in Table 2. By looking at the languages of East and mainland Southeast Asia from the perspective of these criteria, the paradoxical observation can be made that their grammatical markers can express the semantics of categories which are situated at the end of grammatical clines (e.g. tense or reference; cf. Sections 2.1 and 2.2, respectively), although their autonomy is still high in terms of the majority of the criteria discussed by Lehmann (1995). In the case of integrity (i.e. paradigmatic weight), the grammaticalization of a linguistic sign does not run fully parallel to the reduction of phonetic integrity, that is, there is no necessary coevolution of form and meaning. Similarly, the other two paradigmatic criteria do not produce significant effects. Thus, paradigmaticity (i.e. paradigmatic cohesion) only applies to the emergence of closed-class categories, and does not lead to the emergence of paradigms. Paradigmatic variability, in turn, is not developed very far either, since none of the categories expressed by grammaticalized items is obligatory (with the exception of a few well-defined instances, such as the obligatory use of the classifier with numerals, as in (4), and with the superlative in Thai, as in (7)). On the syntagmatic side, structural scope as suggested by Lehmann (1995) is a problematic criterion, because even a highly grammaticalized item can have wide scope if this is compatible with its semantics. Bondedness (i.e. syntagmatic cohesion) does occur with some markers (e.g. the well-known verbal markers -le and -zhe in Modern Standard Chinese), but it is certainly not a general property of grammaticalized items in the languages of East and mainland Southeast Asia. The only parameter which fully applies to these languages is syntagmatic variability. Grammaticalized items Table 2. Lehmann’s (1995: 123) criteria for measuring the autonomy of a linguistic sign Weight Cohesion Variability

Paradigmatic

Syntagmatic

Integrity Paradigmaticity Paradigmatic variability

Structural scope Bondedness Syntagmatic variability

25

26

Walter Bisang

Table 3. Lehmann’s (1995) criteria for measuring autonomy and their relevance for East and mainland Southeast Asian languages Paradigmatic

Syntagmatic

Weight

Integrity (−): No necessary coevolution of form and meaning

Structural scope (?): Grammaticalized items rarely have scope over one single word

Cohesion

Paradigmaticity (−/+): No emergence of paradigms but at least closed-class categories

Bondedness (−): Coalescence is not prominent

Variability

Paradigmatic variability (−): No obligatory categories

Syntagmatic variability (+): Rigid word-order patterns

have to follow strict word-order rules; thus, ‘come to have’ verbs (cf. Section 2.1) must occur in front of the main verb (or in postverbal, sentence-final position). As illustrated with Thai (cf. Section 2.2), classifiers must take specific positions in the numeral construction, the demonstrative construction and the adjective construction. Finally, tense-aspect-modality markers, coverbs and directional verbs are assigned to specific positions relative to each other within the serial unit (cf. Section 2.3). Table 3 summarizes to what extent Lehmann’s (1995) criteria for measuring the autonomy of a linguistic sign, and thus its degree of grammaticalization, apply to East and mainland Southeast Asian languages.

3.2 Heine, Claudi and Hünnemeyer (1991) Grammaticalization in terms of Heine et al. (1991) is a creative act of problem solving by which old resources are used for the expression of new functions. Its properties are motivated by cognition as a language-external factor. The development of new functions for existing resources takes place on two levels, namely macrostructure and microstructure. Macrostructure is characterized by metaphoric processes, that is, changes of meaning from one semantic domain to another through perceptible functional or relational similarities (cf. analogy). The mapping of linguistic items from one domain of conceptualization onto another is the central strategy of grammaticalization. It takes place along universal scales or chains like the following: person > object > activity > space > time > quality. Microstructure, by contrast, is characterized by metonymic processes, that is, pragmatic inferences from context. Metonymic processes trigger context-induced reinterpretations, “whereby conversational implicatures are conventionalized to new focal senses” (Heine et al. 1991: 103). Since metaphor determines the changes from one domain to the next, and since metonymy operates more locally as a bridge between indi-

Grammaticalization and the areal factor

vidual domains, the strategy of metaphor is more basic than that of metonymy. As Heine et al. (1991: 73–74) note, It would seem that metaphor and metonymy form different components of one and the same process leading from concrete to more abstract grammatical concepts. On the one hand, this process is made up of a scale of contiguous entities that stand in a metonymic relation to one another. On the other hand, it contains a smaller number of salient and discontinuous categories such as space, time or quality. The relation between these categories [...] is metaphorical but can also be described as being the result of a number of metonymic extensions. Conceivably, metonymy is the more basic component of this process in that metaphor is grounded in metonymy [...].

If this approach is applied to the languages of East and mainland Southeast Asia, it yields at least two problems. Firstly, metaphor is not the main strategy of grammaticalization in these languages, pragmatic inference being more important and not necessarily leading to scales or chains. Even if individual lexical items can express a number of cognitive domains from this scale, they are in no hierarchical order, that is, all of them can be inferred equally, depending on context and on the constructions to which they belong. Secondly, Heine et al.’s (1991) model is primarily concerned with cognition and semantics. It does not integrate the correlation between grammatical functions and their morphosyntactic expression in constructions. In order to be able to explain how certain lexical elements become productive markers of a grammatical function, we need a structural framework which allows the necessary implicatures leading to the reanalysis of a given utterance. The framework which makes this type of implicatures possible is the construction itself.

3.3 Bybee, Perkins and Pagliuca (1994) The criticisms concerning Heine et al.’s (1991) approach do not apply to Bybee et al. (1994). The latter authors attach more importance to inference than to metaphor and they do integrate constructions into their model. As they point out (Bybee et al. 1994: 25), metaphor only comes into operation at the beginning of grammaticalization: While it is true that when one compares source concepts to related grammatical concepts one can construct a metaphorical relation between the two in many cases, our evidence suggests that the actual formation of metaphors is not the major mechanism for semantic change in grammaticization. Rather we see metaphor operating only on the more lexical end of grammaticization paths rather than propelling grams into the more and more abstract domains of grammatical meaning.

By contrast, inference operates pervasively through all stages of grammaticalization, from beginning to end. It is based on Grice’s (1975) maxim of quantity,

27

28

Walter Bisang

which states that the speaker does not say more than s/he must and that the hearer infers as much as s/he can. In Bybee et al.’s (1994: 286) words, One of the results of this delicate balance is that the hearer is obliged to extract all the meaning possible from the message, which includes all the implications that are not controversial. A semantic change can take place when a certain implication commonly arises with a certain linguistic form. That implication can be taken as part of the inherent meaning of the form, and can even go so far as to replace the original meaning of the form.

In Bybee et al.’s (1994) model, constructions are taken to be relevant for understanding processes of grammaticalization. It is often the meaning of a construction or of the elements within it that determines the result of grammaticalization. What Heine et al. (1991) explain in terms of metaphor often turns out to be the result of inferences based on the meaning of a construction. The following quotation shows how this works for be going to in English: While it is certainly the case [...] that the same schema structure for spatial ‘be going to’ is preserved in temporal ‘be going to’, it does not follow that metaphorical extension is the operative mechanism of change. Once again, the temporal meaning was present in the construction from the beginning. We’re going to Windsor to meet the King emphasizes the spatial but certainly makes a temporal statement as well. Again, the construction can spread gradually from cases where the spatial is important to cases where both temporal and spatial are important and finally to cases where only the temporal is relevant. (Bybee et al. 1994: 292)

The fact that Bybee et al.’s model takes inference to be more important than metaphor and that it integrates constructions does not mean that this model is not unproblematic from the perspective of East and mainland Southeast Asian languages. Two drawbacks will be briefly mentioned in what follows. Firstly, Bybee (1985) and Bybee et al. (1994) take the coevolution of form and meaning for granted. As a consequence, their model cannot deal with linguistic signs which can trigger highly grammatical concepts without an equal amount of phonetic reduction. The languages of East and mainland Southeast Asia show that the claim of a “causal link between semantic and phonetic reduction” in the quotation below is not universal: It therefore seems natural to look for a direct, and even causal, link between semantic and phonetic reduction in the evolution of grammatical material, beginning with the earliest stages of development from lexical sources and continuing throughout the subsequent developments grams undergo. Our hypothesis is that the development of grammatical material is characterized by the dynamic coevolution of meaning and form. (Bybee et al. 1994: 20)

Secondly, Bybee et al. (1994) understand grammaticalization as a gradual process in terms of grammaticalization clines in which a marker develops through differ-

Grammaticalization and the areal factor

ent functional stages (e.g. iterative > continuative > progressive > imperfective > intransitive). As Bybee et al. (1994: 289) put it, What makes inference interesting as a mechanism of change is the fact that inference allows the incorporation of new meaning into a gram. [...] Yet the orderlinesss of semantic change in grammaticization and the universality of paths of change demonstrate that the process of infusion of new meaning into a gram is quite constrained. The constrained set of inferential changes that can be discovered in grammaticization are interesting in their own right, as they will reveal us the nature of the commonly made inferences that guide speakers and hearers in conversation.

As shown by the case of ‘come to have’ verbs in Khmer discussed in Section 2.1 above, a grammatical cline is not a necessary prerequisite to account for the grammatical functions which such verbs can express.

3.4 Hopper and Traugott (2003) Hopper and Traugott’s (2003) model is based on both metaphor and metonymy, which are seen as pragmatic processes. In their view, metonymic processes are based on conversational implicatures and are more important and prior to metaphoric processes, which are based on conventional implicatures through analogy between semantic domains. Moreover, metaphor is concerned with analogy on the paradigmatic level and operates through conceptual domains, while metonymy belongs to reanalysis or abduction at the syntagmatic level and operates through interdependent syntactic constituents (cf. Bisang 2004: 115). Table 4 provides a summary of metonymy and metaphor as defined by Hopper and Traugott (2003). According to Hopper and Traugott (2003: 81), other approaches to grammaticalization, such as Heine et al.’s (1991), overestimate the relevance of metaphor due to a tendency to conceive grammaticalization in terms of the change from lexical item to grammatical item, rather than in terms of “use of lexical item in discourse > grammatical item” (Hopper and Traugott 2003: 81). With its clear discourse basis, Hopper and Traugott’s (2003) model has a number of advantages Table 4. Mechanisms of language change: Metonymy and metaphor in terms of Hopper and Traugott (2003) Metonymy

Metaphor

– Syntagmatic level – Reanalysis (abduction) – Conversational implicature – Operates through interdependent syntactic constituents

– Paradigmatic level – Analogy – Conventional implicature – Operates through conceptual domains

29

30

Walter Bisang

for the understanding of grammaticalization as it is found in East and mainland Southeast Asian languages. Firstly, it gives priority to metonymy, and thus to inference, at least at the beginning of grammaticalization processes when lexical items are enriched through conversational implicatures which later become part of the conventionalized meaning of these items. Secondly, it does not depend on the coevolution of form and meaning. Finally, it does not assume that grammaticalization is necessarily gradual in terms of a continuous semantic bleaching, at least at the beginning of the process. In Hopper and Traugott’s words, “[t]here is no doubt that over time, meanings tend to become weakened during the process of grammaticalization. Nevertheless, all the evidence for early stages is that initially there is a redistribution or shift, not a loss, of meaning” (2003: 88). The disadvantage with Hopper and Traugott’s (2003) approach is that the above-mentioned positive aspects of the model are supposed to operate only at the beginning of the process of grammaticalization. Thus, pragmatics in the sense of conversational implicatures will be replaced later by metaphoric processes and by processes of conventionalization which lead to more gradual processes and seem to fall more or less in line with assumptions concerning the coevolution of form and meaning. However, pragmatic inferences also operate at the level of highly grammatical concepts and, as we have seen, different inferences with the same lexical item do not follow a hierarchy and are not necessarily subject to the coevolution of form and meaning (cf. the case of ‘come to have’ verbs in Section 2.1 above). Therefore, the model of Hopper and Traugott (2003) is not fully adequate for East and mainland Southeast Asian languages either.

3.5 Summary The foregoing outline of the most prominent models of grammaticalization shows that none of them can fully account for grammaticalization in East and mainland Southeast Asian languages. Lehmann’s (1995) concept of autonomy only seems to work in the case of syntagmatic variability (rigid word-order patterns). Heine et al. (1991) overemphasize the importance of metaphor and neglect the relevance of constructions, whereas Bybee et al. (1994) take for granted the coevolution of form and meaning and the graduality of grammaticalization processes. Finally, Hopper and Traugott (2003) provide all the necessary ingredients to account for grammaticalization in East and mainland Southeast Asian languages, but they do not allow them to operate at later stages of grammaticalization.

Grammaticalization and the areal factor

4. Areality and why grammaticalization is special in East and mainland Southeast Asia 4.1 An areal type of grammaticalization Grammaticalization is usually defined in two stages: (i) from a lexical item to a grammaticalized item; and (ii) from a grammaticalized item to a more grammaticalized item. This definition goes back to Kuryłowicz (1964) and has recently been taken up again by Detges and Waltereit (2002: 188): “[g]rammaticalization is a process whereby a lexical item assumes a grammatical function or whereby an already grammatical item takes on a more grammatical function.” In contrast, many instances of grammaticalization in East and mainland Southeast Asian languages seem to follow a one-stage model. One can see the step from lexical item to grammaticalized item, but it is often hard to distinguish clearly between more and less grammaticalized items. This situation is the result of the pervasiveness of pragmatics in the grammaticalization processes which take place in these languages. The possibility of several pragmatic inferences with one and the same linguistic item is fairly common, and there is no need for these inferences to follow a certain hierarchy (even though this is not excluded). As a consequence, there are instances such as ‘come to have’ verbs which can be interpreted equally in terms of ability, permission, obligation, past and emphasis of truth or factuality (cf. Section 2.1 above). In my view, the strong discourse-basis of some processes of grammaticalization represents an areal type of grammaticalization which encompasses the following language families situated in East and mainland Southeast Asia: a. Mon-Khmer (branch of Austroasiatic). b. Tai (branch of Tai-Kadai). c. Sinitic (branch of Sino-Tibetan). d. Hmong-Mien. If there is a considerable number of languages in which grammaticalization also operates at levels usually associated with higher degrees of grammaticalization, and if these languages form a geographical cluster in East and mainland Southeast Asia, we may wonder to what extent current approaches to grammaticalization depend on the languages and language groups on which they are based. Since the languages discussed in this paper only play a marginal role in the approaches discussed in Section 3, none of their proponents came across pragmatics-based processes of grammaticalization as we find them in East and mainland Southeast Asia; even typological questionnaires seem to be biased by this lack of input. Why else

31

32

Walter Bisang

would it be possible that verbs with the meaning ‘come to have’ are hardly discussed as source concepts in generalizations concerning the development of tenseaspect-modality systems? From a more general perspective, we may thus wonder how many other source concepts and types of grammaticalization there are to be discovered if other linguistic areas are analysed in more detail. To give just an illustrative example, the lack of form–meaning coevolution is also attested in Slavonic languages, another language family which is understudied as far as grammaticalization is concerned (cf. Bisang et al. 2004).

4.2 How to account for the situation in East and mainland Southeast Asia: On the lack of paradigms This section briefly outlines why grammatical markers of East and mainland Southeast Asian languages are not organized in morphological paradigms like those familiar to us from languages like Latin (cf. amo, amas, amat, etc. ‘I love,’ ‘you love,’ ‘she/he/it loves’). As I argued in the introduction, this is due to two complementary types of factors, namely phonological properties (Ansaldo and Lim 2004) and the broad functional spectrum of grammatical markers combined with their lack of obligatoriness. Both of these factors will briefly be explained in what follows. Ansaldo and Lim (2004) point out that the relative phonetic stability of grammaticalized items in East and mainland Southeast Asian languages is due to the discreteness of syllable boundaries and phonotactic restraints. They show for Cantonese and Hokkien that grammaticalization manifests itself in the form of reduced syllable duration and vowel quality. These two effects have to do with the fact that grammaticalized items occur adjacent to metrically stressed syllables and thus tend to lose their own stress in such an environment. Interestingly enough, there is no significant reduction in pitch with grammaticalized items. This is explained by Ansaldo and Lim by the fact that tonal contrast must be maintained in order to keep up contrast, particularly in Sinitic languages like Cantonese and Hokkien, which distinguish three tonal registers. In Sinitic languages with only one tonal register, like Modern Standard Chinese (Putonghua), reduction of pitch would not produce the same strong effect of blurring distinctive features of the language. This seems to be the reason why some morphemes in Modern Standard Chinese, such as verb-final -le (derived from liaˇo ‘finish’) or verb-final -zhe (derived from zháo ‘touch, contact’), did not only lose vowel quality but also reduced their diphthongs and triphthongs to the reduced toneless vowel [ə]. The correlation between the existence of a neutral tone as we find it in -le and -zhe and the presence of a single register is also pointed out by Ansaldo and Lim (2004). Facts like these clearly show that there are phonological properties preventing the full coevolution of form and meaning which can lead to the emergence of morpho-

Grammaticalization and the areal factor

logical paradigms. Processes which develop morphology are simply not prominent enough in most East and mainland Southeast Asian languages. The second factor which obstructs the emergence of morphological paradigms is based on frequency and on the existence of markers belonging to clearly determined semantic domains. The central role of frequency for the development of paradigms has to be seen in the light of generality as defined by Bybee (1985) and obligatoriness (cf. the quotation from Lehmann 1995 in Section 1 above). Thus, morphological paradigms develop from categories which are frequently used. Frequency, in turn, is enhanced by semantic generality, which grants its compatibility with a wide range of lexical items. If a marker is semantically general enough to be coextensive with a basic grammatical entity like noun or verb, its occurrence may become obligatory with that entity. As a consequence, it becomes even more frequent. As we can see from Lehmann (1995: 139; cf. Section 1), paradigms refer to certain clearly defined grammatical categories, such as tense or aspect, with their values (e.g. present, past and future in the case of tense) or sub-categories (e.g. perfective and imperfective in the case of aspect). The preconditions for categories to be integrated into paradigms are a certain degree of homogeneity on the level of the category as a whole and semantically clear-cut definitions of its subcategories. In East and mainland Southeast Asian languages, the emergence of a situation in which grammatical markers are frequent and homogeneous enough to become part of a coherent paradigm is systematically undermined by the high degree of indeterminateness (lack of obligatory categories) and the broad functional spectrum of markers. Thus, on the one hand, despite their highly generalized meanings, grammatical markers are not so frequent as, for example, tense markers in English or German, because they are optional. On the other hand, the meaning of grammatical markers depends on pragmatics on all levels of grammaticalization, and therefore their functional range is not homogeneous, not limited to a single clearly determined semantic domain. Thus, the emergence of a paradigm is rather unlikely for both reasons, low frequency and low degree of semantic h omogeneity.

5. Outlook: Why is research on grammaticalization interesting? Newmeyer (1998) deconstructs grammaticalization by showing that none of its three components, namely downgrading analysis, semantic change and phonetic reduction, is exclusively related to processes of grammaticalization. Grammaticalization is thus nothing but an epiphenomenon of these independent processes of diachronic change. As Newmeyer (1998: 295) himself puts it,

33

34

Walter Bisang

We have examined the associated set of diachronic changes that fall under the rubric of ‘grammaticalization’ and have found that no new theoretical mechanisms, nor mechanisms unique to grammaticalization itself are needed to explain them. Far from calling for a ‘new theoretical paradigm,’ grammaticalization appears to be no more than a cover term for a conjunction of familiar developments from different spheres of language, none of which require or entail any of the others.

In East and mainland Southeast Asian languages, reanalysis (the occurrence in a particular position within a syntactic pattern) and semantic change (defined in terms of pragmatic inference) always co-occur, while phonetic reduction operates only to a certain degree and does not necessarily correlate with change of meaning. This pattern of interaction between the three processes of diachronic change seems to be a specific property of East and mainland Southeast Asian languages. In other languages, and in other areas, there may be other patterns. If it turns out that there are such patterns of grammaticalization and that the same languages also deviate in interesting ways from standard theoretical assumptions (cf. the discussion of subject/object asymmetry in East and mainland Southeast Asia), questions concerning the basis of grammatical properties are far from trivial.

Abbreviations clf cov fut neg quot tam Vd

classifier coverb future negation quotation particle tense-aspect-modality directional verbs

References Aikhenvald, Alexandra Y. 2000. Classifiers: A typology of noun categorization devices. Oxford: OUP. Allan, Keith. 1977. Classifiers. Language 53: 285–311. Ansaldo, Umberto. 1999. Comparative constructions in Sinitic: Areal typology and patterns of grammaticalization. PhD dissertation, Stockholm University. Ansaldo, Umberto & Lisa Lim. 2004. Phonetic absence as syntactic prominence. Grammaticalization in isolating tonal languages. In Olga Fischer, Muriel Norde & Harry Perridon (eds.), Up and down the cline: The nature of grammaticalization [Typological Studies in Language 59], 345–362. Amsterdam: John Benjamins. Becker, Neele. 2005. Numeralklassifikatoren im Thai. PhD dissertation, University of Mainz.

Grammaticalization and the areal factor

Bisang, Walter. 1992. Das Verb im Chinesischen, Hmong, Vietnamesischen, Thai und Khmer (Vergleichende Grammatik im Rahmen der Verbserialisierung, der Grammatikalisierung und der Attraktorpositionen) [Language Universals Series 7]. Tübingen: Gunter Narr. Bisang, Walter. 1993. Classifiers, quantifiers and class nouns in Hmong. Studies in Language 17: 1–51. Bisang, Walter. 1996. Areal typology and grammaticalization: Processes of grammaticalization based on nouns and verbs in East and mainland Southeast Asian languages. Studies in Language 20: 519–597. Bisang, Walter. 2004. Grammaticalization without coevolution of form and meaning: The case of tense-aspect-modality in East and mainland Southeast Asia. In Walter Bisang, Nikolaus P. Himmelmann & Björn Wiemer (eds.), What makes grammaticalization? A look from its fringes and its components, 109–138. Berlin: Mouton de Gruyter. Bisang, Walter, Nikolaus P. Himmelmann & Björn Wiemer (eds.). 2004. What makes grammaticalization? A look from its fringes and its components. Berlin: Mouton de Gruyter. Bybee, Joan L. 1985. Morphology: A study of the relation between meaning and form. Amsterdam: John Benjamins. Bybee, Joan L., Revere Perkins & William Pagliuca. 1994. The evolution of grammar: Tense, aspect and modality in the languages of the world. Chicago IL: University of Chicago Press. Denny, Peter J. 1976. What are noun classifiers good for? Papers from the 12th Regional Meeting of the Chicago Linguistic Society (CLS): 122–132. Detges, Ulrich & Richard Waltereit. 2002. Grammaticalization vs. reanalysis: A semanticpragmatic account of functional change in grammar. Zeitschrift für Sprachwissenschaft 21: 151–195. Enfield, Nicholas James. 2003. Linguistic epidemiology: Semantics and grammar of language contact in mainland Southeast Asia. London: RoutledgeCurzon. Greenberg, Joseph H. 1974. Numeral classifiers and substantival number: Problems in the genesis of a linguistic type. In Proceedings of the 11th International Congress of Linguistics, Bologna - Florence, Aug.-Sept. 1972, 17–37. Cambridge: CUP. Grice, Herbert Paul. 1975. Logic and conversation. In Peter Cole & Jerry L. Morgan (eds.), Syntax and semantics. Vol. III: Speech acts, 41–58. New York NY: Academic Press. Grinevald, Colette. 2000. A morphosyntactic typology of classifiers. In Gunter Senft (ed.), Systems of nominal classification, 50–92. Cambridge: CUP. Heine, Bernd, Ulrike Claudi & Friederike Hünnemeyer. 1991. Grammaticalization: A conceptual framework. Chicago IL: University of Chicago Press. Hopper, Paul J. & Elizabeth Closs Traugott. 2003. Grammaticalization. 2nd edn. Cambridge: CUP. Kuryłowicz, Jerzy. 1964. The inflectional categories of Indo-European. Heidelberg: Winter. Lehmann, Christian. 1995. Thoughts on grammaticalization: A programmatic sketch. 2nd edn. Munich: Lincom. Newmeyer, Frederick J. 1998. Language form and language function. Cambridge MA: MIT Press. Schachter, Paul. 1985. Parts-of-speech systems. In Timothy Shopen (ed.), Language typology and syntactic description. Vol. 1: Clause structure, 3–61. Cambridge: CUP.

35

chapter 2

On the grammaticalization of ‘come’ and ‘go’ into markers of textual connectivity* Philippe Bourdin University of Paris X and York University (Toronto) There is crosslinguistic evidence that verbs of ‘coming’ and ‘going’ are prone to grammaticalize into sequential markers and, more generally, into textual connectives. This paper shows how, in the grammaticalization cline from directional deictics to sequentials, some of these markers have preserved their original semantics to varying degrees, while others have developed further into markers of past tense. The grammaticalization of ‘come’ and ‘go’ into markers of textual connectivity also exhibits a wide range of morphosyntactic variation: while in some languages they have retained all their formal verbal characteristics, in others they have become affixes, have followed the auxiliation or the serialization routes, or have fossilized. Three potential triggers for grammaticalization are discussed in the paper: the role of deicticity, the iteration hypothesis and the futurity scenario.

1. From motion in space to textual connectivity The phenomenon addressed in this paper is illustrated in (1) and (2) below.

(1) Toussian (Niger-Congo, Gur; Burkina Faso) mɛ nyar nyõ te pi yugu sõaʔ. 1sg tô ate and sequential[

(2) Wari’ (Chapacura-Wanham; Brazil) […] ’ep toc mao cacama-in na. grind drink sequential[=go.sg] 3pl.f-3n 3sg.realis.pst/prs ‘[They arrived going that way…]. Then they ground and drank [the corn].’ (Everett and Kern 1997: 480)

* I gratefully acknowledge support from Glendon College (York University, Toronto) and the Social Sciences and Humanities Research Council of Canada in the form of travel grants awarded in April and June 2005, respectively. I also wish to thank one anonymous reviewer for helping me to address, and I hope resolve, a number of shortcomings in a previous version of this paper.

38

Philippe Bourdin

In (1) a verb originally meaning ‘come’ is used to describe, not motion in space, but rather motion in time from one event to the next. That the ‘come’ verb here does not refer to motion in space is borne out by the fact that it combines with a verb of ‘going:’ it would clearly make no sense for the same motion event to be simultaneously oriented towards the speaker’s location and away from it. The semantic function encoded by ‘come’ in (1) is encoded in (2) by a verb meaning ‘go.’ It is reasonable to conclude that we are dealing here with a specific grammaticalization pathway. The source meaning of the markers involved is ‘motion in space oriented with respect to the speaker’s location,’ for short directional deixis. In turn, the target meaning is ‘textual sequentiality,’ i.e. the temporal sequencing of an event B with respect to a previous state of affairs A. Typically, though not necessarily, sequential markers are found in narratives, where their function is to propel the story forward, most commonly by specifying B as being temporally contiguous with A. Occasionally, B is further specified as a logical consequence of A: the marker then encodes consecution as well as sequentiality. It is possible to draw a conceptual link between actual sequential markers and markers which fulfil such discursive functions as highlighting the unexpected nature of event B or making salient its newness or its status as a turning point in the narrative. Indeed, while suddenness or counter-expectation is the diametric opposite of logical consecution, it too presupposes a connection between event B and a previous state of affairs. In no language is this more obvious than in Kera; consider example (3).

(3) Kera (Afro-Asiatic, Chadic, East; Cameroon and Chad) a. […] bəŋ a gər‑a giidə kiiri‑ŋ‑a. consecutive[=came] in.order.to put-3f in bean-def-loc ‘[They set out to find a trap…]. They then set it up in the beans.’ (Ebert 2003: 115) b. […] ɗeŋ a gol dur‑u counterconsecutive[=went] in.order.to see eye-poss.3m kənəŋgo aska‑ŋ paapa pəda. now fish-def neg not.any.longer ‘[Hyena hung around chatting…]. Suddenly she saw that the fish was no longer there.’ (Ebert 2003: 115)

While ‘come’ in (3a) signals that the event flows logically from what has happened just before, ‘go’ in (3b) indicates that it was totally unexpected. To the extent that consecution and unexpectedness are two sides of the same conceptual coin, the view of the latter as an avatar of textual connectivity, albeit perhaps a peripheral one, is not unwarranted. Example (4) illustrates another subtype of connectivity which is tangential to the notion of temporal/logical sequentiality.

On the grammaticalization of ‘come’ and ‘go’

(4) In London he gradually took over the Underground system and came to control every line except the Metropolitan (British National Corpus; EDN 2051)

Come here has a subsequential or rather culminative interpretation, as it configures an event as the end-point of a preparatory process.1 Given that sequentiality is only the most basic component of their semantics, it is best to use another term to comprehend the functional range of the markers documented in this paper: the phrase textual connectivity seems both sufficiently general and metaphorically apt. Connectives are only one of the types of markers used to hold a text together; the anaphoric pronoun is another type which comes to mind. The term sequential will be used in the glosses when no discursive function other than temporal sequencing is involved; otherwise the term connective will be used. The underlying assumption is that sequentiality forms the core of textual connectivity, whereas foregrounding, culminativity and counter-consecution are more peripheral or borderline subtypes. It might be argued, for instance, that exponents of unexpectedness properly belong among the mirative markers which encode the speaker’s surprise. It would be problematic, however, to exclude them from the broad category of textual connectives on this basis. This is because in a language like Kera we would have to rule in the ‘come’-derived marker of sequential consecution while ruling out the ‘go’-derived counter-consecutive marker. At this exploratory stage in my research, I believe that such differential treatment would be directly counter-intuitive, if not arbitrary.

2. The crosslinguistic scope of the phenomenon The data I have collected are drawn from a sample of 64 languages.2 In Tables 1 and 2 (see Appendix), language stocks and the languages within each stock are listed in accordance with the conventions followed in the fifteenth edition of Ethnologue (2005). Table 1 lists those languages which have grammaticalized a directional deictic into a marker encoding sequentiality, mainly or exclusively. Question marks (e.g. ‘go’ (?)) point to uncertainty regarding the etymology of a given 1. The term culmination is borrowed from Rowlands (1969: 66). 2. Overall, the number of languages which I have looked at is well in excess of 200. However, given the preliminary nature of the study and the fact that the phenomenon being investigated does not rank among those which are commonly recognized, let alone systematically explored, by reference grammars, no attempt has been made to keep a record of those languages for which there are currently no data available to suggest the encoding of textual connectivity by ‘come’ and/or ‘go.’

39

40

Philippe Bourdin

marker or, in the case of Hittite, the semantics of the markers. Table 2 lists languages in which the semantic value of the markers involves peripheral connectivity, namely culminativity, foregrounding or unexpectedness.3 While this is not a statistically representative sample of the languages of the world, either genetically or areally, it is worth observing that the grammaticalization of ‘come’ and ‘go’ into sequentials is disproportionately widespread in African languages: the four language stocks originating in Africa are represented, and within each of the three largest the phenomenon is present in a number of branches. This is particularly true of Niger-Congo, with all the main branches represented. Cases outside of Africa are far more scattered. However, the fact that nine distinct stocks are represented in Table 1 alone suggests that we are dealing with a bona fide grammaticalization pathway with fairly strong crosslinguistic foundation. Indeed, it has been acknowledged as such by Heine and Kuteva (2002: 68–69, 156–157) and, in a more allusive fashion, by Traugott (1978: 383–384) and Marchese (1986: 145).

3. The grammaticalization cline Grammaticalization is not, except in a very general sense, a unitary concept. Instead, it is usually characterized as a cluster of formal and semantic properties (see, for instance, Lehmann 1995). It is not surprising, therefore, to find that the phenomena discussed here are not amenable to any hard and fast criteria which would demarcate grammaticalized markers of connectivity from non-grammaticalized lexemes. For that reason, I propose to analyse what appear to be the most distinctive semantic and morphosyntactic features of what is commonly known as the grammaticalization cline (cf. Fischer et al. 2004), specifically in relation to markers of textual connectivity originating from directional deictics.

3.1 Semantic aspects Textual connectivity is intuitively more abstract than motion in space, which is exactly the sort of semantic change to be expected when a marker undergoes grammaticalization. While this semantic change is a necessary condition for grammaticalization to take place, the interpretation of the relevant item sometimes leaves room for ambiguity. Thus, in (5) it is possible to interpret ‘go’ either as a verb of 3. Another occasional source of uncertainty is the fact that two different authors may disagree as to the proper analysis of a given marker. This explains why Yoruba is listed in both Table 1 and Table 2.

On the grammaticalization of ‘come’ and ‘go’

motion in space or as a connective, a kind of semantic coexistence which is common in other grammaticalization pathways.

(5) Neo-Aramaic (Afro-Asiatic, Semitic; Israel) zelli zaraʕli. go.prs.3sg.m sow.prs.3sg.m>3sg.m ‘He went to sow it.’/‘Then he sowed it.’ (Correll 1978: 81)

More subtle and diffuse is the kind of semantic retention exemplified in (6) for Izi. Longacre (1990) observes that je is mostly used as a sequential when the previous verb refers to motion in space. This suggests that, while it has faded somewhat, its original meaning as a verb of motion has not disappeared completely.

(6) Izi (Niger-Congo, Benue-Congo; Nigeria) bahu l’oma ulo je walk.in.pst inside house sequential[=go].pst awatakwa akuku. begin.sequential beat.the.drum ‘He walked into the house and began to drum.’ (Longacre 1990: 149)

At the opposite end of the cline stands the semantic trajectory followed by the Catalan verb anar, ‘go,’ as reconstructed by Detges (2004). In Old Catalan, anar developed into a narrative connective emphasizing the unforeseen nature of the event. Subsequently, it lost this narrative function to become a full-fledged past tense auxiliary. This particular route happens to be paralleled by the semantic evolution of ‘go’ and ‘come’ in spoken Arabic, a fact which an inter-dialectal comparison will bear out. Examples (7a–b) and (8) illustrate the verb ‘go’ in spoken Egyptian Arabic and spoken Lebanese Arabic respectively.

(7) Spoken Egyptian Arabic (Afro-Asiatic, Semitic) a. rα(α)ḥ Dαrαb-ni. counterconsecutive[=go].prf.3sg.m strike.prf.3sg.m-obj.1sg ‘He suddenly struck me.’ (Mitchell and al-Hassan 1994: 77) b. ʔāl waʔʔaf il-filūka say.prf.3sg.m stop.imp.2sg.m def-boat ruḥ-na muwaʔʔaf-īn. sequential[=go].prf-1pl stop.active.ptcp-pl ‘He said: “Stop the boat!” So we stopped.’ (Brustad 2000: 199)

(8) Spoken Lebanese Arabic (Afro-Asiatic, Semitic) ʕād reḥ-na šef-nɛ̄-h mbɛ̄reḥ. again pst[=go].prf-1pl see.prf-1pl-obj.3sg.m yesterday ‘We saw him again yesterday.’ (Feghali 1928: 8)

41

42

Philippe Bourdin

While in (7a) ‘go’ combines with a verb in the perfect and highlights the unforeseen nature of the event, in (7b) it combines with a verb in the participle and specifies the event as a consequence of the event described by the previous sentence. In turn, ‘go’ in example (8) no longer encodes sequentiality, but signals merely that the event is situated in the past. Much the same contrast obtains between the use of ‘come’ as a sequential in varieties such as Bedouin Arabic (cf. Rosenhouse 1984: 122) and its use as an index of past time in spoken Lebanese Arabic (cf. Feghali 1928: 8).

3.2 Morphosyntactic aspects The grammaticalization of ‘come’ and ‘go’ into sequentials exhibits a wide range of morphosyntactic variation across languages. At one extreme stand languages like Neo-Aramaic, exemplified in (5) above, in which ‘go’ has retained all of its formal attributes as a verb. Zulu is especially interesting in this regard because ‘come’ has followed two distinct grammaticalization pathways, both illustrated in example (9).

(9) Zulu (Niger-Congo, Bantu (S40); South Africa) si-zo-libala si-ze si-bone 1pl-fut[
As a future marker, ‘come’ no longer behaves as a verb, but as an affix; as a sequential, by contrast, it retains its verbal characteristics, such as the ability to have a pronominal prefix of its own and to carry subjunctive inflection. At the other extreme stand a number of Bantu languages including Swahili and Rundi. These languages have the so-called subsecutive affix -ka-, which is possibly based on a Proto-Bantu verb meaning ‘go.’ The affix is so fully integrated in the tense system that it has been completely stripped of both its putative original semantics and the verbal status which went with it. Likewise, the narrative/sequential marker sì of Supyire, a Gur language of Mali, is subject to extensive phonological mutations which obscure its probable origin as the imperfective form of a verb meaning ‘go’ (cf. Carlson 1994: 348–349). Among the various types of syntactic structure which support ‘come’- and ‘go’derived sequentials, auxiliation and serialization are especially favoured. They correspond to an intermediate stage along the grammaticalization cline because they imply retention of the marker’s verbal attributes together with paradigmaticization (cf. Lehmann 1995: 135–136). Typical of the auxiliation route is the behaviour

On the grammaticalization of ‘come’ and ‘go’

of the ‘go’-derived connective in Mari, which auxiliates a lexical verb in the converbal, i.e. non-finite, form, as in (10). (10) Mari (Uralic; Russia) […] koδ-šo boepripas puδešt remain-active.ptcp ammunition explode.cvb kaj-yš. counterconsecutive[=go]-pst.3sg ‘[When the partisans had finished taking away from the village all that was necessary,] the remaining ammunition [all of a sudden and unexpectedly] exploded.’ (Timofeeva 1959: 139) Serialization, broadly defined as involving a sequence of verbs which (i) share a single argument and (ii) are not linked by any marker of coordination or subordination (cf. Crystal 2003: 416), would appear to have been the preferred route in a number of African languages and creoles. Consider (11) and (12). (11) Yaayuwee, dialect of Gbaya (Northwest) (Niger-Congo, Adamawa-Ubangi; Cameroon) mέ nɛɛ dé há̰ gk mέ yɔ̃ŋ am. 2sg sequential[=go].prf make prep snake prep bite 1sg ‘Then you arranged for a snake to bite me.’ (Noss 1973: 158) (12) Guinea Bissau Crioulo (Portuguese-based creole; Guinea-Bissau) i bay mansoa, i biŋ bay bisaw, he go Mansoa he sequential[=come] go Bissau i biŋ biŋ li. he sequential[=come] come here ‘He went to Mansoa, then he went to Bissau, and then he came here.’ (Wilson 1962: 25) There are also languages in which the sequential marker, while still recognizable as a verb, is well on its way to morphological fossilization. Thus, in Bagiro, sequential ‘come’ has become impervious to subject agreement. Consider example (13). (13) Bagiro (Nilo-Saharan, Bongo-Bagirmi; Central African Republic) mbùnzú↓ gɛ̄ ɗe tūgyué white pl sequential[
43

44

Philippe Bourdin

Likewise, in Bedouin Arabic of the Sinai, sequential ‘go’ is described as an invariable particle, rather than as a verb (cf. De Jong 2000: 570). Morphological fossilization is sometimes compounded by syntactic rigidification. In Jakalteko, a Mayan language of Guatemala, sequentiality is marked by the morpheme cat (probably based on the imperative of the verb tita ‘come’) which occurs in constructions subject to severe constraints; for example, it is incompatible with negation and is not allowed when the verbs which it links together are in the past tense (cf. Day 1973: 32). In Wari’, the verb mao, which originally meant ‘go,’ is the core of a highly constrained construction, dubbed by Everett and Kern (1997: 73) a “verbalized sequential sentence.” The syntactic pattern involved is illustrated by example (2) above, where mao, which forms a “derived predicate” with the lexical verb preceding it and the “inflectional clitic” following it, is formally invariable, and the only “verb” with which it is in free variation is ’ac, which originally meant ‘travel.’ The syntax of the verb ‘go’ in Boko is equally constrained. It takes on sequential value only in temporal clauses and in conjunction with a comitative suffix attached to the verb following, as in (14). (14) Boko (Niger-Congo, Mande; Benin and Nigeria) ´ma ́ʔi mi ʔe ma ´nɛ ´gɛ su‑̀o. 1sg.fut water drink until poss.1sg child sequential[=go] return-com ‘I will drink water until my child returns.’ (Jones 1998: 260)

4. Possible triggers for grammaticalization 4.1 The role of deicticity Why are ‘come’ and ‘go’ predisposed to become markers of sequentiality and textual connectivity? First of all, it must be noted that there is of course nothing inevitable about this process, since it is not attested in the majority of languages. Secondly, languages exhibit the same variability when it comes to choosing between ‘come’ and ‘go’ as a sequential marker as they do when it comes to choosing either as a future marker (cf. Bourdin 2000), a passive marker (cf. Bourdin 2004a: 399ff) or a marker of continuous or progressive aspect (Bourdin 2004b). This may mean either that deicticity is not a factor in the process of grammaticalization or that ‘come’ and ‘go’ do not produce quite the same type of sequential. There are arguments to support both of these claims. As mentioned in Section 3.2 above, in Wari’ the verb mao ‘go’ alternates as a sequential with a verb meaning ‘travel.’ Likewise, sequentiality is encoded in Puluwat, a Micronesian language, by the doubling of the non-deictic motion verb

On the grammaticalization of ‘come’ and ‘go’

fáyl (cf. Elbert 1974: 100). In turn, in Southern Sotho, a Bantu language of Lesotho, it is encoded by the verbs tlōha ‘leave’ and tšōha ‘exit’ (cf. Doke and Mofokeng 1957: 294–295; Mabille and Dieterlen 1961: 552). Finally, in Thai, it is marked by thy̌ ŋ ‘arrive’ (cf. Bisang 1992: 386). Clearly, in these particular languages deicticity would appear to be irrelevant. It would not be justified, however, to extrapolate this finding to languages in general. The data in Table 2 reveal an intriguing trend: most languages using a directional deictic to encode suddenness or counter-consecutiveness employ ‘go.’ The case of Tupuri is a good example; see (15). (15) Tupuri (Niger-Congo, Adamawa-Ubangi; Cameroon and Chad) mïŋgïr ràw ʔaagee tú. lion counterconsecutive[ and the semantics of ‘come’-type markers by the schema < directed motion + identity >. How otherness is manifested linguistically depends on the notional configuration involved. In examples such as (16) below, otherness takes the form of narrative dissonance, in other words the narrative takes a turn other than that which the flow of events thus far should have led to. The semantic contrast illustrated in (3a–b) above is also suggestive: it is reasonable to assume that it is by virtue of their original deictic values that ‘come’ and ‘go’ in Kera have evolved into markers of consecution and counter-consecution, respectively.

4.2 Non-deictic scenarios There are no doubt limits to the explanatory power of the line of reasoning outlined in the preceding section. Indeed, in few languages is it possible to reconstruct with any degree of certainty the actual diachronic route which ‘come’ and ‘go’ have travelled as they evolved into sequentials. Nonetheless, there is some crosslinguistic evidence of at least two possible scenarios in which deicticity is not directly involved. These are discussed below.

4.2.1 The iteration scenario In a number of languages a ‘come’- or ‘go’-derived connective refers back to a previous occurrence of the corresponding verb used with its lexical meaning, namely

45

46

Philippe Bourdin

as a verb of motion in space. Saramaccan and Kxoe offer typical instances of this pattern; cf. (16) and (17). (16) Saramaccan (English-based creole; French Guiana and Suriname) dí wómi gó a saamáka sitáati, gó bái soní. def man go prep Saramacca street sequential[=go] buy thing ‘The man went to Saramacca Street and made purchases.’ (Boretzky 1983: 170) (17) Kxoe (Khoisan; Namibia) taátenu ci yáa-tè //’áxa‑m okà n//à‑mà úàn‑m ̀. then proceed come-prs chief-3sg.m loc dem-3sg.m hare-3sg.m tàxùátenu n//à‑mà yàá then dem-3sg.m foregrounding.sequential[=come] ’à. |x’ɛci-tè n|í‑//ua regale-prs dem-3pl.m obj ‘Then he [the hyena] returned to the chief, that is, the hare. Then he [the hare] regaled them.’ (Heine 2000: 279) In (16), the second occurrence of gó fulfils a stylistic function which consists of establishing a connection with what has been said before (Boretzky 1983: 170).4 The link with antecedent gó here may be viewed as an iconic reflex of narrative cohesion. As for example (17), it illustrates in Kxoe what Heine (2000) claims corresponds to an intermediate stage in the grammaticalization of the verb yàá ‘come.’ Following the terminology used by Ebert (1987: 53–54), Heine invokes an “anaphoric link” between the two occurrences of ‘come.’ According to him, this link initially involved iterated, and redundant, reference to the motion event which preceded event A and which ipso facto also preceded event B. This iterated reference was the original impetus for the grammaticalization of ‘come.’ At a later stage, which happens to be illustrated by (17), the second occurrence of ‘come’ is an example of coexistence, or overlap, between the original lexical meaning and the new grammaticalized meaning: while yàá does refer back to the motion event immediately anterior to event A (i.e. the coming of the hyena, not that of the hare), it simultaneously functions as a connective which foregrounds the newness of event B and as such may be glossed “watch out, now something new is going to happen that is relevant for what follows” (Heine 2000: 274). At a more advanced stage of grammaticalization, yàá — or, for that matter, cìí ‘go’ — functions as a stand-alone foregrounding connective, without any antecedent and without reference to a pre4. In his own words, “[Das wiederholte Verb ist] ein stilistisches Mittel des Anschlusses an das vorher Gesagte.”

On the grammaticalization of ‘come’ and ‘go’

vious motion event. Crucially, Heine shows that the three successive stages in what he terms the grammaticalization chain are synchronically coexistent, so that the diachronic route followed by the relevant markers can be read off from structures which are all actually attested in the contemporary language. While the particular route travelled by ‘come’ and ‘go’ in Kxoe is likely to have been in some measure sui generis, there is fairly substantial crosslinguistic evidence that genetically and geographically unrelated languages do make iterated use of their markers of directional deixis for purposes which include textual cohesion. According to Ebert (2003: 116–117), Margi is one of a number of Chadic languages which behave in precisely this way (cf. example (18) below) and so do such diverse languages as those illustrated in (19) to (21). (18) Margi (Afro-Asiatic, Chadic, Biu-Mandara; Nigeria) mái dǝ́rì, kǝ̀ nyà-rá tsía, kyàntǝ́kía. go with.it sbjv you-go kill sbjv.1sg.divide ‘Go with it, that you [go and] kill it, that I divide it.’ (Ebert 2003: 117) (19) Yakoma (Niger-Congo, Adamawa-Ubangi; Central African Republic) […] à gá ̀í à gā ̀lú 3pl realis.come with.that 3pl come realis.bury gè kɔ̄dɔ̄rɔ̄. here.at village ‘[… they find her skull, they take it,] they bring it back and bury it in the village.’ (Boyeldieu 1995: 138) (20) Tsonga (Niger-Congo, Bantu (S50); Mozambique) mi-ta-ta mí-ta-nghéna ándlwíni. 2pl-fut[
5. That the third occurrence of -ta- in this example is a ventive affix, and not a second occurrence of the ‘come’-derived future marker, is borne out by the fact that the two verbal forms may coalesce, with the result that the three occurrences of ta are contiguous: mi-ta-ta-ta-nghéna (Sitoe 2001: 107).

47

48

Philippe Bourdin

While these facts are suggestive and lend typological weight to the iteration scenario, there are inherent dangers in a lumping approach which would gloss over some very real differences between the various structures and putative diachronic pathways involved. The first type of difference concerns the shape and morphosyntactic status of the items in a relationship of iteration. In some languages, like Saramaccan, they both have verbal status and are formally identical. In Yakoma, however, the level tone on gā is symptomatic of its auxiliary status, while the high tone on antecedent gá indicates that it is a full verb. In Tsonga and Mixtec, there is formal identity, or near identity, between the two items, but while one behaves like a verb, the other is an affix. In the case of Margi, they both have verbal status but represent different forms of the same verb; as a result, the iteration is semantic rather than formal. The Saramaccan construction in (16) above instantiates what has been termed the double-go construction by Sebba (1987: 193), who observes that a parallel double-come construction is attested in Sranan, another English-based creole of Suriname. In those languages, there does not appear to be any possibility of a third, hybrid construction built on the pattern or . However, in at least one of the languages mentioned above, namely Yakoma, iteration is only one variant of a broader phenomenon involving the coexistence, within a given sentence, of the same motion verb or else of motion verbs which are deictically divergent. Thus, example (19) above, where ‘come’ is iterated, stands in contrast with (22) below, which instantiates a < ‘come’ … ‘go’ > pattern, where ‘come’ is presumably a grammaticalized marker of sequentiality and ‘go’ a fully lexical item. (22) Yakoma (Niger-Congo, Adamawa-Ubangi; Central African Republic) ʔálà gā à nṍ tí they connective[

On the grammaticalization of ‘come’ and ‘go’

(23) Akan (Niger-Congo, Kwa; Ghana) mé‑kɔɔ nkran kɔ‑tɔɔ ntamá. 1sg-go Accra go-buy cloth ‘I went to Accra in order to buy cloth and I did.’ (Sebba 1987: 193) On the other hand, purposiveness seems to have played a somewhat elusive role in the grammaticalization of iterated yàá and cìí into connectives in Kxoe: while Heine (2000: 279) gives an example in which cìí functions as a connective-cumpurposive, it is clear from the meaning of (17) above that the second occurrence of yàá does not encode purposiveness.

4.2.2 The futurity scenario There is an implicit link between narrative sequentiality and deictic subsequence. Events A and B stand in a sequential relationship if B occurred later than A. Likewise, a future event is by definition one which will occur later than the time of speech. This has prompted Marchese (1986: 142) and Kihm (1994: 108ff) to make the explicit claim of a direct link between grammaticalization as a future marker and grammaticalization as a sequential. Thus, Kihm does not differentiate the use of the verb ‘come’ as a sequential in Guinea-Bissau Crioulo from its use as a future marker. In turn, according to Marchese, the ‘come’-derived sequential of Godié is a diachronic by-product of its ‘come’-derived future. That being the case, the sentences in (24a–b) correspond to two successive stages along the same grammaticalization pathway. (24) Godié (Niger-Congo, Kru; Ivory Coast) a. ɔ yi lɨ. he fut[=come] eat ‘He will eat.’ (Marchese 1986: 126) b. nʉ́ wa yi m giyeé-nɛdɛ̀ mʉ. and 3pl sequential [=come] there ocean.relator-middle go ‘And they went right into the middle of the ocean.’ (Marchese 1986: 225) Neither Kihm nor Marchese, however, have provided any solid proof to substantiate their claim of a direct link between futurity and sequentiality. Such evidence may well be provided by Mere and Sara. In these two dialects of Gula, futurity is encoded by a set of virtual pronouns and, according to Nougayrol (1999: 126–127), their vocalic segment is derived from the verb ē/é¯ ‘come,’ as shown in (25). Crucially, the virtual pronouns have gone on to evolve into sequential markers, as can be seen in (26).

49

50

Philippe Bourdin

(25) Gula Sara (Nilo-Saharan; Central African Republic) […] ī-tɔ̄bɔ̄ 3sg.virtual[
6. I am grateful to an anonymous reviewer for this observation.

On the grammaticalization of ‘come’ and ‘go’

5. Concluding observations This study has left unanswered a number of questions which warrant the investigation of a broader data base as well as further, detailed research into specific languages. Thus, there is a clear need to gain greater insight into the diachronic pathways which have led to the genesis of ‘come’- or ‘go’-derived textual connectives. Because the paucity of historical data available to typologists is the rule rather than the exception, this is a difficult, though not insuperable, task. Another issue which needs to be explored involves the boundaries of textual connectivity: should markers of foregrounding or unexpectedness be considered to belong in the same broad functional category as sequentiality, or is it empirically and theoretically more defensible to delineate a restrictive definition of connectivity? Without prejudice to these and other questions, it seems appropriate to set down the basic assumption which has guided my work on the topic at hand and some related issues. The grammaticalization of ‘come’ and ‘go’ into sequentials is yet another instance of the metaphorical mapping of space on time which also underlies their grammaticalization into future markers (cf. Bourdin 2000) and into markers of continuous or progressive aspect (cf. Bourdin 2004b). As the term storyline indicates, a narrative text constitutes a structured space made up of a main path against a more or less static background. The path itself can be thought of as a succession of segments which are by definition goal-oriented — a reality which no morphemes are better suited to encode than the directional deictics ‘come’ and ‘go.’ To that extent, the languages which have been referred to in this paper do not so much confer directionality upon the text as recognize, sanction and even mimic the directionality already built into it. The question then is no longer why some languages use ‘come’ and ‘go’ as sequentials but rather why the majority of languages do not. It is a basic question and one which is to a certain extent obscured by the notion of grammaticalization pathway. What empirical studies like the present one suggest is that, though such pathways do exist, by and large the default option in a language is for any one of them to go untravelled. Why this should be so is an issue which will need to be addressed as the diachronic processes subsumed by grammaticalization receive further empirical attention. Another issue which is raised specifically by the semantic reanalysis of ‘come’ and ‘go’ is the interplay of deicticity with grammaticalization. There are at least two paradoxes here. One has already been mentioned: there does not seem to be a grammaticalization pathway which is travelled only by ‘come’ or only by ‘go’ in all languages. In other words, if in some languages ‘come’ has evolved into a specific type of marker, it is safe to predict that in a number of other languages, and in-

51

52

Philippe Bourdin

deed sometimes in the same languages, it is ‘go’ that will be shown to have followed that particular route. The other paradox concerns the mapping of spatial deixis on temporal deixis. Although their grammaticalization into markers of future tense is far more frequent, ‘come’ and ‘go’ have also provided markers of past tense in a number of languages. It would appear that, in at least some of those languages, such as Catalan, certain varieties of Arabic and possibly a number of Bantu languages, the motivation for this lies in their prior grammaticalization into connectives, followed by the routinization of their use in narrative contexts. The paradox, however, is that there may well also be a diachronic link between the use of ‘come’ and ‘go’ as sequentials and their evolution into future markers. In that event, sequentiality will have been involved in two stories with opposite endings.

Abbreviations 1 first person 2 second person 3 third person 3>3 agent in the third person, patient in the third person com comitative cvb converb def definite dem demonstrative f feminine fut future imp imperative loc locative m masculine

n neg nmlz obj pl poss prep prf prs pst ptcp sbjv sg

neuter negation nominalizer, nominalization object plural possessive preposition perfect present past participle subjunctive singular

References Aaron, Uche. 1999. Tense and aspect in Obolo grammar and discourse. Dallas TX: The Summer Institute of Linguistics and the University of Texas at Arlington. Alexandre, R. P. 1953. La langue möré. Vol. 2: Dictionnaire moré-français. Dakar: Institut Français d’Afrique Noire. Arnold, Werner. 1990. Das Neuwestaramäische: V. Grammatik. [Semitica Viva 4/5]. Wiesbaden: Otto Harrassowitz. Beavon, Keith H. 1991. Kɔɔzime verbal system. In Stephen R. Anderson & Bernard Comrie (eds.), Tense and aspect in eight languages of Cameroon. Dallas TX: The Summer Institute of Linguistics and the University of Texas at Arlington: 47–103. Bisang, Walter. 1992. Das Verb im Chinesischen, Hmong, Vietnamesischen, Thai und Khmer (Vergleichende Grammatik im Rahmen der Verbserialisierung, der Grammatikalisierung und der Attraktorpositionen) [Language Universals Series 7]. Tübingen: Gunter Narr.

On the grammaticalization of ‘come’ and ‘go’

Boretzky, Norbert. 1983. Kreolsprachen, Substrate und Sprachwandel. Wiesbaden: Otto Harrassowitz. Bourdin, Philippe. 2000. À propos des futurs ventifs et itifs: Remarques sur un paradoxe. Verbum 22(3): 293–311. Bourdin, Philippe. 2003. On two distinct uses of go as a conjoined marker of evaluative modality. In Roberta Facchinetti, Manfred Krug & Frank Palmer (eds.), Modality in contemporary English [Topics in English Linguistics 44], 103–127. Berlin: Mouton de Gruyter. Bourdin, Philippe. 2004a. Espace, temporalité, personne: Parcours en linguistique anglaise, française et générale. Thèse de Doctorat d’État, Université Paris XIII. Bourdin, Philippe. 2004b. La resémantisation des directionnels déictiques comme marqueurs aspectuels du «transcursif»: Vers une typologie. In Jiri Mirovsky, Anna Kotesovcova & Eva Hajicova (eds.), Proceedings. XVII International Congress of Linguists. Prague, July, 2003. Prague: Faculty of Mathematics and Physics, Charles University (on CD-Rom, Section W7). Bourdin, Philippe. 2004c. On the ‘great modal shift’ sustained by come to VP. Paper presented at the Second International Conference on Modality in English. Université de Pau, September 2004. Bourquin, Walther. 1913–1914. Adverb und adverbiale Umschreibung im Kafir. Zeitschrift für Kolonialsprachen 4: 118–155. Boyeldieu, Pascal. 1975. Études yakoma langue du groupe Oubanguien (R. C. A.). Morphologie Synthématique. Paris: Société d’études linguistiques et anthropologiques de France. Boyeldieu, Pascal. 1995. Le yakoma. In Raymond Boyd (ed.), Le Système verbal dans les langues oubanguiennes [Lincom Studies in African Linguistics 07], 113–139. München: Lincom. Boyeldieu, Pascal. 2000. La langue bagiro (République Centrafricaine) [Research in African Studies 4]. Bern: Peter Lang. Brustad, Kristen E. 2000. The syntax of spoken Arabic. A comparative study of Moroccan, Egyptian, Syrian and Kuwaiti dialects. Washington DC: Georgetown University Press. Carlson, Robert. 1994. A grammar of Supyire [Mouton Grammar Library 14]. Berlin: Mouton de Gruyter. Caubet, Dominique. 1995. Ža, element narrative dans le récit familier en arabe marocain. Studia Orientalia 75: 41–48. Chhangte, Lalnunthangi. 1989. The grammar of simple clauses in Mizo. In David Bradley (ed.), Papers in South-East Asian linguistics n° 11: South-East Asian syntax [Pacific Linguistics, Series A-77], 93–174. Canberra: Research School of Pacific and Asian Studies, The Australian National University. Correll, Christoph. 1978. Untersuchungen zur Syntax der neuwestaramäischen Dialekte des Antilibanon [Abhandlungen für die Kunde des Morgenlandes 44]. Wiesbaden: Kommissionsverlag Franz Steiner. Crystal, David. 2003. A dictionary of linguistics and phonetics. 5th edn. Oxford: Blackwell. Davis, Karen. 2003. A grammar of the Hoava language, Western Solomons [Pacific Linguistics 535]. Canberra: Research School of Pacific and Asian Studies, The Australian National University. Day, Christopher. 1973. The Jacaltec language [Language Science Monographs 12]. Bloomington IN: Indiana University Publications. De Jong, Rudolf. 2000. A grammar of the Bedouin dialects of the Northern Sinai littoral [Handbuch der Orientalistik. Erste Abteilung: Der nahe und mittlere Osten 52]. Leiden: Brill.

53

54

Philippe Bourdin

Detges, Ulrich. 2004. How cognitive is grammaticalization? The history of the Catalan perfet perifràstic. In Olga Fischer, Muriel Norde & Harry Perridon (eds.), Up and down the cline: The nature of grammaticalization [Typological Studies in Language 59], 211–227. Amsterdam: John Benjamins. Dez, Jacques. 1980. La syntaxe du malgache. Tome 1. (Thèse de Doctorat d’État, Université Paris VII). Paris: Librairie Honoré Champion. Disterheft, Dorothy. 1986. The structure and diachrony of consecutives. Diachronica 3(1): 1–14. Doke, Clement Martyn & S. Machabe Mofokeng. 1957. Textbook of Southern Sotho grammar. London: Longman. Drame, Mallafé. 1983. Axes de la morphologie du mandinka [Les langues nationales au Sénégal 3]. Dakar: Centre de linguistique appliquée de Dakar. Ebert, Karen. 1987. Discourse function of motion verbs in Chadic. Afrikanistische Arbeitspapiere 10: 53–71. Ebert, Karen. 2003. ‘Come’ and ‘go’ as discourse connectors in Kera and other Chadic languages. In Erin Shay & Uwe Seibert (eds.), Motion, direction and location in languages. In honor of Zygmunt Frajzyngier [Typological Studies in Language 56], 111–122. Amsterdam: John Benjamins. Egner, Inge. 1989. Précis de grammaire wobé [Linguistique XV]. Abidjan: Annales de l’Université d’Abidjan. Elbert, Samuel H. 1974. Puluwat grammar [Pacific Linguistics, Series B-29]. Canberra: Research School of Pacific and Asian Studies, The Australian National University. Everett, Daniel L. & Barbara Kern. 1997. Wari’. The Pacaas Novos language of western Brazil. London: Routledge. Fedson, Vijayarani Jotimuttu. 1987. A note on aspect in Tamil. In Elena Bashir, Madhav M. Deshpande & Peter Edwin Hook (eds.), Select Papers From Sala-7: South Asian Languages Analysis Roundtable Conference, Ann Arbor, Michigan, May 1985, 118–131. Bloomington IN: Indiana University Linguistics Club. Feghali, Michel. 1928. Syntaxe des parlers arabes actuels du Liban [Bibliothèque de l’École des langues orientales vivantes 9]. Paris: Librairie orientaliste Paul Geuthner. Fischer, Olga, Muriel Norde & Harry Perridon (eds.). 2004. Up and down the cline: The nature of grammaticalization [Typological Studies in Language 59]. Amsterdam: John Benjamins. Frajzyngier, Zygmunt. 2002. A grammar of Hdi. With Erin Shay. Berlin: Mouton de Gruyter. Gordon, Raymond G., Jr. (ed.). 2005. Ethnologue: Languages of the world. 15th edn. Dallas TX: The Summer Institute of Linguistics. Hahn, Reinhard F. 1991. Spoken Uyghur. With Ablahat Ibrahim. Seattle WA: University of Washington Press. Heine, Bernd. 2000. On the rise of new-event markers in Kxoe. In Rainer Voßen, Angelika Mietzner & Antje Meißner (eds.), Mehr als nur Worte…: Afrikanistische Beiträge zum 65. Geburtstag von Franz Rottland, 269–284. Köln: Rüdiger Köppe. Heine, Bernd & Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge: CUP. Jones, Ross McCallum. 1998. The Boko/Busa language cluster [Lincom Studies in African Linguistics 30]. München: Lincom. Judaxin, Konstantin Kuzmič. 1965. Kirgizsko-russkij slovar’. Moscow: Izdatel’stvo Sovetskaja èntsiklopedija. Keesing, Roger M. 1991. Substrates, calquing and grammaticalization in Melanesian pidgin. In Elizabeth Closs Traugott & Bernd Heine (eds.), Approaches to grammaticalization. Vol. 1. [Typological Studies in Language 19], 315–342. Amsterdam: John Benjamins.

On the grammaticalization of ‘come’ and ‘go’

Kihm, Alain. 1994. Kriyol syntax. The Portuguese-based creole language of Guinea-Bissau [Creole Language Library 14]. Amsterdam: John Benjamins. Lafon, Michel. 1995. Le Shona et les Shonas du Zimbabwe: Recueil d’informations sur la langue et la culture. Gweru (Zimbabwe): Mambo Press, & Paris: L’Harmattan. Lehmann, Christian. 1995. Thoughts on grammaticalization. München: Lincom. Longacre, Robert E. 1990. Storyline concerns and word order typology [Supplement to Studies in African Linguistics 10]. Los Angeles CA: The James S. Coleman African Studies Center and the Department of Linguistics, UCLA. Luraghi, Silvia. 1993. Verb serialization and word order: Evidence from Hittite. In Henk Aertsen & Robert J. Jeffers (eds.), Historical linguistics 1989. Papers from the 9th International Conference on Historical Linguistics, Rutgers University, August 1989, 267–281. Amsterdam: John Benjamins. Mabille, Adolphe & Hermann Dieterlen. 1961. Southern Sotho-English dictionary (new edn. revised by R. A. Paroz). Morija (Lesotho): Morija Sesuto Book Depot. Marchese, Lynell. 1986. Tense/Aspect and the development of auxiliaries in Kru languages. Dallas TX: The Summer Institute of Linguistics and the University of Texas at Arlington. Maurer, Philippe. 1995. L’Angolar: Un créole afro-portugais parlé à São Tomé [Kreolische Bibliothek 16]. Hamburg: Helmut Buske. Mitchell, Terence F. & Shahir A. al-Hassan. 1994. Modality, mood and aspect in spoken Arabic with special reference to Egypt and the Levant [Library of Arabic Linguistics]. London: Kegan Paul. Morphy, Frances. 1983. Djapu, a Yolngu dialect. In Robert M. W. Dixon & Barry J. Blake (eds.), Handbook of Australian languages. Vol. 3, 1–188. Canberra: The Australian National University Press. N’Diaye-Correard, Geneviève. 1970. Études fca ou balante (dialecte ganja). Paris: Société pour l’Étude des Langues Africaines. Nicolle, Steve. 2003. Distal aspects in Bantu. In Katarzyna M. Jaszczolt & Ken Turner (eds.), Meaning through language contrast. Vol. II. [Pragmatics and Beyond, New Series 100], 3–22. Amsterdam: John Benjamins. Noss, Philip A. 1973. Introduction to Gbaya. Meiganga (Cameroon): Centre de traduction Gbaya. Nougayrol, Pierre. 1999. Les Parlers gula. Centrafrique, Soudan, Tchad: Grammaire et lexique. Paris: CNRS Éditions. Poppe, Nicholas. 1962. Uzbek newspaper reader (with glossary) [Uralic and Altaic Series 10]. Bloomington IN: Indiana University. Prost, André. 1964. Contribution à l'étude des langues voltaïques. Dakar: Mémoires de l’Institut Français d’Afrique Noire 70. Quesada, Juan D. 1994. Periphrastische Aktionsart im Spanischen: Das Verhalten einer Kategorie der Übergangszone [European University Studies, Series XXI, Linguistics 144]. Bern: Peter Lang. Reh, Mechthild. 1996. Anywa language. Description and internal reconstructions. [Nilo-Saharan, Linguistic Analyses and Documentation 11]. Köln: Rüdiger Köppe. Rodegem, Firmin M. 1967. Précis de grammaire rundi. Brussels: Éditions scientifiques E. StoryScientia. Rosenhouse, Judith. 1984. The Bedouin Arabic dialects. General problems and a close analysis of North Israel Bedouin dialects. Wiesbaden: Otto Harrassowitz. Rowlands, Evan C. 1969. Teach yourself Yoruba. London: Hodder & Stoughton.

55

56

Philippe Bourdin

Sachnine, Michka. 1997. Dictionnaire yorùbá-français. With Akin Akinyemi. Paris: Éd. Karthala, & Ibadan: IFRA-IBADAN. Samarin, William J. 1970. Sango, langue de l’Afrique centrale. Leiden: Brill. Santandrea, Stefano. 1961. Comparative outline-grammar of Ndogo, Sere, Tagbu, Bai, Bviri [Collana di Studi Africani dei Missionari Comboniani, Museum Combonianum 13]. Bologna: Editrice Nigrizia. Sebba, Mark. 1987. The syntax of serial verbs. An investigation into serialisation in Sranan and other languages [Creole Language Library 2]. Amsterdam: John Benjamins. Sitoe, Bento. 2001. Verbs of motion in Changana. Leiden: Research School of Asian, African, and Amerindian Studies, Leiden University. Small, Priscilla C. 1990. A syntactic sketch of Coatzospan Mixtec. In C. Henry Bradley & Barbara E. Hollenbach (eds.), Studies in the syntax of Mixtecan languages. Vol. 2, 261–479. Dallas TX: The Summer Institute of Linguistics and the University of Texas at Arlington. Taylor, John R. 1987. Tense and metaphorisations of time in Zulu. In Wolfgang Lörscher & Rainer Schulze (eds.), Perspectives on language in performance. Studies in linguistics, literary criticism, and language teaching and learning. To honour Werner Hüllen on the occasion of his sixtieth birthday. Vol. I, 214–229. Tübingen: Gunter Narr. Thompson, Katrina Daly & Antonia Folárìn Schleicher. 2001. Swahili Learners’ Reference Grammar. Madison WI: National African Language Resource Center, University of WisconsinMadison. Timofeeva, V. T. 1959. Pridatočnye predloženija vremenu v Marijskom jazyke. Trudy (Marijskogo Naučno-Issledovatel’skogo Instituta Jazyka, Literatury i Istorii) 13: 137–143. Traugott, Elizabeth Closs. 1978. On the expression of spatio-temporal relations in language. In Joseph H. Greenberg, Charles A. Ferguson & Edith A. Moravcsik (eds.), Universals of human language. Vol. 3: Word structure, 369–400. Stanford CA: Stanford University Press. Velázquez-Castillo, Maura. 2004. Serial verb constructions in Paraguayan Guarani. International Journal of American Linguistics 70(2): 187–213. Westermann, Dietrich. 1907. Grammatik der Ewe-Sprache. Berlin: Dietrich Reimer. Wilson, William A. A. 1962. The crioulo of Guiné. Johannesburg: Witwatersrand University Press. Wolff, Ekkehard. 1983. A grammar of the Lamang language (Gwàɗ Làmàŋ) [Afrikanistische Forschungen 10]. Glückstadt: J. J. Augustin. Ziervogel, D. 1959. A grammar of Northern Transvaal Ndebele. Pretoria: J. L. Van Schaik.

Appendix Table 1. Languages with a sequential derived from ‘come’ or ‘go’ Afro-Asiatic Chadic Hdi: ‘go’ > sequential (Frajzyngier 2002: 428–430) Lamang: ‘go’ (?) > sequential (Wolff 1983: 244–245) Kera: ‘come’ > consecutive sequential; ‘go’ > counter-consecutive sequential (Ebert 2003: 115) Kwaami: ‘come’ > sequential (Ebert 1987: 67) Semitic Western Neo-Aramaic (Semitic; Israel): ‘come’ and ‘go’ > sequentials (Correll 1978: 81; Arnold 1990) Bedouin Arabic of the northern Sinai: ‘go’ > sequential (De Jong 2000: 570) Bedouin Arabic of northern Israel: ‘come’ > sequential (Rosenhouse 1984: 122) Spoken Moroccan Arabic: ‘come’ > foregrounding sequential (Caubet 1995) Australian Dhuwal: ‘go’ > sequential (Morphy 1983: 32, 89) Austronesian Malagasy: ‘come’ > sequential (Dez 1980: 140) Oceanic Hoava: ‘go’ > sequential (Davis 2003: 119–120) Chapacura-Wanham Wari’ (Brazil): ‘go’ > sequential (Everett and Kern 1997: 72–74) Creoles and pidgins Sranan (English-based; Suriname): ‘come’ > anaphoric verb > sequential (Boretzky 1983: 170) Saramaccan (English-based; French Guiana and Suriname): ‘go’ > anaphoric verb > sequential (Boretzky 1983: 170) Melanesian Pidgin (Solomons Islands; English-based): ‘go’ > sequential (Keesing 1991: 332–333) Sango (Ngbandi-based; Central African Republic): ‘come’ and ‘go’ > foregrounding markers in coordinate constructions (Samarin 1970: 90, 123) Guinea Bissau Crioulo (Portuguese-based; Guinea-Bissau): ‘come’ > sequential (Wilson 1962: 5) Indo-European Hittite: ‘come’ and ‘go’ > sequentials (?) (Disterheft 1986: 9–10; Luraghi 1993: 272) Khoisan Kxoe: ‘go’ > sequential meaning ‘later, then’ (Heine 2000: 279); ‘come’ and ‘go’ > iterated verbs > sequential foregrounding the event as new (Heine 2000) Mayan Jakalteko: ‘come’ (?) > sequential (Day 1973: 32)

58

Philippe Bourdin

Table 1. (cont.) Niger-Congo Atlantic-Congo Atlantic Balanta-Ganja: ‘come’ > sequential (N’Diaye-Correard 1970: 110–111) Volta-Congo Benue-Congo Narrow Bantu Koonzime: ‘come’ > sequential (Beavon 1991: 83) Swahili: ‘go’ (?) > sequential (Thompson and Schleicher 2001: 184) Fuliiru: ‘go’ (?) > sequential (Nicolle 2003: 14) Rundi: ‘go’ (?) > sequential (Rodegem 1967: 157) Shona: ‘come’ > sequential/connective encoding subsequentiality or finality (Lafon 1995: 94) Ndebele: ‘go’ > sequential (Ziervogel 1959: 152) Zulu: ‘come’ > sequential subordinator (Taylor 1987: 227) Xhosa: ‘come’ and ‘go’ > sequentials (Bourquin 1913–1914: 143–144) Tsonga: ‘come’ (?) > sequential (Sitoe 2001: 90) Yoruba: ‘come’ > sequential (Sachnine 1997: 278) Izi: ‘go’ > sequential (Longacre 1990: 149) Kru Bété: ‘come’ > sequential (Marchese 1986: 144) Godié: ‘come’ > sequential (Marchese 1986: 143, 225) Siamou: ‘go’ > sequential (Prost 1964: 377) Dewoin: ‘come’ > sequential subordinator (Marchese 1986: 162) Kwa Ewe: ‘go’ > sequential (Westermann 1907: 98) Adamawa-Ubangi Mumuye: ‘come’ and ‘go’ > sequentials (Longacre 1990: 149–150) Gbaya (Northwest): ‘come’ and ‘go’ > sequentials (Noss 1973: 158) Belanda Viri: ‘come’ > sequential (Santandrea 1961: 115) Ndogo: ‘come’ > sequential (Santandrea 1961: 115) Gur Mòoré: ‘go’ > sequential (Alexandre 1953: 393–394, quoted in Heine and Kuteva 2002: 156) Supyire: ‘come’ (?) > sequential in serial construction; ‘go’ (?) > sequential (Carlson 1994: 337, 348ff) Toussian: ‘come’ > sequential (Prost 1964: 305) Mande Boko: ‘go’ > sequential in ‘before’ clauses (Jones 1998: 171, 259–260) Mandinka: ‘come’ and ‘go’ > sequentials (Drame 1983: 75–76) Nilo-Saharan Bongo-Bagirmi Bagiro: ‘come’ > sequential (Boyeldieu 2000: 202) Gula Mere and Gula Sara: ‘come’ > irrealis > sequential (Nougayrol 1999: 128) Nilotic Anuak: ‘come’ > sequential (Reh 1996: 422–424)

On the grammaticalization of ‘come’ and ‘go’

Table 2. Languages with a peripheral connective derived from ‘come’ or ‘go’ Afro-Asiatic Spoken Egyptian Arabic: ‘go’ > marker encoding suddenness (Mitchell and al-Hassan 1994: 77) Altaic Turkic Uyghur: ‘go’ > marker encoding suddenness (Hahn 1991: 614–615) Uzbek: ‘go’ > marker encoding suddenness (Poppe 1962: 167) Kirghiz: ‘go’ > marker encoding suddenness (Judaxin 1965: 381) Creoles and pidgins Angolar (Portuguese-based; São Tomé): ‘come’ > marker encoding subsequentiality/ culminativity (Maurer 1995: 106) Dravidian Tamil: ‘come’ > marker encoding subsequentiality/culminativity (Fedson 1987: 126) Indo-European Germanic English: ‘come’ > marker encoding subsequentiality/culminativity (Bourdin 2004c) Romance Old Catalan: ‘go’ > marker encoding suddenness (Detges 2004: 217) Costa-Rica and Mexican Spanish: ‘go’ > marker encoding suddenness (Quesada 1994: 167) Niger-Congo Volta-Congo Benue-Congo Narrow Bantu Ndebele: ‘come’ > marker encoding finality (Ziervogel 1959: 152) Obolo: ‘come’ > foregrounding marker; duplicated ‘come’ > sequential usually encoding unexpectedness (Aaron 1999: 84–89, 99, 159) Yoruba: ‘come’ > marker encoding subsequentiality/culminativity (Rowlands 1969: 66) Kru Northern Wè: ‘come’ > marker of inevitable consequence in apodosis following protasis (Egner 1989: 124) Adamawa-Ubangi Tupuri: ‘go’ > marker encoding suddenness (Ebert 2003: 116) Nilo-Saharan Jur Modo: ‘go’ > foregrounding marker (Longacre 1990: 107) Sino-Tibetan Mizo: ‘come’ > marker encoding unexpected change of state (Chhangte 1989: 119) Tupi-Guarani Guarani: ‘come’ > marker encoding unexpectedness (Velázquez-Castillo 2004: 203) Uralic Mari: ‘go’ > marker encoding suddenness (Timofeeva 1959: 139)

59

chapter 3

Grammaticalization, typology and semantics: Expanding the agenda* Zygmunt Frajzyngier University of Colorado The paper aims at providing an expanded model of grammaticalization, where the term is understood as the process through which a language develops grammatical means of coding various formal, semantic or pragmatic functional domains. It thus subsumes the traditional scope of grammaticalization, but also covers the development of devices such as linear order, phonological means and the repetition of phrases or lexical items, among others. In this expanded scenario of grammaticalization, the paper addresses such key issues as the motivations for grammaticalization, the choice of coding means, and the consequences of grammaticalization for grammatical systems and for language change. The article also discusses the relevance of the principle of functional transparency, the principle of indirect means and the consequences of the initial state for both principles.

1. Introduction 1.1 Goal, scope and organization of the paper The goal of the present study is to sketch an expanded scope of grammaticalization which covers the emergence of all coding means, including the following: * Research for this work was supported by an award from the Charles and Jane Butcher Foundation and by the National Science Foundation grant Nr. 0439940 for the study of Wandala and Giziga. I am grateful to Eric Johnston for sharing his data on Mutsuvan and to Erin Shay for reading an earlier version of this paper and giving valuable comments. Henry Tourneux provided the correct identification for the plant Cyperus esculentus, which in Cameroon is often referred to by the French name chiendent, a term which in France designates a different plant. My thanks to María José López-Couso, whose sharp eye spared me an embarrassing error and to Marian Safran for editorial work. I am also grateful to an anonymous referee whose close and critical reading and constructive comments prompted me to revise the paper, tighten up the presentation and check additional references. Finally, this study would not have been possible without the speakers of Wandala, Mina, Hdi, Giziga, Gidar, Hona and Kafa who over the years were willing to sit with me and help me in the study of their languages. Any errors of fact or interpretation remain my sole responsibility.

62

Zygmunt Frajzyngier

a. All phonological devices. b. Lexical categories and sub-categories. c. Linear order. d. Repetition of phrases and lexical items. e. Systems of nominal classifications and classifiers. f. Inflectional and derivational morphology. g. Free grammatical morphemes. The traditional scope of grammaticalization covers only the development of grammatical morphemes from free lexical items, (g) above, touching also on (f), namely inflectional and derivational morphology. The overlap between the expanded and the traditional scope of the term grammaticalization is therefore limited to very narrow areas of the grammar of some languages. Moreover, the expanded scope raises questions which do not arise in the traditional scope, even with respect to free grammatical morphemes and inflectional and derivational morphology. Grammaticalization through the first four means on the above list will be discussed in the present study. Grammaticalization in the expanded sense also includes the emergence of semantic, pragmatic and discourse functions regardless of how these functions are coded within the grammatical system. Traditional studies of grammaticalization deal with the issue of function only as an outcome of the evolution of lexical items into grammatical morphemes. By contrast, grammaticalization in the expanded sense covers all processes through which the grammar comes to be. Theoretical questions regarding grammaticalization of means and functions should include the following issues: a. Motivations for grammaticalization. b. Choice of formal means in grammaticalization. c. Consequences of grammaticalization for the grammatical system. d. Consequences of grammaticalization for language change. There is no overlap between the theoretical questions of the traditional scope of grammaticalization and those of the expanded scope proposed here. The present study attempts to answer these four theoretical questions. Grammaticalization as conceived of herein therefore provides a powerful tool for explaining language and language change and for the discovery of functions (meanings) encoded in the lexical categories and in the grammatical system of the language. The study is organized as follows. In the Introduction I define the relevant terms and comment on the major concerns and criticisms of the traditional

Grammaticalization, typology and semantics

approach to grammaticalization. In Section 2 the expanded agenda for grammaticalization is described in detail. Section 3, in turn, presents the major theoretical issues and the potential outcomes of the expanded model; once questions are formulated, a brief discussion of the possible answers to each will follow. In Section 4, I examine the principle of functional transparency (a major motivating principle for grammaticalization), the principle of indirect means (responsible for grammaticalization in some domains) and the consequences of the initial state for both principles. Section 5 deals with the grammaticalization of selected coding means, namely phonological devices, linear order, repetition of phrases and the grammaticalization of lexical categories and sub-categories. A comparative study of the principle of functional transparency as applied to the coding of grammatical relations in various Chadic languages is given in Section 6. Section 7 raises the question of the choice of coding means in grammaticalization, while the importance of grammaticalization as a motivation for language change is discussed in Section 8. Finally, Section 9 points to the use of grammaticalization as a research tool in the study of meaning and Section 10 provides some concluding remarks.

1.2 Terminology The term grammar refers to the totality of coding means other than individual lexical items in a given language and to the functions coded by those means. The coding means thus include phonological devices, inflectional and derivational morphology, lexical categories and sub-categories, linear order, repetition of lexical categories, reduplication and all possible combinations thereof. These means, separately or in combination, interact in the coding of various functional domains. One of the characteristics of the interaction of various coding means is their complementarity in the sense that, if a given function is coded by one means or a combination of means within an utterance, it is not coded by another means or combination of means within the same utterance. Thus, if the future tense is coded by the auxiliary will in English, it is not coded in the same utterance by the structure be going to. Similarly, in Polish, the subject is coded through the agreement system on the verb; when a pronominal subject is used in addition to agreement, it codes a different function than agreement, usually the switch-reference function and contrastive focus (Frajzyngier 1997). The prevalent, but by no means unique, understanding of the term grammaticalization is confined to the development of grammatical morphemes from lexical items and constructions. This approach has its roots in the works of von Humboldt (1825), later Meillet (1912) and Kuryłowicz (1964) and more recently Samuels (1972), and was implemented with respect to a large number of grammat-

63

64

Zygmunt Frajzyngier

ical morphemes across languages by Lehmann (2002), Heine and his collaborators in a large number of well-known publications (cf., among others, Heine et al. 1991, 1993; Heine and Kuteva 2002) and a host of other scholars.1 The term grammaticalization is understood here as the evolutionary process whereby a language develops grammatical means to code various functional domains, whether formal, semantic or pragmatic. In this usage the label grammaticalization is in contrast with lexicalization, which refers to the development of various lexical means to code the same functions.2 This extended understanding of the term has its predecessors in Frajzyngier (1991, 1996), Giacalone Ramat and Hopper (1998) and Hopper and Traugott (2003). Grammaticalization so understood is also akin to Hopper’s (1987) notion of emergent grammar, which refers to the conception of the grammatical system as constantly undergoing change. The belief of constant change is not controversial, and the present study is very much in the spirit of Hopper’s emergent grammar, albeit without some of Hopper’s conclusions. Thus, for instance, the fact that some grammatical categories are emerging does not imply that there are no categories which have already been grammaticalized. The term grammaticalization also includes the development of functional domains and sub-domains as coded by the grammatical means.3

1.3 Major concerns of the lexical item to grammatical morpheme approach The purpose of the present section is to consider some theoretical questions which arise with use of the traditional approach to grammaticalization. The discussion below also addresses the issues raised by critics of the traditional approach, most fully exemplified in the work of Heine and his collaborators, although many other scholars (including this author) work or have worked within this approach. As recently as 2002, Heine and Kuteva defined grammaticalization as “the development from lexical to grammatical forms and from grammatical to even 1. For a concise but excellent history of grammaticalization studies in the traditional approach, see Hopper and Traugott (2003). 2. An anonymous reviewer asks whether this means “that every pattern of language structure fits under one or the other of these two notions.” At present I would say that it does, bearing in mind, however, that there are languages where the notion of lexical item may be difficult to establish and where lexicalization includes formation of complex structures from the available syntactic means. Such is the case in some Amerindian languages. The instance of ritualization and conventionalization in language as described in Haiman (1994) (pointed out by the anonymous reader) may also lie outside the lexicalization vs. grammaticalization dichotomy. 3. The notion of functional domains has been used by a number of scholars, including Givón (1990). For recent discussions, see Frajzyngier and Mycielski (1998) and Croft (2001).

Grammaticalization, typology and semantics

more grammatical forms” (2002: 2). They further stated that grammaticalization involves four interrelated mechanisms (Heine and Kuteva 2002: 2): a. Desemanticization or semantic bleaching. b. Extension, context generalization or use in new contexts. c. Decategorization or loss in morphosyntactic properties of lexical or other less grammaticalized forms. d. Erosion, phonetic reduction or loss in phonetic substance. Heine and Kuteva did not define their understanding of the term mechanism. The relevant ordinary English sense of the term4 is “instrument or process, physical or mental, by which something is done or comes into being” (American Heritage Dictionary 1980). Similarly, the ordinary French meaning of the term is a combination of organs or pieces destined to assure the functioning of a system or “the ways a system functions” (Larousse de poche 2005: 503). Similar understandings obtain in other languages where this term has been borrowed. However, Heine and Kuteva (2002) have explained the mechanisms of grammaticalization as the direct or indirect effects of grammaticalization on the lexical sources of grammatical morphemes rather than the means through which the grammaticalization is realized. A full description of the evolution of a lexical item or a construction into a grammatical morpheme must answer the following questions: a. What is the motivation for the grammaticalization? b. Under what conditions are lexical items rather than other means selected as the sources of grammaticalization? c. What determines the choice of one rather than another lexical item? Despite the very large number of traditional grammaticalization studies completed within the past 30 years devoted to the change from lexical item to grammatical morpheme, we have only partial answers to the last question. Heine and Kuteva (2002), for instance, is wholly devoted to this issue. Nevertheless, the first two questions remain open. The answers to the theoretical questions of the expanded model which will be provided in the present study will, however, also subsume answers to these two questions.

4. The term is not part of standard linguistic terminology as evidenced by the fact that it is not included in standard linguistic dictionaries, such as Matthews (1997).

65

66

Zygmunt Frajzyngier

1.4 Criticisms of the traditional approach to grammaticalization Another controversial issue in the traditional approach to grammaticalization has been that of unidirectionality. Grammaticalization from a lexical item to a grammatical morpheme is certainly more frequent than the process in the opposite direction. There is an explanation for this fact, one based on the lack of motivation for such a process in the latter direction: the shift from grammatical structure to a lexical item would, in most cases, involve the lexicalization of a function which is already coded in the language either by a specific construction or by lexical items which are components of the construction. The creation of another means of coding a function already coded in the language would be an unmotivated language change.5 This explanation for the statistical unidirectionality of the change from lexical item to grammatical morpheme is preferable to Haspelmath’s (2004) defence of unidirectionality in which he pointed out, among other things, that the development of lexical items from grammatical morphemes is non-systematic. But is not any lexicalization by means other than derivational morphology non-systematic? What is systematic about English whey or Migaama (East Chadic) kàmé ‘hunt mice’ (Frajzyngier 2004b)? The relatively rare attested cases of the change of a grammatical construction into a lexical item bear the same unsystematic characteristics as any other lexicalization. Consider, for example, the jocular but widespread druther ‘choice, preference’ in English, as in Will you let me know your druthers, from would rather. Frajzyngier (1996, 1997) showed that unidirectionality is theoretically unjustified and factually incorrect with respect to one grammatical morpheme changing into another. The notions more grammatical and less grammatical and more concrete and less concrete first used by Kuryłowicz with respect to lexical items, have no known definitions or diagnostic tools with respect to relationships among grammatical morphemes. Heine et al. (1991) saw communicative needs, problem solving and creativity as the main motivations for the employment of old means for new functions. Hopper and Traugott (2003) postulated speakers’ interactions, communicative processes and economy as among the motivations for grammaticalization. These motivations are, however, much too vague to provide predictions about grammatical structures and, as Hopper and Traugott noted, may imply the teleological nature of language. None of these motivations can be used as research tools, because every 5. An anonymous reader points out that the existence of synonymous constructions contradicts this statement. But do we really have synonymous constructions in languages, i.e. constructions that, in addition to any referential meaning, also have the same function within a clause or a sentence and in discourse, and are used by the same speakers? Claims of such synonymy should be proved rather than asserted.

Grammaticalization, typology and semantics

instantiation of language use is motivated by communicative needs, by speakers’ interaction or by some kind of communicative problem solving. Cliché and idiomatic expressions apart, every instantiation of language use is also an example of some type of creativity. Much of the previous work on grammaticalization has been vigorously criticized by Campbell (2001), Campbell and Janda (2001), Joseph (2001, 2004) and Newmeyer (2001). These scholars have pointed out that unidirectionality is not absolute, that there is grammaticalization without semantic bleaching and that phonological reduction is a result not of grammaticalization but rather of the frequency of use. They also raised objections related to the status of grammaticalization among linguistic sub-disciplines, objections which were partially addressed in Haspelmath (2004). However, all the criticisms of grammaticalization theory and practice pertained only to the lexical item to grammatical morpheme scope, even though studies of grammaticalization with a wider scope have been available for some time.

2. The content of the expanded model: Means and functions In the present section I discuss the expanded notion of grammaticalization as a process through which the grammatical means and the grammatical functions come about. The grammar of any language is the product of this process, a product which undergoes continuing change. The formal means resulting from grammaticalization include the following: a. Phonological means, such as the use of tone to code grammatical distinctions, intonation, pauses, vowel reduction and retention, and vowel and consonant harmony. b. Lexical categories and sub-categories (but not individual lexical items), such as — but not limited to — nouns, verbs, adjectives, auxiliary verbs, prepositions and postpositions, complementizers, determiners, subordinating particles and intensifiers. c. Linear order in all of its instantiations, i.e. with different lexical categories in its scope; for instance, use of word order to code questions, grammatical relations, aspects, information structure and head-modifier relations. d. Inflectional means on all lexical categories, for example on nouns, verbs, auxiliaries, adjectives, adverbs, adpositions, complementizers and determiners. e. Repetition of lexical items and phrases (as illustrated below). It is important to recognize that none of these means, with the exception of free

67

68

Zygmunt Frajzyngier

grammatical morphemes, inflection and derivation, has inherent or iconic semantic properties which make them particularly appropriate for the grammaticalization of any specific functional domain. Even the repetition of a lexical item or phrase, which is often taken for granted, does not need to have iconic properties. In many Chadic languages, repetition codes, among other things, perfect or perfective aspect, progressive aspect, adverbs and discourse continuity. Plurality of nouns or plurality of the event is about the only iconic function of repetition. Out of the five coding means listed above, only free grammatical morphemes and some inflectional and derivational morphemes may have lexical items as their source. For many languages, however, these morphemes represent a relatively small part of the grammatical system. It is this aspect of grammaticalization that has received the most attention, resulting in abundant lists of lexical sources of grammatical morphemes. Given the advances of knowledge in this type of grammaticalization, this issue will not be developed further in the present study. Traditional grammaticalization was often formulated in terms of the source and the target. This conceptualization is applicable only to the evolution from a lexical item into a grammatical morpheme, but it has no place when we are referring to the grammaticalization of formal means other than inflectional and free morphemes or the grammaticalization of functional domains. Grammaticalization also includes the emergence of means which do not carry a specific meaning by themselves but which contribute to the creation of other means used to code functional domains. To this group belong: a. Systems of nominal classification (including genders) to enable a system of reference (cf. Frajzyngier and Shay 2003). b. Systems of nominal classifiers to enable a system of individuation, of number coding and, in a few cases also, of reference across discourse.6 c. Phonological means, such as pauses, vowel reduction and retention. The study of the emergence of these coding means requires much work. For example, we know next to nothing about the grammaticalization of phonological means, although we have very substantial knowledge of the structure of phonological systems. An example of a possible question here is why some languages have grammaticalized vowel retention as a marker of phrasal boundary while others have not. Similarly, we know from the pioneering work of Kuryłowicz (1964), followed by Greenberg (1978) and subsequent studies by others, how grammatical gender emerges, but we know very little about the conditions under which a language develops gender systems. The crucial question here is why some languages 6. For a different interpretation of both of these systems, see Aikhenvald (2000).

Grammaticalization, typology and semantics

have developed a gender system while others have not. The answer must be that some languages have resolved the issue of the coding of reference by means which do not require gender. Another facet of grammaticalization is the emergence of functions. The grammaticalization of functions involves the coding of a function which hitherto might not have been available in the language or which might not have been coded by grammatical or lexical means. The latter case occurs when a given function is expressed by periphrastic means potentially different for different speakers. The types and number of functions constitute an open set and traditionally include various types of modalities, predications, semantic relations between predicates and noun phrases, relationships among noun phrases, pragmatic functions (including information structure), discourse functions (including backgrounding) and all kinds of relationships between the speaker and the listener. This is merely a selection of a few very general domains. Any non-aprioristic grammar will describe a host of functions which have not been noted in other languages or which do not fit within the known functional domains. The reason why I do not identify the grammaticalization of means with the grammaticalization of functions is that, once a means is grammaticalized, its function may change. Moreover, once a function has been grammaticalized, it can be coded by a number of means. Studies of language change provide ample evidence for the substitution of coding means and for the emergence of new formal means to code the same function, albeit often with functional extensions or reductions (compare gender coding through nominal endings in Latin and gender coding in French through articles). Furthermore, the grammaticalization of one function may in itself be the motivation for the grammaticalization of another function within the same domain. An example of such a scenario is the grammaticalization of the polite imperative after the grammaticalization of the imperative in several Chadic languages. All in all, the study of the grammaticalization of functions is in its infancy and awaits both theoretical and methodological developments. In this paper I shall illustrate grammaticalization for the remaining types of coding means (cf. Section 5) and concentrate on the following issues: the grammaticalization of lexical categories, linear-order constraints, phonological means and repetition, as factors motivating further grammaticalizations. A well-argued explanation of the coding means used for one function requires the explanation of other coding means used for other functions within the same domain, a requirement which by far exceeds the limitations of space in the present study. In addition, given that grammaticalizations from sources other than the lexicon have not received much attention, my discussion below contains more questions than answers.

69

70

Zygmunt Frajzyngier

Campbell (2001) challenged the legitimacy of using the term grammaticalization in the expanded sense advocated in the present study. His argument was that the term grammaticalization, when it applies to the whole grammar, overlaps with the terms grammar and emergent grammar. As a consequence, he stated that the terms grammar, emergent grammar or typology suffice (Campbell 2001: 155). Since Campbell did not define his understanding of the term grammar, his claim that other disciplines cover the same issues as grammaticalization remains quite vague. Behind Campbell’s criticism of the expanded scope of grammaticalization lies the claim that other sub-disciplines within linguistics, presumably descriptive grammar, historical linguistics and typology, deal with the expanded issues of grammaticalization adequately. However, very few descriptive grammars discuss the grammaticalization of either the coding means or the functions in the language under description. And even the grammars which do discuss grammaticalization cannot devote much space to this issue. Likewise, traditional historical linguistics could and should, but has not, addressed the fundamental questions of grammaticalization as conceived in the present study, namely the motivation for grammaticalization, the choice of grammatical means and the consequences of grammaticalization. One would look in vain for discussion of those issues in Campbell’s Historical Linguistics (2004), even though that second edition contains an expanded examination of grammaticalization in comparison with the first edition. Campbell’s objections to the use of the term grammaticalization in the expanded sense can be rejected on two grounds. The first is that no single theoretical approach, methodology, linguist or group of linguists has exclusive custody of the term grammaticalization. The second reason for ignoring Campbell’s objections is that the expanded scope of the term grammaticalization subsumes the traditional scope of grammaticalization as one of the means by which languages build a grammatical system, rather than the means of building a grammatical system. Moreover, typological linguistics does not usually address the diachrony of typological phenomena, the motivation for the existence of the coding means, the grammaticalization of meaning or the change of functions. The problem here is that there are too many potential causes for any single phenomenon. Descriptive grammars, historical linguistics and typological linguistics do provide, however, necessary data and tools for the study of grammaticalization. Grammaticalization with an expanded scope should become a fundamental part of historical linguistics. Traditional historical linguistics, that is, the study of language change, typological linguistics and grammaticalization together, combined with the study of neurological constraints, can explain why languages have the structures they do.

Grammaticalization, typology and semantics

3. Theoretical questions in the expanded model Within the expanded scope of grammaticalization there is one fundamental question about grammaticalization and two subsidiary questions. The fundamental question is the following: what are the motivations for grammaticalization? The traditional assumptions about motivation as listed earlier in this study are much too general to explain differences across languages. Within the expanded model, the question about motivation concerns specific forms and functions rather than the behaviour of language in general (cf. Section 4 below). The first subsidiary question relates to the choice of formal means for the grammaticalization of a given function. This issue can be divided into several questions: (i) is there any specific order in the choice of coding means, that is, do speakers explore some means first and explore others later on? (ii) is the choice determined by the functional domain to be coded? and (iii) what properties (if any) must a given means have in order to be used for the coding of a given function? For example, why are polar questions coded by word order in some languages and by dedicated interrogative particles in others? We know what the correlates of each means are, but we do not know enough about the choices in particular languages. The second subsidiary question is: what are the consequences of grammaticalization for language use and language change? Once a function has been grammaticalized, it forces different speech behaviours on the part of speakers. The speakers must henceforth encode the grammaticalized domain, as per Jakobson’s (1971: 264) observation that languages differ not in what they may express but in what they must express. Given the common human anatomic make-up, common cognitive apparatus, common living and cultural concerns, the question why some languages have grammaticalized functions which other languages have not may well be one of the most important and most difficult questions in grammaticalization theory. Many functions are grammaticalized opportunistically, as a by-product of the grammaticalization of other functions. Thus, in Gidar (Central Chadic) content questions about human participants in an event, equivalents of Who came?, can make a distinction between ‘who (feminine),’ ‘who (masculine)’ and ‘who (plural)’ because question words code gender and number distinctions in this language (Frajzyngier 2008). But is this an explanation for the grammaticalization of all functions? Some logicians and even some linguists (for example, Wierzbicka 1996) argue that there exists a set of semantic universals which must be expressed in every language. Even if that were true (and there is no evidence that it is), what about the many functions which are not universal? These theoretical questions are taken up in the ensuing parts of this paper.

71

72

Zygmunt Frajzyngier

Another important question in grammaticalization is concerned with the conditions under which a given formal means acquires a new function. One of the conditions under which such a process might occur is that the new function constitutes a metaphorical or metonymic extension of the function already realized by the formal means, as amply documented by Heine’s work with respect to lexical categories (cf. Heine et al. 1991, Heine and Kuteva 2002). Change in the functions of linear order, from the coding of the pragmatic function topic to the coding of the grammatical function subject, as described by Shibatani (1991), would fall into this category. But under what conditions would a formal means which has been grammaticalized for one function carry an unrelated function? Are there any such cases? This is an open question which merits further investigation.

4. The motivations for grammaticalization In this section I shall concentrate on three factors which play a role in grammaticalization, though it is very likely that they are not the only motivations involved: (i) the principle of functional transparency (Frajzyngier and Shay 2003; Frajzyngier 2004a) (cf. Section 4.1 below); (ii) pragmatic motivations, including the principle of indirect means (Frajzyngier and Jirsa 2005), which play an important role in the continuing regrammaticalization of a large number of functions which have already been grammaticalized (cf. Section 4.2); and (iii) the initial state of the language at any given time (cf. Section 4.3).

4.1 The principle of functional transparency One of the fundamental principles driving grammaticalization is that of functional transparency, which states that the function of every form in an utterance must be transparent to the hearer. Transparency refers here to the functional domains coded in a given language, rather than to the hearers’ need to understand the role of an element in reality. The referential meaning of an utterance may be, and often is, completely obscure to the hearer, but, if the role of every element in the utterance is marked through the coding means available in a given language, that utterance satisfies the principle of functional transparency, which has in its scope every element in the utterance.7 7. An anonymous reader questions whether the principle of functional transparency is the same as the principle of anti-ambiguity. Ambiguity is widely attested in languages, either as a conscious means used in conversations and literature or as an accidental or necessary outcome of various phonological, morphological and syntactic constraints. Therefore, the principle of functional transparency is not a principle of anti-ambiguity.

Grammaticalization, typology and semantics

Fundamentally, the construction of any utterance is ruled by the principle of functional transparency, whereby the role of every element in the utterance must be transparent to the hearer. The principle can be assured either by the inherent properties of the lexical items used or by external means, which would mark the role of a given item. For example, the principle of functional transparency for English predicts, among other things, the use of prepositions. Every noun phrase other than the subject or object must be marked for its role. One of the means of marking the role of the noun phrase is through the use of a preposition. Consider the ungrammatical utterance in (1a).

(1) a. *You are not connected Internet

The utterance is ungrammatical because the English hearer/reader does not know the role of the noun Internet. The addition of a preposition and a definite article, as in (1b), makes the role of the noun transparent, even for people who do not know its referential meaning.

(1) b. You are not connected to the Internet

Similarly, the addition of a graphic coding means, a period, as in (1c), makes the role of the noun Internet transparent (presumably as the institution which signed the message).

(1) c. You are not connected. Internet

Compliance with the principle of functional transparency is responsible for the grammaticality of Chomsky’s Colorless green ideas sleep furiously (Chomsky 1957: 15), because the role of every element in the utterance is transparent through the use of morphological means, namely plural coding on the noun, suffix -ly coding the adverb, suffix -less coding the modifier, and through the use of configuration, green before ideas as a modifier and affixless form sleep after ideas.8

4.2 The principle of indirect means The principle of indirect means states that, whenever the goals of speech involve certain functional domains, indirect rather than direct means of expression are used. Direct means of expression are those which grammatically or lexically encode a given function without dependence on contextual knowledge. In turn, indirect means of expression are those that, though coding function A within the grammatical system of the language, are deployed for the coding of function B, 8. For a full discussion of the operation of the system of functional transparency, see Frajzyngier and Shay (2003) and Frajzyngier (2004a).

73

74

Zygmunt Frajzyngier

for which the language has already developed other coding means. The functional domains involved here concern speaker-hearer relations: at the level of grammar, the principle of indirect means operates in the domain of interpersonal relations, such as deontic and epistemic modality, negation, content and polar interrogatives, reference systems and forms of address. The principle of indirect means starts operating after some means have been grammaticalized to code a given function and have thus become the direct means of coding. Languages vary with respect to the scope of the principle of indirect means, and it is theoretically possible that there are languages in which this principle does not operate. Moreover, within the same language, various social groups may have different norms with respect to the scope of the principle. Thus, in many languages, interrogative constructions are used to code deontic modality (for example, English Can you pass the salt, please?). The principle of indirect means is responsible for the abundance of formal coding means in the domains which are in its scope. Thus, in many languages there are a large number of means to code deontic modality, for instance modal auxiliaries, inflectional coding on the verb or modal adverbs, in comparison to the means to code tense, a functional domain which is usually not in the scope of the principle of indirect means. In English, for instance, the means of coding deontic modality include the auxiliaries must, need to, have to, ought to, should, shall, will, etc., whereas the past tense is coded by inflectional coding on the verb.9

4.3 The initial state The principle of functional transparency and the principle of indirect means are intimately connected with the initial state of the language. The term initial state does not refer to a state in a protolanguage, but to the state of the language at any given time with respect to two areas: (i) the coding of some functional domain and (ii) the availability of coding means. A coding means can be conceived of as a formal niche. If a given formal niche is not used to code a function, it is available as a coding means. By contrast, if such a formal niche is already occupied by a function, it is not available as a coding means for another function within the same formal or functional domain (Frajzyngier et al. 2002). The functional properties of the initial state pertain to the functional domains and sub-domains which are coded at any given time by all the coding means available, that is, the lexicon, phonological means, linear order, inflectional and derivational means and lexical repetition. One of the important features of formal means, neglected in the study of grammaticalization so far, is that the function9. For a full discussion see Frajzyngier and Jirsa (2005).

Grammaticalization, typology and semantics

al properties of lexical items having the same reference may differ substantially across languages. Thus, for example, the fact that in English most singular nouns must be preceded by an article, but not in Polish, may be due to the inherent properties of nouns in the two languages.10 The motivations for grammaticalization, namely the principle of functional transparency, the principle of indirect means and economy in the sense of coding the same function over many instantiations in discourse, all take place under the following conditions: a. The initial state with respect to the functional domains coded. b. The initial state with respect to the formal niches already occupied. c. The degree of stability with respect to the functions coded. The information about the initial state must also include information as to whether that state is stable or unstable.11 An unstable system is one in which given functional distinctions are neutralized in significantly extensive environments. An example of such neutralization would be an increased lack of distinction between subject and object and the neutralization of person and number distinctions resulting from the collapse of the inflectional coding on nouns, articles, adjectives and verbs in English, which, in turn, was provoked by rather simple phonological changes. Once a form has been grammaticalized, it may create an unstable system in some parts of the grammar. Thus, for example, the grammaticalization of linear order as a coding means for grammatical relations largely constrains the use of linear order for the coding of information structure and other pragmatic functions.

5. The grammaticalization of means 5.1 The grammaticalization of phonological means We know how the systems of underlying segments change: by either the addition or the reduction of segments. We also know how their realizations change and what the causes of these changes are, for instance, changes in stress, tone, intonation patterns and borrowings. Likewise, we have some very precise hypotheses concerning the mechanisms of those changes (cf. Blevins 2005). However, the fundamental questions as to why languages differ with respect to syllable structures, 10. On properties of nouns, cf. also Rijkhoff (2002: 50–59). 11. For one understanding of the notion stable, cf. Samuels (1972).

75

76

Zygmunt Frajzyngier

phonotactics rules, vowel and consonant harmony and rhythmic structure of the utterance have yet to be explained. These constraints and rules are relevant for the understanding of larger structures, hence, for grammaticalization theory. Thus, for example, there exist phonological rules whose role is to indicate whether a given constituent is part of a larger construction. An illustration of this type of rule is the phonological reduction responsible for the formation of the construct state in Semitic and other Afroasiatic nouns, as shown in examples (2a–b) from Hebrew.

(2) Hebrew a. anfey ha-ec branches the-tree b. ha-anafim shel ha-ec the-branches of the-tree ‘the tree’s branches’ (cited in Heller 2002)

There also exist rules marking phrasal boundaries. An illustration of such rules is the phrase-internal vowel reduction and phrase-final vowel retention in Chadic. In some Chadic languages these rules have only morpheme-final vowels in their scope, whereas in other languages they also have morpheme-internal vowels in their scope. In example (3) from Mina (Central Chadic), the noun trá ‘month’ is reduced to tr, and a high central vowel is subsequently inserted for syllabification, to indicate that it constitutes part of the construction with the next item.

(3) Mina tr láy t mìtš month time gen hunger ‘the year of the hunger’ (Frajzyngier and Johnston 2005)

In Giziga (Central Chadic), internal vowels are reduced in the phrase-internal position, as shown in (4) and (5).

(4) Giziga náŋ á ɓ á ɬkàɗ, á ɬkɗ tàtìi tâ. 3.ipron 3 irr 3 get.up 3 get.up intens neg ‘She tried to get up, but she could not get up at all.’

(5) Giziga à rô á lm-á mándâf mándf à ɓ-áa màdàŋɡòs: ‘k kl à má kâ? 3 go 3 find-3 hare hare 3sg say old.woman 2 run prep what q ‘She went and found the hare. The hare said to the old woman, “Where are you running?”’ (Shay, in preparation; Frajzyngier’s field notes)

The grammaticalization of word-final and word-internal vowel reduction probably follows a path similar to the grammaticalization of apophony. First, the

Grammaticalization, typology and semantics

word-final vowels are reduced when the word is followed by another word within the same phrase. Words which do not have a final vowel undergo internal vowel reduction. Then, the phrase-final vowel retention acquires a function of its own, namely that of indicating phrasal boundary. The grammaticalization of the vowel retention as a grammatical marker at this stage is complete. The grammaticalization of tone as a coding means is another instance not motivated by any inherent semantic properties of the coding means. In Hdi (Central Chadic) the functions of the prepositions da and ta depend on the tone which these markers have: tà ‘stative locative preposition’ vs. tá ‘second argument marker;’ dà ‘directional locative proposition with non-toponyms’ vs. dá ‘directional locative preposition with toponyms and purpose subordinator.’ Compare (6a–b) with (7a–b), where the complements of the preposition da are not toponyms and, therefore, the preposition has low tone.

(6) Hdi a. lá-b-l-ì dá xdì. go-out-go-1sg prep Hdi ‘I went to Hdi.’ [place name] b. lá-b-là dá bàtà mákwà. go-out-go prep woo girl ‘He went to woo a girl.’

(7) Hdi a. láb-lì dà màkàràntà. go-out-go prep school ‘I went to school.’ b. lá-b-lì dà Mbitsa. go-out-go prep Mbitsa ‘I went to Mbitsa.’ [proper name]

Such tone distinctions are exploited to code semantic functions, and their grammaticalization is not linked to any specific semantic function to be grammaticalized or to any inherent semantic properties of high or low tone.

5.2 The grammaticalization of linear order Linear order is a readily available means to code a relationship between any two entities, whether they belong to the same or to different lexical categories. When linear order involves lexical items belonging to the same category, one function is assigned to the first element and another to the second element. This is the situation when one noun modifies another in English: one noun is the head and the other is the modifier. On the other hand, when linear order is applied to different

77

78

Zygmunt Frajzyngier

lexical categories, order codes the function of only one element, the function of the other being that of the reference point, determined probably by its categoriality. Thus, in a VS language, the function of the noun which serves as the subject is coded by its position after the verb. By contrast, the function of the verb as the predicate is not coded by its position before the subject but by some other means, presumably morphological devices. Linear order as a sole coding means inherently has very limited possibilities. It requires a point of reference (for a discussion, see Frajzyngier and Shay 2003). With lexical items belonging to the same categories, linear order can code one functional sub-domain, for example modification, when the modification relates to two nouns. When it concerns the relationship between the verb and its arguments, linear order can code at most two functions, one by the position preceding the verb and another by the position following the verb. One of the open questions with respect to grammaticalization is why some languages are verb-initial, some are verb-final, and some are verb-medial. Given the implications of the position of the verb for the other elements of the grammar, this remains one of the key questions for the understanding of grammatical systems.

5.3 Repetition Repetition as a coding means is different from reduplication, which involves contiguous repetition of a lexical item or part of a lexical item, since repetition does not need to be contiguous. It may have various lexical categories in its scope, that is, it may involve the repetition of nouns, pronouns, verbs and adjectives, as well as of whole phrases. Across languages, the repetition of a lexical item may be used to code different functions. Contrary to frequent tacit assumptions, the repetition of lexical items is not necessarily an iconic means. Since this category has not been isolated as an independent coding means so far, the present study seeks to illustrate the mechanisms and functions of repetition and, in one case, a motivation for its existence. In Wandala (Central Chadic, spoken in Northern Cameroon and North Eastern Nigeria), a pronoun in the subject function is deployed twice in the sentence, once before the verb and once after the verb. Consider in this connection example (8) below. The function of this repetition is not yet clear, but the same means is used to code focus on a pronominal subject.

(8) Wandala šóì ŋánnà kà cìn-ná-kà story dem 2sg listen-3sg-2sg ‘the story that you heard’ (Frajzyngier’s field notes)

Grammaticalization, typology and semantics

In Gidar (Central Chadic, Northern Cameroon), the verb is repeated at the beginning of a clause, sometimes with its object, sometimes without it, to code focus on the predicate, as in (9).

(9) Gidar ílèŋ à-dàw lèŋé-ŋ káká. sleep 3m-d.prog sleep-pl only ‘They are asleep.’

In this language the answer to the question ‘what did he do with this thing [visible]’ is (10), whereas non-focussed expressions show no repetition of the verb, as in (11). (10) Gidar ddò à-dó-nì. reheat 3m-reheat-3m ‘He reheated it.’ (11) Gidar à-dóhò-n-n-kò. 3m-reheat-3m-pl-pfv ‘They reheated it.’ (Frajzyngier 2008) Cognate objects, which require a formation of a verbal noun in a number of languages (cf. Goldenberg 1971), most likely represent the same type of grammaticalization, albeit used for different purposes. Another (different) function of the repetition of a lexical item occurs in Kafa (Omotic). Cognate objects are derived from verbs and thus represent an instantiation of a repetition. Transitive verbs in this language appear to require an object. If no object occurs, a cognate object is used instead (cf. example (12) below). (12) Kafa bóonò dóocò ùcyétè bóonò búunò ùcyétè bóonò máyò they beer drink.3pl they coffee drink.3pl they food mà-hé-tè ‘ínòcé šùnòc’í mmó nà yùbbò yùbbè ‘í eat-prs-3pl when work finish.give ass choir singing šòddè ‘ì nòcé. dùubó dùbb ‘í šòddò sing sing responsive singing while ‘They drink beer, drink coffee, eat the food. When the work is finished, they sing (choir-like style), sing, sing in a responsive style.’ (Frajzyngier’s field notes)

79

80

Zygmunt Frajzyngier

5.4 The grammaticalization of lexical categories The grammaticalization of the major lexical categories, nouns, verbs, adverbs and adjectives, has largely been neglected in grammaticalization studies. However, whether a language has a given lexical category or not may determine the formal characteristics of clauses and sentences in that language. In contrast to the so-called major lexical categories, the grammaticalization of adpositions, complementizers and subordinating particles has received ample attention and is relatively well understood as far as the sources of these morphemes are concerned. This section sketches only how the grammaticalization of the major lexical categories might be studied and provides a principled answer as to why certain languages have some categories rather than others. Once the grammaticalization of lexical categories is understood, we can place in proper perspective the grammaticalization of derivational morphemes, another area which has been relatively well developed in grammaticalization studies so far.

5.4.1 The grammaticalization of nouns, verbs and adjectives As demonstrated in Frajzyngier and Shay (2003), the existence of the lexical categories verb and noun is motivated by propositional predication, which requires the identification of a predicate and of arguments. However, this general requirement can be satisfied in a number of ways, not necessarily through the creation of lexical categories. One of these ways is the use of free grammatical morphemes coding the categories tense and aspect, whose presence in the utterance indicates that an accompanying lexical item is a predicate, or, in Southeast Asian languages, the existence of classifiers, whose presence in the utterance indicates that the accompanying lexical item is an argument. A modifying function can be coded by the lexical category adjective, but it can also be coded by constructions akin to relative clauses and by a number of other means (Frajzyngier and Shay 2003). We can only speculate at this stage about the mechanisms for the grammaticalization of lexical categories. One such process would be as follows. A lexical item undifferentiated for a lexical category would be accompanied by other items, such as determiners, pronouns, time adverbials and quantifiers, to code those functions, thereby identifying the lexical item as a predicate or an argument. These lexical items would eventually fuse with the items which they were modifying, thus creating the categories of gender-marked nouns, as per the scenario conceptualized by Kuryłowicz (1964) and elaborated by Greenberg (1978), and tense or aspect coding on verbs. Mandarin would represent a language where individual lexical items do not have an inherent categoriality and where predicate and argument functions are coded by isolated grammatical morphemes. I will not pursue this issue in

Grammaticalization, typology and semantics

depth in the present study. Rather I would like to concentrate on the grammaticalization of a lexical category which has not been discussed in the literature other than in my own work on Mina, namely the locative predicator.

5.4.2 The grammaticalization of the locative predicator The grammaticalization of the lexical category locative predicator is interesting for a number of reasons. First, we can describe, document and prove how this lexical category is grammaticalized. We can also provide a motivation for this grammaticalization. Moreover, based on this grammaticalization, we can also offer a finegrained semantic analysis of nouns and verbs and provide the evidence for the proposed sub-categories of nouns and verbs. Mina, a Central Chadic language, possesses the category locative predication, which can be either stative (presence of an entity or an activity at some place) or dynamic (movement toward or from a place). The form of locative predication in Mina depends on the semantic feature LOCATIVE in the predicate and in the intended locative complement. This semantic feature is present in directional verbs of movement and in the verb meaning ‘be in place.’ In nouns it is present in toponyms and in a few other items, such as those meaning ‘room,’ ‘hut,’ ‘home’ and ‘compound.’ When both the predicate and the locative complement have the semantic feature LOCATIVE, locative predication consists of the mere juxtaposition of the predicate and the locative noun. For example, the verb tsú ‘go to’ is inherently locative, as is the noun dámù ‘bush.’ Consequently, no additional predicators or prepositions are used in locative predication, as shown in (13).12 (13) Mina séy m nɡùl nɡùl tìy á tìy-ú wàl tsú z dámù. so rel husband husband see 3sg see-3sg wife went ee bush ‘So the husband saw that the wife went to the bush.’ By contrast, if the complement is not inherently locative, locative predication requires the use of the preposition n, as in example (14), where the preposition n precedes the noun hìd ‘man.’ Note also that in this instance the verb táŋ ‘return’ in the first part of the example is inherently locative and, hence, no additional predicator is needed; similarly, the noun wùtá ‘house’ is also inherently locative and, hence, no preposition is present either.13 12. All data are from natural discourse, as presented in Frajzyngier and Shay (2003) and Fraj zyngier et al. (2005). 13. As pointed out by an anonymous reviewer, the lack of a preposition with locative verbs and noun phrases is also witnessed in English: go home, go West, go right vs. go to my brother.

81

82

Zygmunt Frajzyngier

(14) Mina tséy hìdì wàcíŋ táŋ z wùtá à ɮá á n médìg so man dem return ee house 3sg say pred prep neighbour ngŋ wàcíŋ ngámbù há k déɓ-é-ŋ dál n 3sg dem friend 2sg inf bring-3sg-go money prep hìd wà dál vànú. man dem money how.much ‘When the man came back to the house, he said to his neighbour, “Friend, you brought money to this man. How much money?”’ The absence of the feature LOCATIVE in nouns constitutes the motivation for the emergence of the locative preposition, whose function is to code the grammatical role of the noun phrase rather than a spatial configuration, a function which is coded by other markers. When the predicate is not inherently locative, locative predication requires the use of the locative predicator a. The verb ɗáhà ‘exist’ in example (15) below is not inherently locative; hence, the locative predicator a is required in locative predication. By contrast, the complement noun bíŋ ‘house’ is inherently locative, so that no locative preposition is needed. (15) Mina háɮm ɗáhà á bíŋ ngŋ. girl exist pred house 3sg ‘There is a girl at her house.’ Similarly, in (16) the verb déɓ ‘take’ is not inherently locative, while the noun ìdá ‘home’ is. Hence, locative predication requires only the predicator á. (16) Mina séy báy déɓ z ngn k á ìdá. then chief take ee 3sg pvsbj pred home ‘Then the chief took the calabash home.’ In those cases in which neither the predicate nor the complement is inherently locative, locative predication requires both the locative predicator a and the locative preposition n. Consider example (17). (17) Mina séy hìdì m s tápá ɗáhà á n mŋ màcíŋ. so man rel drink tobacco exist pred prep ana dem ‘So there is a smoker among them.’

Grammaticalization, typology and semantics

The analysis of locative predication in Mina provided in the preceding paragraphs has the following implication for the general theory of language and for grammaticalization theory. The grammaticalization of lexical categories and grammatical markers, in this case the locative predicator a and the locative preposition n, is motivated by the principle of functional transparency, as discussed in Section 6.

6. A case study: The principle of functional transparency and grammatical relations In this section, the principle of functional transparency (cf. Section 4.1 above) is illustrated with the relative clause construction in Wandala, a Central Chadic language. This discussion also serves to exemplify the grammaticalization of the repetition of a lexical item as described in Section 5.3 above.

6.1 Relative clauses in Wandala Wandala clauses have subject pronouns obligatorily preceding the verb and nominal subjects or nominal objects following the verb. Moreover, relative clauses are preceded by their heads. The relativization of the subject has the form: noun phrase (+demonstrative) + a (+subject pronoun) + verb (+noun phrase object), as illustrated in (18). (18) Wandala á m gdz ŋánnà à ɓlà láx nà sà-wà. pred place child dem 3sg sing song def arrive-vent ‘Where is this child who sings the song, let him come!’ The relativization of the object, in turn, has the form: noun phrase (+demonstrative) + subject pronoun + verb + object pronoun + subject pronoun. In other words, the subject pronoun both precedes and follows the verb, as in (19).14 (19) Wandala názùyèkátáŋdàkàyà vátnwá názù yè kàtà ndà-ká-yà vá-t-n wá […] thing 1sg want say-2sg-1sg time-prep-def comp ‘what I would like to tell you today is […]’ 14. The first line in examples (19) and (20) provides a broad phonetic transcription with spaces indicating short pauses, while the second line provides the underlying morphemes. The same applies to examples (28)-(31) and (44)-(45) below.

83

84

Zygmunt Frajzyngier

If the subject of a relative clause with a relativized object is nominal rather than pronominal, the third person subject pronoun precedes the verb and the nominal subject follows the verb. An example is given under (20). (20) Wandala názù àfámyàrmàlámì nákátáfká mínà názù à fá-myà-r màl-á-mínà kà-t fk-á thing 3sg put-1pl.incl-on boss-gen-1pl.incl prep-prep front-gen mínà [. . .] 1pl.incl ‘that what has been asked by our boss who is in front of us […]’ Although the position of the nominal subject after the verb is the default position for this category, the position of the pronominal subject after the verb is not, at least not for the great majority of verbs. Therefore, the position of subject pronouns after the verb in relative clauses needs to be explained. In my view, the function of the subject pronouns occurring postverbally is to indicate that the noun which is the head of the relative clause is not interpreted as the subject of the clause. Such an explanation follows from the principle of functional transparency. Note that, if the head of the relative clause is the subject, as in (18) above, the subject pronoun is not repeated. Interestingly, the principle of functional transparency is not universally required with respect to relative clauses, as there are languages, for example, some Semitic languages, where relative clauses are ambiguous with respect to the grammatical role of their heads (cf. Frajzyngier and Shay 2003, with reference to Goldenberg, personal communication). Why the distinction between the subject and the object may be neutralized in relative clauses across languages merits further exploration. Tentatively, I propose that the relative clause is merely a clausal comment on the head. In many languages the head does not need to be an argument of the relative clause and, therefore, the question about the role of the head of the relative clause is not within the scope of the principle of functional transparency. So, even if the head happens to be an argument of the relative clause, the role of such argument is not overtly coded.

6.2 Coding grammatical relations in a simple clause 6.2.1 Word order and the coding of grammatical relations The discussion of coding grammatical relations in a simple clause offered in the present section aims to examine the interaction of the characteristics of the initial state, the principle of functional transparency and the choice of grammatical means. In several Central Chadic languages the verb occupies the clause-initial

Grammaticalization, typology and semantics

position. This position of the verb constrains the coding of the role of the argument by position to one argument only: V NP. The argument occupying such a position can be associated with one grammatical relation, for example, subject or object, categories coded in many Chadic languages, or with one semantic role, categories also coded in some languages with respect to the second argument. The presence of the verb in the clause-initial position therefore posits the following questions: a. How is a second argument added to a proposition? b. How is the grammatical role of the single argument of the transitive verb coded? This is not a trivial question, because the single argument may be the outcome of the omission of another argument, either the subject or the object. c. If a verb can occur with two arguments, what happens if one of the arguments is fronted? How is its role coded and how is the role of the remaining argument coded? Questions (b) and (c) are likely to have the same answers in each individual language. All of these questions are issues for grammaticalization, that is, for the development of coding means for various functions. Chadic languages with the verb in the initial position display a variety of strategies for the coding of an additional argument. Each strategy is an instantiation of a different grammaticalization and that is why they deserve to be discussed in this study. Moreover, Chadic languages have also developed different strategies for the coding of the roles of the single argument in cases of omission and fronting. Significantly, none of these strategies involves a process from lexical item to grammatical morpheme. In what follows, the coding of grammatical relations is discussed for four languages showing an initial state with verb-initial position: Hona (spoken in Nigeria, data from Frajzyngier’s field notes and Jordan’s field notes), Hdi (Northern Cameroon, Frajzyngier 2002), Wandala (North Eastern Nigeria and Northern Cameroon, Frajzyngier’s field notes) and Mutsuvan (Northern Cameroon, Johnston, personal communication, from his field notes). Each of these languages has solved the issue of adding one more argument to the clause in a different way, despite the fact that they are genetically related, belong to the same branch within the family and are spoken within the same geographical area. Crucially, however, none of these languages is an immediate neighbour of any of the other three. Hdi is verb-initial in all aspects in pragmatically neutral clauses, the same as Mutsuvan. Wandala, in turn, has pronominal subjects preceding the verb, but nominal arguments following the verb. Finally, Hona is verb-initial in the perfective aspect and verb-medial in the imperfective aspect. Assuming verb-initial to be the initial state, the four languages have grammaticalized different strategies for the coding of one argument and distinguishing

85

86

Zygmunt Frajzyngier

V-INITIAL 1 NP Wandala (PRO[S}) V NP V-PRO-V NP

Hdi V NP

Hona V NP

Mutsuvan V NP

2 NPs: ∅

V NP tá NP

V NP NP

V NP NP

Figure 1. Coding of arguments in four Chadic languages

etween the subject and the object functions of two noun phrases. These stratb egies are summarized in Figure 1. Both Hdi and Wandala have reduplicated forms of the verb in pragmatically independent clauses. In Wandala, the subject pronoun is inserted between the reduplicated forms, whereas in Hdi the subject pronoun is suffixed to the reduplicated form of the verb. As seen in Figure 1, with intransitive verbs, Hdi, Hona and Mutsuvan have the structure V NP. Moreover, there are no structural distinctions between Hona and Mutsuvan at the two-nominal-argument configurations. The difference between the two languages emerges in other structures, namely those involving either the omission or the fronting of one of the arguments. What is interesting here is whether there is a principled reason which explains why the four genetically related languages, starting from what appears to be the same initial state, have grammaticalized different strategies for the coding of subject and object. In what follows I describe the inclusion of two arguments (cf. Section 6.2.2), the grammatical role of the single argument (cf. Section 6.2.3) and the coding of the roles when one of the arguments is fronted (cf. Section 6.2.4). Given that the four languages under consideration are verb-initial, placement of an argument in the clause-final position does not exist as a coding means.

6.2.2 Two arguments in a pragmatically neutral clause The simplest means to code a relationship between two elements is juxtaposition. This device is deployed in Hona and in Mutsuvan. In Hona, the nominal subject follows the verb and the nominal object follows the subject, the verb does not have any pronominal arguments and prepositions are not used to code grammatical relations. Consider (21).15 15. All examples in Hona are elicited.

Grammaticalization, typology and semantics

(21) Hona fí nú-nà ɬú-wà. bake wife-1sg meat-indf ‘My wife baked meat.’ In turn, as illustrated in (22), pronominal subjects are suffixed to the verb and the nominal object follows the verbal complex. The noun phrase which follows the verb is interpreted as the object because the subject is coded by the suffix on the verb. (22) Hona kyàtlá-ndà dèfí-nà-d pat. cut-3pl tree-def-all all ‘They cut all the trees.’ Pronominal objects can also be suffixed to the verb. The third person singular subject pronoun is formally different from the third person singular object pronoun. If the clause has a nominal object, as in (23), the pronominal object suffix codes the definiteness of the object. (23) Hona ɬe-n Ali hur-ghen. slaughter-obj Ali neck-3sg ‘Ali slaughtered it.’ (lit. ‘Ali cut its neck.’) In Mutsuvan, if two arguments are present in a pragmatically neutral clause, the first argument is the subject, whereas the second one is the object. Thus, the sole coding means in clauses of this kind is linear order, as seen in (24).16 (24) Mutsuvan mà ʔt hdá ndé. pst bite dog man ‘The dog bit the man.’ The question remains why the simple juxtaposition used in Hona and Mutsuvan is not deployed in Hdi and Wandala. Each of these languages presents a different case and will be discussed separately in the following paragraphs. The means used for coding grammatical relations in Hdi depend on the aspect, the information structure and other pragmatic functions coded in the clause. 16. All data from Eric Johnston (personal communication); analyses and interpretation by this author.

87

88

Zygmunt Frajzyngier

There are two aspectual systems, one coding pragmatically dependent and the other pragmatically independent clauses. Furthermore, pronominal subjects are suffixed to the simple or to the reduplicated form of the verb. The third person singular subject is unmarked in the perfective and, in some types of clauses, is marked in the imperfective. If there is a nominal subject, no pronominal subject is added to the verb. Pronominal objects, in turn, are infixed between the reduplicated form of the verb. The third person pronominal direct object for inanimate nouns is unmarked. By contrast, the nominal object can be marked by the position following the verb if the subject precedes the verb and is marked for its role. This is the case in sequential clauses, as illustrated in (25) below. (25) Hdi ɗg-áy-tán tá ɗgú yá mbàɗ ká pákáwá ghúvì kà thresh-pobj-3pl obj threshing dem then comp hyena seq klà-á-tá vàrà. take-prtv-ref beans ‘While they were threshing, Hyena took some beans.’ If the nominal subject follows the verb, the nominal object, whether direct or indirect, is marked by the preposition tá, derived most likely from the locative stative preposition tà through tone-raising. Consider (26a–b). (26) Hdi a. ngá dà-gá-ghà-tà índà grá-xà-ní tá ghzú. norm cook-inn-d.pvg-ref.sbjv all friend-pl-3sg obj beer ‘All of his friends should cook beer and bring it there.’ b. kà zl-í-n-tá tsá m ` ghám yá tá tsá màrkw-á-tàn yá. seq chase.away-3-ref def chief dem obj def wife-gen-3pl dem ‘The chief chased away his wife.’ On the other hand, pronominal subjects are suffixed to the verb and the second argument which follows the verbal complex is marked by the preposition tá (cf. (27) below), just as in those cases in which the nominal object follows the nominal subject (cf. (26a–b)). (27) Hdi skwá-ghá-skwá tá skw-à z-áy. buy-2sg-buy obj thing-gen eat-pobj ‘She bought you something to eat.’ (Frajzyngier with Shay 2002) The reason why Hdi codes the second argument by means of the preposition is that the role of the second noun phrase would otherwise not be interpretable as an argument of the verb. Thus, if the nominal argument after the verbal complex were

Grammaticalization, typology and semantics

to be unmarked for its role, it could be interpreted as the subject. The grammaticalization of the preposition to code the object is, therefore, motivated by the principle of functional transparency. In the case of Wandala the reason why two arguments cannot occur after the verb (cf. Figure 1 above) has to do with the fact that the role of the single argument following the verb is unmarked and must be calculated from the coding of the other elements in the clause. The verb s ‘arrive, come’ can be intransitive or transitive and, therefore, the nominal argument following it can be the intransitive subject, the transitive subject or the object, depending on the inflectional coding on the verb. In example (28) below the nominal argument following the verb is the subject. (28) Wandala émlmàɮárà sàrə`msámdnà é màɮárà sá-r-m-s-à md ná. well now arrive-3pl-in-arrive-pvsbj people def ‘And now, those people came.’ On the contrary, the nominal argument following the verb is interpreted as the object if the pronominal subject has a different number than the nominal argument, as happens in (29). (29) Wandala tàdssá dàdámúvgè tà d s-s-á dàdà á-m úvgè. 3pl seq arrive-arrive-go father pred-in grave ‘And they raised the father from the grave.’ Placing two nominal arguments after the verb would leave the clause uninterpretable with respect to the distinction between the two arguments. One means of including two arguments in the proposition is to use one clause with an intransitive verb and a nominal subject and then refer to that subject by means of a pronoun in the subsequent clause with another nominal argument, as in (30) below.17 The verb tsá ‘get up’ and equivalents of the verbs ‘go’ and ‘come’ are used in the introductory clauses. (30) Wandala tátsámdá žílé tá tsá md-á žílé 3pl get.up people-gen man 17. All data from natural discourse.

89

90

Zygmunt Frajzyngier

tásbɗyà mdámùksé tá s bɗyá md-á mùksé. 3pl arrive meet people-gen woman ‘People of the groom get up. They come to find the people of the bride.’ Another means to include two arguments in the proposition is the fronting of one of them, mainly for topicalization. In these cases, the grammatical role of such an argument is computed from the inflectional coding on the verb and from the presence of the other arguments. In example (31), the evidence that the noun žílé ‘man’ is the topic is provided by the fact that it retains the word-final vowel. Its interpretation as the subject is assured by the presence of the goal marker on the verb d-á ‘go-go’ = ‘bring.’ (31) Wandala žílé ádánázù ádndàvànánŋàné žílé á d-á názù à d ndà-v-àn-án ŋánè. man 3sg go-go that 3sg go ask-aff-3sg-ass 3sg ‘The man should bring that with which he is going to ask for [the hand of the girl].’ It appears that the only cases in Wandala where a clause has two nominal arguments in natural discourse occur precisely when one of the arguments is topicalized or focussed. Table 1 presents a summary of the coding of clauses with two arguments in the four languages under discussion. Table 1. Coding of clauses with two arguments in Wandala, Hona, Hdi and Mutsuvan Wandala

Hona

Hdi

Mutsuvan

Ø (or S[TOP]VO), O[TOP] VS

VSO

VS tá O

VSO

6.2.3 The role of the single argument The determination of the role of the single argument following the verb is interesting for those verbs which can occur with two arguments. It must be noted that most grammars of Chadic languages do not recognize a passive form of the verb or a passive construction; the default role of the single argument of potentially transitive verbs depends on the category of the verb. Wandala has grammaticalized several means which interact with the inherent properties of the verb to code the distinction between the subject and the object. In the perfective aspect the noun phrase which follows a transitive verb without any markers is the object, as in (32).

Grammaticalization, typology and semantics

(32) Wandala à jj zàhè. 3sg kill snake ‘He killed a snake.’ The addition of the goal marker or object marker to such verbs, as in (33a–b), indicates that the following noun phrase is the controlling participant. (33) Wandala a. à jj-á zàhè. 3sg kill-pvsbj snake ‘A snake killed him/it.’ b. à jj-á-trá zàhè. 3sg kill-go-3pl snake ‘A snake killed them.’ For verbs whose inherent meaning does not imply either the controlling or the affected role of the single participant, the interpretation of the noun phrase as either subject or object is computed from the properties of the noun phrase, such as [+human] vs. [−human]. Compare (34a) and (34b) below, where the postverbal noun phrases function as object and subject respectively. (34) Wandala a. à hrɗ-á zárvà. 3sg farm-ipfv sesame ‘He planted sesame.’ [not ‘The sesame farmed.’] b. à hrɗ-á kèllú. 3sg farm-ipfv Kellu ‘Kellu farmed.’ [not the nonsensical ‘He farmed Kellu.’] With verbs whose objects are affected, the noun phrase which follows the verb without an object marker is the object, as shown in (35). (35) Wandala à šíɓ-á gdzrè. 3sg hide-go child ‘He hid a child.’ In order to code the role of the noun phrase as the controlling participant, the verb, whether in the simple, as in (36a), or in the reduplicated form, as in (36b), must have an object pronoun.

91

92

Zygmunt Frajzyngier

(36) Wandala a. à šíɓà-ná gdzrè. 3sg hide-3sg child ‘The child hid it.’ b. šíɓ-á-n-v-šíɓà gdzrè. hide-go-3sg-aff-hide child ‘The child hid it.’ The absence of the goal or the object marker on the verb, as in (37), indicates that the following noun phrase is affected, without implying that there was another agent that performed the action. (37) Wandala šíɓé-v–šíɓè gdzrè. hide-aff-hide child ‘The child hid.’ To summarize, the facts of Wandala demonstrate that grammaticalization is motivated by the principle of functional transparency, whereby the role of every element in the utterance must be transparent to the listener. If the role is transparent because of the nature of the lexical items deployed, no other coding means are necessary. Let us now turn to Hdi. In this language, the single noun phrase following the verb is marked as the object by the preposition tá, the same way as the second nominal argument in a clause (cf. Section 6.2.2). The reason why Hdi codes the object role of the single argument by a preposition is that, contrary to Wandala, the verb does not code the role of the arguments. (38) Hdi ngá zá tá tsá vàkú yá. norm eat obj dem year dem ‘He should wait out this year.’ Finally, in Hona, the semantic role of the single argument is determined by the inherent properties of the verb. The single argument of the verb fí ‘bake’ in (39) below is the affected object. (39) Hona fi ɬú-dì. bake meat-det ‘The meat baked.’

Grammaticalization, typology and semantics

A noun which cannot plausibly be the affected argument may not be used as the single argument of such a verb, as shown by the ungrammaticality of (40) below for ‘my wife baked.’ (40) Hona *fi núnà bake wife Compare the subject coding with intransitive verbs in (41). (41) Hona wúd íyá ɓà à húkásù. return mother vent prep market ‘My mother returned from the market.’ A single nominal argument is interpreted as the object if the verb has a pronominal subject suffix, as happens in example (42). By contrast, if the verb has an object suffix, as in (43) below, the noun phrase following the verb is interpreted as the subject. (42) Hona sán-ɗ ám-dì pát. drink-3sg water-det all ‘He drank all the water.’ (43) Hona ɬe-n Ali. slaughter-3sg Ali ‘Ali slaughtered.’

6.2.4 Coding the role of arguments in fronting operations Let us now consider the means, if any, which the four Chadic languages under analysis grammaticalized to code the grammatical relations when one of the arguments is fronted for whatever pragmatic function. As will be seen, despite some similarities, there are interesting differences among the four languages. The fronted noun phrase in Wandala is interpreted as the subject if the noun which follows the verb is coded as the object through the absence of a goal marker and of an object pronoun on the verb, as in (44). On the contrary, the fronted argument is interpreted as the object if the noun following the verb is coded as the subject through the presence of an object pronoun on the verb, as in (45) (cf. Section 6.2.3 above).

93

94

Zygmunt Frajzyngier

(44) Wandala dàdá átàdzgwánkà kwàrhárkà dàdá á tà dzgwá-n kà kwàr hár kà father 3sg neg be.able-3sg neg order home neg ‘The father cannot order his family, can he?’ (45) Wandala éemákínnì dàdà ákwàrná màmà ée má kínnì dàdà á kwàr-ná màmà. well hyp bckg father 3sg order-3sg wife ‘If it is like that, the father is ordered by the mother.’ In Hdi, the subject function of the fronted noun phrase is computed from the object coding of the noun phrase which follows the verb. As seen in Section 6.2.3 above, the object is coded by the marker tá; cf. (46). (46) Hdi tsá Gulu yá ná yà-yà tá zwànì ndá ndghà. def Gulu dem dem beget-beget obj child-pl ass many ‘That Gulu begot many children.’ By contrast, the fronted noun phrase is interpreted as the object if the noun phrase following the verb is not marked by the object marker tá, that is, if it is unmarked and, hence, must be interpreted as the subject. (47) Hdi a. ndá ndghà zwànì yà-yà tsá Gulu yá. ass many child.pl beget-beget def Gulu dem ‘Many children were begot by that Gulu.’ b. xìyá xìyá skwì txàf-í tà ná ɗèlèwèr ná ná skwì corn corn thing expel-1sg prep dem notebook dem dem thing 3 nghá-nà-ghá-tsí. three look-dem-d.go-3sg ‘What I have written about in this notebook regards three things.’ (Frajzyngier with Shay 2002)18 If there is only one argument in the clause, the relevant distinction is between the controlling and the affected argument rather than between the subject and the object. In the imperfective aspect, the noun phrase which precedes the verb is interpreted as the subject if the verb has the potential object marker -ay, as in (48a). In 18. The sentence is drawn from written sources with subsequent tonal notation added.

Grammaticalization, typology and semantics

turn, the subject is interpreted as the non-controlling argument by the absolutive form of the verb (suffix -kú), as in (48b–c). (48) Hdi a. mbàɗ ká krì kà ɗg-áy. then comp dog seq thresh-pobj ‘And Dog kept threshing.’ b. mbàɗ ká xìyà kà ɗg-ákù-á-nì. then comp corn seq thresh-abs-gen-3sg ‘And corn kept threshing.’ (e.g. in a machine) c. ndá sn-í ká tsá marriage-xà yá tà mággá-kú gà stat hear-1sg comp def marriage-pl dem ipfv make-abs prep xdí ká-xn mándí ɓángál-xà yá. Hdi comp-3pl like marriage-pl dem ‘I have heard that marriages are being made in Hdi.’ Let us now turn to Mutsuvan. As seen in Section 6.2.2 above, in pragmatically neutral clauses with two arguments, the first argument is the subject, whereas the second one is the object (cf. example (24) above). If the subject is fronted, however, the object is marked by the preposition t (cf. examples (49a–b) below). This enables the interpretation of the fronted noun phrase as the subject. (49) Mutsuvan a. à mí hdá ʔt t ndé. pfv dog bite obj man ‘The dog had bitten the man.’ b. ndé dà ʔt t hdá. man prs bite obj dog ‘The man is biting the dog.’ Thus, the preposition t in Mutsuvan is deployed only when the other coding means, namely the linear order V NP NP, is not available. This fact is significant for two reasons: (i) it demonstrates the existence of the principle of functional transparency and (ii) configuration is chosen first as the coding means, but, when linear order is not available, a preposition is grammaticalized to code the grammatical relation. The interpretation of the fronted noun phrase as the object is assured by the suffix a, probably the goal marker added to the verb (cf. Frajzyngier 2005), and by the fact that the noun phrase following the verb is interpreted as the subject. Consider the examples under (50a–b).

95

96

Zygmunt Frajzyngier

(50) Mutsuvan a. hé pìt-á nzè. corn harvest-go 3.indf ‘The corn is being harvested.’ (right now) b. ándàva ná’w-á snù. Cyperus esculentus L. (Cyperaceae) eat-go Sunu ‘Sunu eats Cyperus esculentus.’ The evidence that the ending a is not the verbal ending before the subject or an inherent ending of the verb is provided by clauses in which the subject is not preceded by any suffix, as in example (24) above, repeated here for convenience as (51). The absence of any preposition before the noun phrase following the verb is important because it indicates that this noun phrase is the subject. (51) = (24) Mutsuvan mà ʔt hdá ndé. pst bite dog man ‘The dog bit the man.’ To summarize, the limited Mutsuvan data demonstrate the existence of the principle of functional transparency. Moreover, the coding means are deployed only when such principle may be violated. In Hona, in turn, the fronted argument is coded as the subject through the third person subject suffix added to the verb, as shown in (52). Recall that, when a nominal subject follows the verb, there is no third person subject suffix added to the verb (cf. Section 6.2.2 above). (52) Hona Ali ɬe-n-ɗə. Ali slaughter-obj-3sg ‘Ali slaughtered it.’ By contrast, the fronted argument is coded as the object through object suffixes which follow the verb, as illustrated in example (53) below. (53) Hona mbe-nè iya-ɗí fwo-né-n. enjoy-1sg food-def cook-1sg.dat-3sg.obj ‘I liked the food that she cooked for me.’ Table 2 summarizes the coding with fronted arguments in the four Chadic languages.

Grammaticalization, typology and semantics

Table 2. Summary of the coding with fronted arguments Interpretation

Wandala

Hona

Hdi

Mutsuvan

Fronted NP as S

If the NP following the verb is coded as O If the NP following the verb is coded as S Phrase-final form of the verb Bare stem (pfv) -a (ipfv)

If the verb has subject marker

If the NP following the verb is coded as O

If the NP following the verb is coded as O

Fronted NP as O

Postverbal NP as S Postverbal NP as O

If the verb has object marker

∅

If the NP following the verb is coded as S

∅

∅

Goal marker

Subject pronoun on the verb

Preposition tá

Preposition t

7. The choice of formal means As mentioned in Section 1.1 above, when the need to code a given function arises, speakers have a large choice of means at their disposal, among them linear order, phonological means (tone, stress, pauses, intonation, vowel deletion and retention, vowel harmonies), lexical categories and sub-categories, and individual lexical items, such as specific nouns, verbs or numerals and quantifiers, inflections, etc. Linear order (cf. Section 5.2) has an obvious limitation, in that it allows at best the coding of two functions. This is the case with English noun + noun combinations, where the functions of both nouns are coded, the first as head and the second as modifier. Thus, we can have either a house tree or a tree house. Linear order is, therefore, quite limited in the numbers of distinctions which it can code. Another means of coding is through tonal changes. Here the number of distinctions coded depends on the number of underlying tones in the language. But even in a two-tone language, the number of distinctions coded by tone can be twice as many for any functional domain. Thus, if a low tone on the verb codes one tense, the high tone can code another tense. Similarly, if a low tone pronoun codes indicative modality, the high tone can code deontic modality. Similar constraints operate with respect to other prosodic means, such as stress and pauses, each of which offers at best a binary distinction. Thus, when any of the binary distinctions is overlaid on some other coding means, such as segmental grammatical morphemes or word order, the number of coding means can double rather than being simply augmented by two for the language as a whole. In the case of the four Chadic languages with the verb in the clause-initial position considered in Section 6, the position of the nominal argument before the verb in combination with other coding means codes topicalization, focus or

97

98

Zygmunt Frajzyngier

backgrounding of the given noun. This position is, therefore, not available to code grammatical relations, such as subject or object. The fact that the formal means have a function already assigned at the initial state plays a fundamental role in the choice of the formal devices for the grammaticalization of other functions. Simply put, languages must look for other formal means. In most languages, when a distinction for more than two relations between any two elements is called for, a means with richer coding possibilities is required. Drafting a lexical item to code the desired function is one of the ways in which languages enrich the coding means available.

8. Grammaticalization and language change The implications of grammaticalization for language change follow from the grammaticalization of functions to a greater extent than from the grammaticalization of a given coding means. Once a function has been grammaticalized, the loss of the coding means which codes such a function may have one of two consequences: (i) the given function disappears or (ii) the language grammaticalizes other means to code the same function. The loss of the coding means for a given function is paradoxically often a consequence and a historical result of grammaticalization in that the form is used more frequently and, therefore, it becomes predictable, allowing increased phonological reduction. Grammaticalization theory should address the issue of which grammaticalized functions are allowed to disappear and which are provided with new means of coding. One possible interpretation is that functional domains, once grammaticalized, remain all the time, as claimed by Aikhenvald and Dixon (2001). But is this true? Historical linguistics attests to functions which have disappeared from grammatical systems of unrelated languages. Wellknown examples are the loss of the category dual from many Indo-European languages and that of the singular vs. plural distinction in the second person pronominal system in English. It would appear that functions which are used extensively, that is, functions which are coded in many types of utterance, are provided with new means of coding once the old means disappear. On the other hand, functions with a limited use appear to be more vulnerable once their coding means disappear. Unfortunately, we sorely lack crosslinguistic data to generalize on the retention and disappearance of functions even in languages with long written traditions, and in languages without a written tradition, we lack even a reliable methodology to determine whether a given function is a retention or an innovation.

Grammaticalization, typology and semantics

9. Grammaticalization as a research tool: Methodological considerations Grammaticalization as understood here can also be used as a diagnostic tool in semantic analysis. The underlying principle is that, if a function has been grammaticalized in a language, this function was not otherwise coded by lexical or grammatical forms of the language at the initial state, unless the motivation was the principle of indirect means (cf. Section 4.2). With respect to semantic analysis, grammaticalization provides the evidence that lexical items which appear to have the same referent across languages have different semantic properties. These semantic properties are those which have been grammaticalized in a given language, either through the grammatical system or through the lexical categories and sub-categories. Languages may grammaticalize different lexical categories as required by the types of predications they have. The grammaticalization of a given functional domain can be used as a diagnostic tool for semantic analysis, that is, for the discovery of the meaning of a given lexical item in a given language. Thus, the noun room in English is not inherently [+LOCATIVE], in the sense of having this semantic feature, whereas its equivalent in Mina is.

10. Conclusions Against the recent vigorous and extensive challenges to grammaticalization research and grammaticalization theory in its narrow sense of the change from lexical item or construction to grammatical morpheme (cf. Campbell 2001; Campbell and Janda 2001; Joseph 2001, 2004; Newmeyer 2001), the present study proposes that grammaticalization, as here defined, is at the core of any linguistic theory which aims to explain why languages are similar and why they are different. The study of grammaticalization can contribute to the foundations of linguistic theory by explaining, in addition to the development of grammatical morphemes from lexical items and constructions, the emergence of lexical categories, of word order as a coding means, of phonological means and of repetition. Within the expanded scope of grammaticalization proposed here, the motivation for each specific instance of grammaticalization becomes the most important object of research. Once the motivation is known, the next object of investigation is the choice of the means to code a given function.

99

100 Zygmunt Frajzyngier

Abbreviations 1 2 3 abs aff all ana ass bckg comp d dat def dem det ee gen go hyp in incl indf inf inn intens ipfv ipron

first person second person third person absolutive affected argument allative anaphoric associative backgrounding complementizer dependent (aspect) dative definite demonstrative determiner end of episode genitive goal orientation hypothetical extension coding the movement in inclusive indefinite infinitive inner space extension intensifier imperfective independent pronoun

irr m neg norm obj on out pfv pl pobj pvsbj pred prep prog prs prtv pst pvg q ref rel sbjv seq sg stat vent

irrealis masculine negation normative object extension coding location on extension coding the movement out perfective plural potential object point-of-view of subject predicative preposition progressive present tense partitive past tense point-of-view of goal question marker referential relative subjunctive sequential singular stative ventive

References Aikhenvald, Alexandra Y. 2000. Classifiers. Oxford: OUP. Aikhenvald, Alexandra Y. & Robert M. W. Dixon (eds.). 2001. Areal diffusion and genetic inheritance: Problems in comparative linguistics. Oxford: OUP. Blevins, Juliette. 2005. Evolutionary phonology: The emergence of sound patterns. Cambridge: CUP. Campbell, Lyle. 2001. What’s wrong with grammaticalization? Language Sciences 23(2–3): 113–161. Campbell, Lyle. 2004. Historical linguistics. 2nd edn. Cambridge MA: The MIT Press. Campbell, Lyle & Richard D. Janda. 2001. Introduction: Conceptions of grammaticalization and their problems. Language Sciences 23(2–3): 93–112. Chomsky, Noam. 1957. Syntactic structures. The Hague: Mouton de Gruyter. Croft, William. 2001. Radical construction grammar: Syntactic theory in typological perspective. Oxford: OUP.

Grammaticalization, typology and semantics 101

Frajzyngier, Zygmunt. 1991. The de dicto domain in language. In Elizabeth Closs Traugott & Bernd Heine (eds.), Approaches to grammaticalization. Vol. I, 219–251. Amsterdam: John Benjamins. Frajzyngier, Zygmunt. 1996. Grammaticalization of the complex sentence: A case study in Chadic. Amsterdam: John Benjamins. Frajzyngier, Zygmunt. 1997. Pronouns and agreement: Systems interaction in the coding of reference. In Hans Benis, Pierre Pica & Johan Rooryck (eds.), Atomism and binding, 115–140. Dordrecht: Foris. Frajzyngier Zygmunt, with Erin Shay. 2002. A grammar of Hdi. Berlin: Mouton de Gruyter. Frajzyngier, Zygmunt. 2004a. Principle of functional transparency in language structure and language change. In Zygmunt Frajzyngier, David Rood & Adam Hodges (eds.), Linguistic diversity and language theories, 259–283. Amsterdam: John Benjamins. Frajzyngier, Zygmunt. 2004b. On segment counting, OCP, and heavy segments. Folia Orientalia 40: 93–134. Frajzyngier, Zygmunt. 2005. L’augment télique (goal) dans les langues tchadiques. In Amina Mettouchi & Antoine Lonnet (eds.), Les langues Chamito-Sémitiques. Faits de langues, 215–230. Paris: Ophrys. Frajzyngier, Zygmunt. 2008. A grammar of Gidar. Frankfurt: Peter Lang. Frajzyngier, Zygmunt & Bill Jirsa. 2005. The principle of indirect means in language use and language structure. Journal of Pragmatics 38(4): 513–542. Frajzyngier, Zygmunt & Eric Johnston, with Adrian Edwards. 2005. A grammar of Mina. Berlin: Mouton de Gruyter. Frajzyngier, Zygmunt, Holly Krech & Armik Mirzayan. 2002. Motivation for copula in equational clauses. Linguistic Typology 6(2): 155–198. Frajzyngier, Zygmunt & Jan Mycielski. 1998. On some fundamental problems of mathematical linguistics. In Carlos Martin-Vide (ed.), Mathematical and computational analysis of natural language, 295–310. Amsterdam: John Benjamins. Frajzyngier Zygmunt & Erin Shay. 2003. Explaining language structure through systems interaction. Amsterdam: John Benjamins. Giacalone Ramat, Anna & Paul J. Hopper. 1998. Introduction. In Anna Giacalone Ramat & Paul J. Hopper (eds.), The limits of grammaticalization [Typological Studies in Language 37], 1–12. Amsterdam: John Benjamins. Givón, Talmy. 1990. Syntax. A functional-typological introduction. Vol. II. Amsterdam: John Benjamins. Goldenberg, Gideon. 1971. Tautological infinitive. Israel Oriental Studies 1: 36–85. Greenberg, Joseph H. 1978. How does a language acquire gender markers? In Joseph H. Greenberg, Charles A. Ferguson & Edith Moravcsik (eds.), Universals of human language. Vol. 4, 47–82. Stanford CA: Stanford University Press. Haiman, John. 1994. Ritualization and the development of language. In William Pagliuca (ed.), Perspectives on grammaticalization, 3–28. Amsterdam: John Benjamins. Haspelmath, Martin. 2004. On directionality in language change with particular reference to grammaticalization. In Olga Fischer, Muriel Norde & Harry Perridon (eds.), Up and down the cline: The nature of grammaticalization [Typological Studies in Language 59], 17–44. Amsterdam: John Benjamins. Heine, Bernd, Ulrike Claudi & Friederike Hünnemeyer. 1991. Grammaticalization: A conceptual framework. Chicago IL: University of Chicago Press.

102 Zygmunt Frajzyngier

Heine, Bernd, Tom Güldemann, Christa Kilian-Hatz, Donald A. Lessau, Heinz Roberg, Mathias Schladt & Thomas Stolz. 1993. Conceptual shift: A lexicon of grammaticalization processes in African languages. Cologne: Institut für Afrikanistik. Heine, Bernd & Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge: CUP. Heller, Daphna. 2002. Possession as a lexical relation: Evidence from the Hebrew construct state. In Line Mikkelsen & Christopher Potts (eds.), Proceedings of the 21st West Coast Conference on Formal Linguistics (WCCFL21), 127–140. Sommerville MA: Cascadilla. Hopper, Paul J. 1987. Emergent grammar. Berkeley Linguistic Society 13: 139–157. Hopper, Paul J. & Elizabeth Closs Traugott. 2003. Grammaticalization. 2nd edn. Cambridge: CUP. Jakobson, Roman. 1971. On linguistic aspects of translation. Selected Writings 2. Word and language, 260–266. The Hague: Mouton de Gruyter. Joseph, Brian D. 2001. Is there such a thing as ‘grammaticalization’? Language Sciences 23(2–3): 163–186. Joseph, Brian D. 2004. Rescuing traditional (historical) linguistics from grammaticalization theory. In Olga Fischer, Muriel Norde & Harry Perridon (eds.), Up and down the cline: The nature of grammaticalization [Typological Studies in Language 59], 45–71. Amsterdam: John Benjamins. Kuryłowicz, Jerzy. 1964. The inflectional categories of Indo-European. Heidelberg: Winter. Larousse de poche. 2005. Paris: Larousse. Lehmann, Christian. 2002. Thoughts on grammaticalization: A programmatic sketch. 2nd edn, rev. [Arbeitspapiere des Seminars für Sprachwissenschaft der Universität Erfurt 9]. Erfurt: University of Erfurt. Matthews, Peter. 1997. The concise dictionary of linguistics. Oxford: OUP. Meillet, Antoine. 1912. L’évolution des formes grammaticales. Scientia (Rivista di Scienze) 12/26(6): 384–400. Newmeyer, Frederick J. 2001. Deconstructing grammaticalization. Language Sciences 23(2–3): 187–229. Rijkhoff, Jan. 2002. The noun phrase. Oxford: OUP. Samuels, Michael L. 1972. Linguistic evolution with special reference to English. Cambridge: CUP. Shay, Erin. In preparation. Grammar of Giziga. Shibatani, Masayoshi. 1991. Grammaticalization of topic into subject. In Elizabeth Closs Traugott & Bernd Heine (eds.), Approaches to grammaticalization. Vol. 2, 93–134. Amsterdam: John Benjamins. The American Heritage Dictionary of the English Language. 1980. 2nd edn. Boston MA: Houghton Mifflin. Von Humboldt, Wilhelm. 1825. Über das Entstehen der grammatischen Formen und ihren Einfluß auf die Ideenentwicklung. Gelesen in der Academie der Wissenschaften am 17. Januar 1822. In Abhandlungen der königlichen Academie der Wissenschaften zu Berlin: 401–430. Wierzbicka, Anna. 1996. Semantics: Primes and universals. Oxford: OUP.

Mismatch: Grammar distortion and grammaticalization

chapter 4

Mismatch Grammar distortion and grammaticalization Livio Gaeta University of Naples ‘Federico II’ The paper investigates the potentially distorting impact of grammaticalization on grammar through the analysis of different types of synchronic form-meaning mismatches which may result from the expansion of a grammaticalization process. Grammaticalization may induce mismatches as a consequence of the development of abnormal structures or new constructions incompatible with the general patterns of the language. Mismatches may also be due to an unchanged residue of grammaticalization, which may derive either from inertial resistance or from a superordinate restriction which prevents the grammaticalization wave from extending to specific domains. Finally, mismatches may also be the result of language contact, either by making use of alternative devices of a foreign origin or by preventing the grammaticalization of native structures due to the introduction of ready-made borrowed constructions.

1. Introduction In recent times, considerable attention has been devoted to the concept of mismatch, understood as a synchronic form-meaning discrepancy in which form– function mappings are “incongruent with respect to more general patterns of correspondence in the language” (cf. Francis and Michaelis 2003: 2). The emphasis on synchrony is aimed at exploring “the implications of mismatch phenomena in general for grammatical theory,” in that “[m]ismatch phenomena challenge our conceptions of grammar and are thereby of vital importance for the development of grammatical architectures” (Francis and Michaelis 2003: 5). Two main types of mismatch have been distinguished: complexity and content mismatches. In a complexity mismatch, there is a discrepancy in the number of elements involved (and, consequently, in the complexity of the structure) at different levels of representation. This is exemplified, for example, by the place-holder es in German (cf. Eisenberg 1999: 175), as in (1).

(1) German Es sind Studenten gekommen, die mit mir sprechen wollten. it are students come.pst.ptcp who with me.dat speak want.pst ‘There came some students, who wanted to talk to me.’

104 Livio Gaeta

In this example, the incongruence is the result of the parasitic role of es, which occupies the first syntactic position (known as “Vorfeld;” cf. Eisenberg 1999: 389) but is not licensed by the verb. Observe how this place-holder (or “Vorfeld-es”) does not force morphosyntactic agreement with the verb, which agrees with the subject Studenten, in neat contrast, for instance, with the argumental es of metereological verbs, as in es regnet ‘it rains.’ A content mismatch, meanwhile, is characterized by incongruous mapping in the content of items from two different levels of representation. This type of mismatch is illustrated in the Sorbian example in (2) below (cf. Corbett 1987: 303), in which the possessive pronoun semantically modifies a noun embedded as a derived adjective within a noun phrase.

(2) Sorbian mojeho mužN-ovaADJ sotra my.gen husband-poss.adj.f.sg.nom sister.f.sg.nom ‘my husband’s sister’

The mismatches reviewed by Francis and Michaelis (2003) are discussed within a purely synchronic framework, in keeping with the general aim of understanding the architecture of synchronic grammars. As an important antecedent, the authors refer to the Stoic anomalist view of grammar, which attributed the presence of exceptions to the obscuring effects of language change. The broader diachronic viewpoint is intentionally left aside, although it is claimed to be “highly compatible” with the main theoretical approaches represented in their volume. However, such an approach runs the risk of undermining the whole enterprise, precisely because of the obscuring effect of language change. This can be seen in the first, and highly illustrative, example of mismatch quoted by Francis and Michaelis (2003), namely the verb form kuti ‘say’ in Chicheŵa, which also performs the function of clause-linkage marker. The emphasis on a synchronic account here overshadows an important characteristic of grammaticalization, that of the effect of layering, that is, the coexistence of older and newer forms or meanings (Hopper and Traugott 2003: 49). In this paper, then, the opposite approach is taken: to investigate synchronic mismatches from a diachronic perspective. In a sense, it is a truism that any change distorts a grammar, at least from the point of view of a fixed grammar (cf. Hopper and Traugott 2003: 49 for a discussion). My intention is to investigate only those cases of distortion resulting directly from grammaticalization, that is, from evolutive changes. Accordingly, I will disregard synchronic mismatches which can be ascribed to common processes of divergence and specialization as discussed by Hopper and Traugott (2003: 115–126), and which often produce an effect of layering. Instead, I will focus on mismatches which are due solely to the process of grammaticalization as the latter expands and is generalized throughout the paradigmatic di-

Mismatch: Grammar distortion and grammaticalization 105

mension of a grammar. In this way, the paper will investigate the distorting impact of grammaticalization on grammar. The general assumption underlying this approach is that, by highlighting the disintegration and dispersal of forms, synchronic mismatches emphasize how “[g]rammaticalization tends to undermine the picture of stability, of clear categorial boundaries, and of structured groups of forms, showing these to be at the most temporary way-stations between different kinds of dispersal, emergence, and fragmentation” (Hopper and Traugott 2003: 165). The paper is structured as follows: in Section 2 the general impact of grammaticalization on grammar is discussed; Section 3 focusses on mismatches due to an unchanged residue of grammaticalization. The latter can result from (i) inertial resistance, which often has the effect of creating complexity mismatches; (ii) a well-defined superordinate restriction, which prevents the grammaticalization wave from extending to a certain domain; or (iii) language contact, which may create mismatches either by hindering the potential grammaticalization of endogenous constructions due to ready-made borrowed structures or by exploiting new grammatical options of foreign origin. Finally, Section 4 draws the conclusions of this study.

2. Mismatch and grammaticalization Non-canonical patterns in a language can be a direct consequence of a grammaticalization process. On the one hand, they may be related to the emergence of structures considered abnormal from the point of view of the structural conditions of a given language system. These mismatches are therefore the direct result of grammaticalization which is in turn responsible for the abnormal structure. For instance, grammaticalization can lead to the formation of marked structures displaying introflecting morphology in a language such as Latin, in which introflection does not normally occur, as can be seen in (3).

(3) Latin quīdam ‘a certain one’ ips-e, -a, -um ‘self ’ Pre-classical hybrid forms Nom. quī-dam/quae-dam *is-pse/ea-pse ea-ps-a Gen. cuius-dam Acc. quen-dam/quan-dam eum-pse/eam-pse eam-ps-am Abl. quō-dam/quā-dam eō-pse/eā-pse

Classical ipse, -a, -um ips-īus ips-um, -am ips-ō, -ā

A sort of content mismatch may arise here caused by inflection occurring in the wrong place as a result of the grammaticalization of the markers -dam and

106 Livio Gaeta

-pse with the relative and the demonstrative pronoun respectively. However, this should not be considered part of the expansion of the grammaticalization process, but rather symptomatic of its own (agglutinative) nature. It remains to be determined how far grammaticalization is responsible for the emergence of such abnormal structures in a language or, in other words, how far grammaticalization is blind with respect to the organization of a given grammatical system (cf. Haspelmath 1998a for a radical view). On the other hand, the expansion of a grammaticalization process may induce mismatches indirectly because new constructions resulting from the generalization of the change arise which are incompatible with the general pattern of a language. For instance, Creissels (2006) mentions the case of the progressive periphrasis in Basque, caused by the grammaticalization of an intransitive construction containing an absolutive subject. When the construction was grammaticalized, it also extended to transitive verbs, giving rise to a split alignment in which it is a transitive verb that governs the absolutive subject, as illustrated in (4d).

(4) Basque a. Jon paseatzen da. Jon.abs walk.ipfv be.prs.sbj.3sg ‘Jon is walking.’ (non-periphrastic present) b. Jon [[paseatzen] ari] da. Jon.abs [[walk.ipfv engaged be.prs.sbj.3sg ‘Jon is walking.’ (lit. ‘Jon is engaged in walking’) (progressive periphrasis) c. Jonek berriak ikusten ditu. Jon.erg news.sg.abs see.ipfv aux.prs.sbj.3sg.obj.3pl ‘Jon is watching the news.’ (non-periphrastic present) d. Jon [[berriak ikusten] ari] da. Jon.abs [[news.sg.abs see.ipfv engaged be.prs.sbj.3sg ‘Jon is watching the news.’ (progressive periphrasis)

In some Russian dialects the grammaticalization of the perfective periphrasis has also given rise to a sort of split alignment, since the possessive construction which is at the origin of the perfective periphrasis contains a non-nominative subject, as observed in (5) below.

(5) Russian a. U Ivana novaja mašina. at Ivan.gen new.sg.f.nom car.sg.nom ‘Ivan has a new car.’

Mismatch: Grammar distortion and grammaticalization 107

b. U volkov s’’edeno korovu. at wolf.pl.gen eaten.sg.n cow.sg.acc ‘The wolves have eaten the cow.’ The Latin, Basque and Russian cases just discussed show how the expansion of a grammaticalization process may bring about new non-canonical patterns within a grammar. Synchronic mismatches may then be eliminated by additional changes, which are not directly related to the logic of grammaticalization but correspond to readjustment strategies which are active in a grammar to increase form-meaning consistency.1 In fact, in the case of *is-pse, the abnormal feature was eliminated by the externalization of inflection, as in the Classical Latin forms ips-e, ips-a, ips-um, etc. (cf. Haspelmath 1993). Similarly, Creissels (2006) reports for Basque that the effects of the grammaticalization of an aspectual periphrasis can be cancelled by a readjustment under the weight of analogy. Thus, in the habitual construction of the Bizkaian dialect of Basque with the verb jakin ‘know’ in auxiliary function, the subject has the coding properties expected of the auxiliated verb, and not those expected of the auxiliary, as we saw in (4d).

3. Mismatch and the unchanged residue of grammaticalization A second source for mismatches can be traced back to an unchanged residue of grammaticalization which is due to the underexploitation of a domain potentially exposed to a process of grammaticalization. The focus on the expansion or generalization of grammaticalization emphasizes the relevance of the scenario onto which a change is projected. There are many factors involved here, and a distinction should be made between actualization and extension of an evolutive change (cf. Andersen 2001). Furthermore, the different roles played by reanalysis in creating new grammatical structures and by analogy in their generalization (cf. Hopper and Traugott 2003: 63–69) may prove very useful. A plausible explanatory model for such a complex scenario is Himmelmann’s (2004: 33) formula and exemplification, which conceives of grammaticalization in terms of context expansion:

(Xn) An B | Kn → (Xn+x) An+x b | Kn+x

where A and B represent full lexical items, b is a grammaticalized element and the following three types of contextual changes (K) occur: 1. How grammaticalization interferes with the naturalness (or markedness) of a given linguistic system is discussed more extensively in Gaeta (forthcoming).

108 Livio Gaeta

a. Host class formation: An → An+x; e.g. common nouns → common and proper nouns. b. Change of syntactic context: Xn → Xn+x; e.g. core argument position → core and peripheral argument position. c. Change of semantic-pragmatic context: Kn → Kn+x; e.g. anaphoric use → anaphoric and associative anaphoric use. The speed at which the innovation is extended may vary considerably depending on a number of different factors.2 I imagine two possible scenarios for such mismatches to occur, which correspond to two possible outcomes of the generalization of a grammaticalization process, as in Figure 1. Figure 1 illustrates the scenarios in which a new marker (the black circle) emerges and expands to cover a given morphological domain. It leaves behind unchanged residues which can be either minimal, as in Figure 1a, or systematic, as in Figure 1b, where a welldefined area of the expansion domain is left uncovered by the new marker. As laid down by Himmelmann’s formula, the target domain consists in the systematic exploitation of larger and larger contexts of the application of a rule. I leave open the question of whether such a target domain should be understood in paradigmatic terms of a Jakobsonian style or as a prototype network based on family resemblance. a.

b. →

→

Figure 1. Possible outcomes of the generalization of a grammaticalization process

These different scenarios relate to forces which hinder or delay the expansion, that is, to the sources of distortion, which can be categorized into three main types. The first is lexical inertia, which plays a major role in interfering with rule generalization, especially for form renewal. The effect of this interference can also 2. The sociolinguistic (and the extra-grammatical factors in general) accompanying a change will not be discussed in this paper. I am aware of the fact, pointed out by Andersen (2001: 243), that “[a]s a consequence of shifts in social valuation, many linguistic changes do not run their full course; some barely get off the ground before they peter out, some get under way in one style of speaking and are ‘reversed’ when another style of speaking becomes preferred in the community, and some changes lose their momentum along the way and leave unchanged residue of one kind or another.” This section really only discusses the typology of such unchanged residues.

Mismatch: Grammar distortion and grammaticalization 109

be observed in cases where an innovating grammaticalization has to face a certain inertia which hinders the full exploitation of a target domain. This can lead to truly defective paradigms (cf. Section 3.1 below). Lexical defectiveness of this kind can be distinguished from a second type of distortion in which defectiveness comes about as a result of grammatical factors exerting a hindering effect on the possible expansion of grammaticalization. In this more complex case, a grammaticalization process cannot be extended further to cover a certain domain owing to a superordinate motivation (discussed in Section 3.2). Finally, external forces can play a role in limiting the development of a grammaticalization process, particularly in situations of close language contact (cf. Section 3.3). Let us turn now, therefore, and take a more detailed look at this increased complex typology.

3.1 Mismatch and inertia Grammaticalization normally leads to the paradigmaticization of a given construction which spreads through the grammar saturating a particular target domain. The latter may already be covered by an older construction, in which case we commonly speak of form renewal or renovating grammaticalization. A classical example of this is the strong/weak verb inflection in the Germanic languages, as exemplified below by Gothic (cf. Braune 1981: 131; Lehmann 1989: 178).

(6) Gothic a. *nasjan dēdum ‘we did save’ > nasidēdum ‘we saved’3 b. hafjan ‘we raise’ > hōfum ‘we raised’

In most cases, renovating grammaticalization leaves behind unchanged residues because of ‘lexical inertia,’ which is the counterpart of the speed at which an innovation becomes extended through the lexicon.4 The effect of this is paradigm layering, that is, inflectional classes. Hafjan in (6b) displays a strong inflection with stem vowel alternation; this is an example of complexity mismatch. Observe Figure 2. 3. The interpretation of the dental suffix of weak preterites as dating back to the grammaticalization of an Indo-European verbal form *dhō-/dhē- (cf. Greek tí-thē-mi, Sanskrit dá-dhā-mi, Latin fēcī) is not undisputed (cf. Ramat 1986: 208 for a quick survey). A second, less popular, interpretation treats the dental suffix as the result of an extension of the Indo-European suffix of the participle of intransitive verbs -tó- > Germanic -ða- and the secondary addition of verbal desinences. In this study, the former analysis is assumed and exemplified by the rather artificial form in (6a), which is taken from Lehmann (1989). 4. It would be interesting to investigate whether a quantitative measure of inertia is even possible, let alone meaningful. However, this must be left for a future study.

110 Livio Gaeta

PAST

hōfum

nasidēdum

Figure 2. Complexity mismatch

Older and newer inflectional classes can survive side by side for centuries. Usually, the lexical domain covered by the older forms is slowly but systematically eroded by the newer forms following lexical diffusion paths related mainly to frequency (cf. Hopper and Traugott 2003: 126–130 for a discussion) but also obeying internal system adequacy, as illustrated by Wurzel (2000) for German verb classes. Consider (7) below.

(7) a. German gären – gor – gegoren > gären – gärte – gegärt ‘ferment’ glimmen – glomm – geglommen > g limmen – glimmte – geglimmt ‘glimmer’ melken – molk – gemolken > melken – melkte – gemelkt ‘milk’ weben – wob – gewoben > weben – webte – gewebt ‘weave’ b. Old High German Middle High German New High German skeran – skar > schar/schor > scherte ‘care’ sweran – swar > swar/schwor > schärte ‘ulcerate’ dreskan – drask > drasch/drosch > drosch ‘thresh’ irleskan – irlask > erlasch/erlosch > erlosch (löschte) ‘burn out’ c. Old High German New High German ladōn – ladōta – giladōt > laden – lud – geladen ‘invite’ swīgēn – swīgēta – giswīgēt > schweigen – schwieg – geschwiegen ‘be silent’ skinten - skindita – giskindit > schinden – schund – geschunden ‘skin’ frāgēn – frāgēta – gifrāgēt > fragen – fragte – gefragt ‘ask’ now obsolete: frug – gefragen

The examples in (7a) show some inflectional class changes from the strong to the weak conjugation which are relatively new. In turn, (7b) illustrates some changes which involve an Old High German verb which is first transferred to another inflectional class of strong verbs in Middle High German (e.g. skar ‘cared’ > schor), before becoming a weak verb in New High German. Moreover, the process of lexical diffusion may not always be unidirectional, but may exhibit deviations like

Mismatch: Grammar distortion and grammaticalization

those reported in (7c). In this instance, an Old High German weak verb joins the strong class or passes through a strong stage before returning to the weak conjugation, as attested by the New High German form fragen ‘ask.’5 At an extreme point, form renewal leaves behind lexical islands not reached by the grammaticalization wave, such as the suppletive and defective paradigms. As in our earlier example of strong/weak verbs in Germanic, Afrikaans has been radically denuded of strong verbs (cf. König and van der Auwera 1994: 490).

(8) Afrikaans breek ‘to break’: ek het gebreek ‘I have broken’ gebroke: ’n gebroke hart ‘a broken heart’ gebreek: ’n gebreekte koppie ‘a broken cup’

Some strong forms still survive as deverbal adjectives but are no longer used in a verbal context. In many instances the newly formed weak past participle coexists with the older strong form, but with a distinction in meaning, the strong form usually denoting the more figurative meaning, as shown in (8b). Older forms need not disappear, but as a side effect of form renewal may change their status within a paradigm. Consider, for example, the case of Tsakonian Greek, in which old present forms develop a future meaning under the force of new grammaticalized present forms acting as a push chain (cf. Haspelmath 1998b).

(9) Tsakonian Greek Indicative émi ftén-u I.am arriv-ing.m ‘I arrive’

Subjunctive (na) ften-u (that) arriv-1sg ‘(that) I arrive’

The new present tense is a participial periphrasis of the type ‘be + present participle.’ The old present (and modern subjunctive) dates back to the present of lateantiquity Greek, where the Classical Greek indicative and subjunctive forms were merged into a single form: 5. One anonymous reviewer interestingly asks what consequences such a behaviour may have for the unidirectionality hypothesis commonly assumed by grammaticalizationists. Although the issue of unidirectionality may, to a certain extent, be considered an open question (cf. Gaeta 2004 for a discussion), I would not treat the German verbs mentioned in (7c) as an argument against unidirectionality, because the change of inflectional class does not have the effect of upgrading the grammatical status of the suffix and turning it into a more autonomous unit. Such changes should be regarded as readjustments within the inflectional morphology of German (which, nevertheless, does not solve the question of why such deviations from the general trend occurred in the first place).

111

112

Livio Gaeta

(10) a.

Tsakonian Greek b. na ftén-u that arrive-1sg na ftén-ere that arrive-2sg na ftén-i that arrive-3sg

Classical Greek (hína) phthaín-ō (that) arrive-1sg (hína) phthaín-īs (that) arrive-2sg (hína) phthaín-ī (that) arrive-3sg

The result is a markedness mismatch, since, contrary to the general expectation, present forms are morphologically more marked than future forms. In this case, “[t]here is no lexical item that is turned into a grammatical item, but a present-tense form which already has a strongly grammaticalized status changes its meaning as a side effect of another grammaticalization process” (Haspelmath 1998b: 34). In the case of verbal inflection in Germanic, the context expansion which lies behind any grammaticalization process cannot be traced back to its origin: we can only record the inertia which is the counterpart of the expansion. An example which closely reflects Himmelmann’s concept of context expansion is provided by the Germanic collective prefix ga-, which was grammaticalized as an aspectual marker for past participles in Old High German (witness (11a)). In Old High German times, however, the marker is not attested with telic verbs, such as those in (11b). (11) Old High German a. ginoman ‘taken,’ gisalbōt ‘greased’ b. queman/quoman ‘come,’ funtan ‘found,’ wortan ‘become,’ brāht/ brungan ‘brought,’ troffan ‘met’ We can presume that the telic character of those predicates were responsible for delaying the process of expansion of the prefix within the Old High German verbal basin. In other words, the context expansion first entailed non-telic, and only later telic verbs. In some modern dialects of German this expansion never took place. Consider the examples in (12). (12) a.

Ripuarian vo·ŋ.ə ku·m.ə wū:ədə(n) brę¯t krējə ble·və vrę·s.ə tro˛·f.ə

‘found’ ‘come’ ‘become’ ‘brought’ ‘taken’ ‘remained’ ‘devoured’ ‘met’

Mismatch: Grammar distortion and grammaticalization

b. c.

Lotharingian gåŋ ‘gone’ gin ‘given’ košt ‘costed’ brāt ‘brought’ gol ‘been worth’ plīw ‘remained’ kreit ‘taken’ kōif ‘bought’ Upper Vinschgau pikxt ‘stuck’ tre:tn ‘trodden’ kaoklt ‘juggled’ kxent ‘known’ deŋkxt ‘thought’ goltn ‘been worth’

kfrɔk ‘asked’ kfuntn ‘found’ ksuɒxt ‘sought’ kxẹp ‘held’ kxɔp/kxẹt ‘had’ gvist ‘known’

gmuɒnt gnumən glɔxt graontst gjo:gn gεsn

‘meant’ ‘taken’ ‘laughed’ ‘grumbled’ ‘hunted’ ‘eaten’

As we can see from Ripuarian (cf. Schirmunski 1962: 517), the prefix ge- never reached the participles of telic verbs. In Lotharingian, the context expansion documented in Old High German came to be limited by lexical inertia (cf. Schirmunski 1962: 517), and the non-occurrence of the prefix ge- is nowadays restricted to the arbritrary set of verbs listed in (12b). Synchronically, the picture resembles Figure 1a above: a number of lexical lacunas appear now surrounded by the systematic extension of the marker ge- and are now completely opaque, that is, only morpho lexically motivated. For the Bavarian dialect of Upper Vinschgau (cf. Alber and Lanthaler 2005), the distribution of the prefix has been reanalysed in a completely different way, acquiring a strictly morphophonological motivation: it does not occur with initial plosives (e.g. deŋkxt ‘thought’ vs. gmuɒnt ‘meant’). Even though it results from a reanalysis of a different nature, the unchanged residue is homogeneous and resembles the scenario presented in Figure 1b.

3.2 Mismatch as a side effect of grammaticalization The under-exploitation of a domain potentially exposed to grammaticalization is not necessarily due to lexical inertia, even if it results from a normal context expansion, as happens in the case of the prefix ge-. Perhaps more interestingly, highordered constraints can also exert a hindering force, especially for an innovating grammaticalization, whose potential domain is usually free from pre-existing constructions. This leaves only a limited possibility for mismatches due to lexical inertia to emerge. We will now discuss at greater length a particular case in which the

113

114 Livio Gaeta

expansion of the grammaticalization of a perfective construction meets with some grammatically superordinate restraining forces, whereby a synchronic mismatch is produced as a side effect. In German, an unexpected infinitive is found in the Perfekt construction when modals or some other verbs governing a bare infinitive occur. See (13) below. (13) German a. Elena hat Deutsch lernen wollen. Elena has German learn.inf will.inf ‘Elena had to learn German.’ b. *Elena hat Deutsch lernen gewollt. Elena has German learn.inf will.pst.ptcp c. Elena hat den Wagen kommen sehen. Elena has the car come.inf see.inf ‘Elena saw the car coming.’ d. Elena hat den Wagen kommen gesehen. Elena has the car come.inf see.pst.ptcp ‘Elena saw the car coming.’ Notice that, while both the past participle and the infinitive are available for verbs such as sehen ‘see’ (cf. (13c–d)), the occurrence of a past participle with a modal is unacceptable (cf. (13b)), although a past participle form is attested in constructions such as Elena hat es gewollt ‘Elena wanted it.’ What we are clearly observing here, therefore, is a content mismatch, coupled with a mismatch relating to the serialization order of the verbal complex in embedded clauses.6 Consider (14a–c). (14) German a. Es ist schön, dass Elena Deutsch hat lernen wollen. it is beautiful that Elena German has learn.inf will.inf ‘It is beautiful that Elena wanted to learn German.’ 6. Notice, again, that for the other verbs the normal choice is available: (i) Es ist wahr, dass Elena den Wagen hat kommen sehen. it is true that Elena the car has come.inf see.inf ‘It is true that Elena saw the car coming.’ (ii) Es ist wahr, dass Elena den Wagen kommen gesehen hat. it is true that Elena the car come.inf see.pst.ptcp has ‘It is true that Elena saw the car coming.’ (iii) *Es ist wahr, dass Elena den Wagen kommen sehen hat. it is true that Elena the car come.inf see.inf has As (iii) shows, however, the normal serialization order is not compatible with the substitute infinitive.

Mismatch: Grammar distortion and grammaticalization

b. *Es ist schön, dass Elena Deutsch lernen gewollt hat. it is beautiful that Elena German learn.inf will.pst.ptcp has c. *Es ist schön, dass Elena Deutsch lernen wollen hat. it is beautiful that Elena German learn.inf will.inf has The same mismatch occurs in many West Germanic languages, such as Dutch, for instance, in spite of the different serialization of the verbal complex (cf. Schmid 2005 for a survey).7 (15) Dutch [. . .] dat ik dat altijd heb willen doen that I that always have want.inf do.inf ‘that I always wanted to do this’ An interesting point to note is that a serialization of the Dutch type was dominant in older stages of German, from which the modern situation slowly developed (cf. (16b) below and Härd 1981). Owing to space constraints, however, I will not consider the question of serialization here, as it must anyway be kept apart from the explanation of why such a morphosyntactic mismatch occurred. In fact, even though the other West Germanic languages vary in terms of the serialization order chosen, the substitute infinitive displays common features in all of them (cf. Gaeta 2005a, 2006 for details). The substitute infinitive is first attested in Middle High German (about the thirteenth century) in examples like those offered in (16). (16) Middle High German a. hâst du daz schif lâzen gân. have.2sg you the boat leave.inf go.inf ‘You left the boat go away.’ (Trist. 6796–6797) b. daz er die brieffe nit habe heissen machin. that he the letters not have order.inf make.inf ‘that he did not order to prepare the letters’ (Publ. 615, 31) Common to all substitute infinitives is the property of governing a following infinitive. In other words, their second argument is served by another verb. More importantly, this property interacts with a perfect construction which is in the process of being grammaticalized at exactly the same time. To understand why such a mismatch occurs, it is necessary to sketch how the verbal periphrasis is grammaticalized. The verbal semantics of the process is represented using Klein’s (2000) 7. One exception is English, where modals, as is well known, underwent major changes (cf., among others, Plank 1984 for a discussion).

115

116 Livio Gaeta

recent approach (cf. also Eisenberg et al. 2001), in which Aktionsart is represented in relation to the number of arguments selected by the predicate.8 This exposes the relation between the possible different temporal intervals present in the verbal description of an event and the participants involved in it. For example, the German verb schlafen ‘sleep’ displays only one argument-time window, in which argument A is associated with a single temporal interval, tsource (cf. (17a)). By contrast, in the case of the two-place predicate öffnen ‘open,’ the second argument, B, is associated with two argument-time windows. The second window represents the change of state resulting from the intervention of A, providing a second temporal interval, ttarget , in which B is attributed a different property with respect to the same B in ts (cf. (17b)). (17) German a. schlafen ‘sleep’ ⟨A, ts⟩ b. öffnen ‘open’ ⟨A, ts⟩, ⟨B, ts⟩, ⟨B, tt⟩ In order to be semantically licensed, a past participle must be anchored at an argument-time window where a second temporal interval for a given participant is specified, as illustrated in (18) below. (18) German a. *der geschlafene Riese b. die geöffnete Tür lit. ‘the slept giant’ ‘the opened door’ Ungrammaticality is also observed when the past participle governing an infinitive is used in the attributive function, as in (19). (19) German a. *der im Park schlafen gesehene Mann the in.def park sleep.inf see.pst.ptcp man b. *der hier kommen gehörte Mann the here come.inf hear.pst.ptcp man When the second argument-time window is not saturated by a governed infinitive, that is, when it can release its referential potential, the attributive participle is possible. Consider examples (20a–b). 8. In Klein’s words (2000: 366), “[w]hat constitutes the lexical content of a verb stem? It is often said that verbs somehow refer to events, whereas nouns refer to objects. This notion, familiar from the days of the Stoic grammarians, is at best sloppy and in fact is highly misleading. The lexical content of a verb contributes to the description of a situation. It contributes the specification of (qualitative or spatial) properties which some entities have during some verbal intervals.”

Mismatch: Grammar distortion and grammaticalization

(20) German a. der im Park gesehene Mann. the in.def park see.pst.ptcp man ‘the man seen in the park’ b. die von allen gehörte Musik. the by everybody hear.pst.ptcp music ‘the music heard by everybody’ With a verb like schlafen, the past participle is only possible in the Perfekt construction, in which the second temporal interval is provided by the auxiliary haben ‘have.’9 (21) German Der Riese hat geschlafen ⇐ habenAUX = ⟨A, tt ⟩ ‘The giant has slept.’ By contrast, the semantic restriction cannot be overcome when a verb governs another infinitive, because the second argument-time window, which is needed as a semantic anchor for the past participle, is already saturated by the governed infinitive (cf. (22) below). This explains the unacceptability of (19) above.10 (22) sehen schlafen

⟨A, ts⟩, ⟨B, ts⟩, ⟨B, tt⟩ | ⟨B, ts⟩

9. Notice that the second temporal interval provided by haben gives rise to a true perfective construction. By contrast, when the auxiliary sein ‘be’ is selected, a resultative state similar to the copula construction is observed: Der Riese ist eingeschlafen/tot ‘The giant has fallen asleep/ is dead’ (cf. Leiss 1992: 273). 10. The restriction is strictly related to the specific semantics and to the diachronic development of the (West) Germanic participle. How far similar effects can be observed in other languages in which a similar grammaticalization process occurs (e.g. Romance languages) is a complex question which requires a separate contrastive investigation. In any case, that the semantics of the participle respects different properties is shown, for instance, by the differences in auxiliary selection between German and Italian verbs: (i) German (ii) German (iii) Italian Die Blume hat geblüht Die Blume ist verblüht Il fiore è fiorito/sfiorito ‘the flower has flourished’ ‘the flower is withered’ ‘the flower is flourished/ withered’ The different grammaticalization processes display further differences (cf. Grønvik 1986), which are mirrored by different synchronic properties of, for instance, Italian participles, which can enter into an attributive construction and govern another infinitive: L’uomo visto dormire nel parco ‘the man seen sleeping in the park.’ I thank one anonymous reviewer for drawing my attention to this point.

117

118

Livio Gaeta

An important consequence of the representation in (22) is that the governed infinitive must now share the same temporal interval as the first argument-time window, which also means that no second temporal interval is available for licensing a past participle. In other words, the events described by both verbs must be simultaneous. (23) German a. Hans sah Karl schlafen. Hans saw Karl sleep.inf ‘Hans saw Karl sleeping.’ b. *Hans sah Karl geschlafen haben. Hans saw Karl sleep.pst.ptcp have Let us now look back at Old High German, when the Perfekt construction was in an early stage of development (cf. Braune 1987: 253). Consider example (24). (24) Old High German iz habet uberstigana in uns jugend managa. it has overcome.f in us youth.f big.f ‘It has overcome the big youth in us.’ (O 1,4,53) In this example, the Old High German past participle was not yet fully grammaticalized as an element of the modern periphrastic structure. For instance, it still displayed agreement with the object (jugend managa). To account for the biclausal nature of the sentence, we refer to the compositional syntactic and semantic structures offered in (25) (cf. Harris 2003: 542). (25) [ SBJj OBJi haben [ SBJj/k OBJi V . . .]];

⟨A, ts⟩, ⟨B, ts⟩ | ⟨X, ts⟩, ⟨B, ts⟩, ⟨B, tt⟩

A verb such as schlafen could not occur in such a combination (cf. Leiss 1992: 280). The process leading to the grammaticalization of the periphrasis first entailed an abductive inference based on Langacker’s (1991) subjectification,11 which reduced the possibilities of interpreting the first argument of the participle involved as different from the first argument of the main verb (cf. (26a)). Subjectification also implied that a second temporal interval ⟨A, tt⟩ had to be attributed implicitly to the first argument of the main verb. Secondly, a process of clause fusion took place, 11. Langacker (1991: 215) defines subjectification as “a semantic shift or extension in which an entity originally construed objectively comes to receive a more subjective construal.” For a more extensive discussion of the role played by subjectification in grammaticalization, see Mortelmans (2004) and Gaeta (2005b).

Mismatch: Grammar distortion and grammaticalization 119

whereby the two original predicates were fused intrinsically (cf. (26b)). The final semantics is contained in (26c), in which the function of the auxiliary haben is to provide the required second temporal interval. (26) a. Subjectification: X ≡ A b. Clause fusion: [SBJj OBJi haben [SBJj/k OBJi V . . .]] > [SBJj OBJi haben V . . .] c. ⟨A, ts⟩, ⟨A, tt⟩, ⟨B, ts⟩, ⟨B, tt⟩ When the Perfekt was grammaticalized in Middle High German, the potential domain of expansion provided by verbs governing an infinitive could only be exploited on condition that the restriction against employing the past participle illustrated in (22) was preserved, as the following compositional structure summarizes: (27) hat + sehen ⟨A, ts⟩, ⟨A, tt⟩, ⟨B, ts⟩, ⟨B, tt⟩ | schlafen ⟨B, ts⟩ Given the impossibility of using the past participle due to the lack of a second argument-time window ⟨B, tt⟩, the arguably default form, that is, the infinitive, emerged as a viable solution, giving rise to the content mismatch which occurs in sentences like those reported in (16) above. It should be stressed that the German infinitive qualified at this time as a default form employed to express pure lexical meaning. This can be seen in the expansion of its periphrastic uses, listed and exemplified below: a. Modal + infinitive periphrasis (especially mögen, müssen and wollen) to express a wish or order, even as early as Old High German (cf. Gaeta 2002). (28) Early New High German got müeze des gastes pflegen. god must.sbjv.3sg the.gen guest.gen care.inf ‘God protect the guest!’ (Iwein 6719) b. Tun periphrasis (Paul 1989: 313). (29) Early New High German die lieb sich meren tuot the love itself enlarge does ‘the love does enlarge itself ’ c. Würde + infinitive periphrasis for the past subjunctive, since the fourteenth century (cf. Ebert et al. 1993: 392).

120 Livio Gaeta

(30) Early New High German das mit dem Turcken vnd Frantzosen dis jar so stehen wurde that with the Turk and Frenchman this year so stand would ‘that this year it would be like this as for the Turk and the Frenchman’ d. Werden + infinitive periphrasis for the future, since the late fourteenth century (cf. Paul 1989: 298). (31) Early New High German daz wirt er eine klagen that will he one lament.inf ‘that he will lament one’ The future periphrasis is particularly significant for our purposes, because it developed in parallel with the substitute infinitive from the late fourteenth century. The infinitive replaced the previous present participle, giving rise to the modern periphrasis. It is also interesting to note that this periphrasis shows similarities with the substitute infinitive in (14a) above in regard to the serialization conditions. Consider (32) below. (32) German Teresa ist davon überzeugt, dass Elena Deutsch wird lernen Teresa is thereof convinced that Elena German becomes learn.inf müssen. must.inf ‘Teresa is convinced that Elena will have to learn German.’ Finally, it should be added that the substitute infinitive was not the only solution adopted in the course of the linguistic history of German to overcome the problem under analysis. Other escape strategies are attested, one of which is the expected haben + past participle + infinitive construction (cf. Ebert et al. 1993: 414), illustrated in (33). (33) Early New High German [. . .] daz sie die engel gehort het singen ein gesank that she the angels hear.pst.ptcp have.sbjv sing.inf a song ‘that she could hear the angels singing a song’ We may wonder why, in the course of time, paradigmatic forces did not act to eliminate this mismatch. Regularization tendencies are attested, as shown by the normalized form in (33) and also by the expansion of the regular construction in German, in which the substitute infinitive survives only in modals and a few other verbs, in contrast with other West Germanic languages such as Dutch, which still

Mismatch: Grammar distortion and grammaticalization

mirrors the original diffusion. On this question, we can only speculate that, for German at least, a certain regularization driven by normative impulses did indeed take place. As much can be inferred from the following passage of Hermann Paul’s Grammatik (1920: 128–129): Ausnahmsweise Verwendung des Part. der genannten Verba in der neueren Sprache ist wohl immer nicht durch landschaftlichen Sprachgebrauch, sondern durch logische Erwägungen der Schriftsteller veranlaßt [. . .] Besonders hat Rückert die Sprache zu korrigieren versucht [. . .] Auch Adelung hält bei hören das Part. für die regelmäßigere, edlere Form.12

3.3 Mismatch and language contact The last type of grammaticalization inducing the rise of synchronic mismatches concerns system-external motivation, namely language contact. In the two cases briefly discussed below (cf. Bisang 2001), a more productive and functionally more consistent morphological system fails to develop in Khmer because of contact with Thai and Vietnamese, on the one hand, while several morphological principles develop in Vietnamese as a result of contact with Chinese. Again, the focus here will not be on extra-grammatical factors, but on the impact of language contact on the development or under-development of a grammaticalization process. A predominantly prefixing and infixing language like Khmer productively employs, among others, the prefixes featured in (34), which cover a broad range of functions. (34) Khmer Prefix Verbal root

Derived verb/noun

p- k- crə- bvN- bvN- bvN-

phdac ‘break’ (tr.) kbaŋ ‘visor, guard’ crəmùć ‘sink’ (tr.) bɔŋrìən ‘teach’ bɔntùk ‘cargo, load’ bɔŋvëc ‘package’

dac ‘break’ (intr.) baŋ ‘use sth. to shade/cover’ mùć ‘sink’ (intr.) rìən ‘learn’ tùk ‘put, keep’ vëc ‘parcel up’

As is well known, Khmer has strong restrictions on word format: a word consists of either one syllable with the structure C(C)VC or C(C)VV(C) (major syllable) or two syllables with a reduced first syllable (minor syllable) which displays further 12. “Exceptional employment of the participle of the mentioned verbs in the modern language is certainly always not due to a regional usage, but to logical considerations of the writers [...] Especially Rückert has tried to correct the language [...] Also Adelung considers in the case of hören ‘hear’ the participle to be the more regular, noble form.”

121

122 Livio Gaeta

phonotactic restrictions which are of no interest here (cf. Haiman 1998 for more details). Furthermore, a subset of minor syllables is of great relevance to morphology, since a word structure consisting of the minor syllables Cə-, Crə- or CvNplus a major syllable is a common prefixed word. In spoken Khmer, phonological processes of strong reduction of minor syllables occur according to the cline in (35), in which the pronunciation in the first column is very careful and formal, and that in the last column colloquial and very informal. (35) Khmer rɔbɔːŋ rəbɔːŋ ləbɔːŋ ‘fence’ prɔtε`əh prətε`əh pətε`əh ‘meet’ kɔndɑːl kəndɑːl kədɑːl ‘middle, centre’ As a consequence of these phonological reductions, the distinction between bisyllabic words with a minor syllable and monosyllabic words with two onset consonants is blurred, creating a potential source of new prefixes. As Bisang (2001: 198) argues: pətε`əh ‘meet’ cannot be distinguished phonetically from phtε`əh ‘house’ with a CC onset. The convergence of bisyllabic words [. . .] and monosyllabic words with CC onset creates a situation in which almost any consonant in the CC onset may be analysed as a prefix derived from a bisyllabic word with a prefix on Crə- or CvN-. [...] In other words, we have a very productive device for the generation of new morphemes in Khmer.

Such techniques, however, did not develop because of the adoption of alternative devices from other languages. Bisang (2001) presents the case of the verb ‘get’ in particular, grammaticalized as a tense-aspect-modality marker in Khmer in parallel way with Thai and Vietnamese, even though the functions of the verbs for ‘get,’ ba:n in Khmer, dây in Thai and được in Vietnamese, are not fully identical in all the three languages. Consider, in this connection, examples (36a–f). (36) Khmer a. thŋay nìh khɲom thv` ː-kaː(r) baːn. day this I do-work can/be.able ‘Today, I can work.’ b. khɲom baːn t` u pìː(r) dɔːŋ haəy. I pst go two time pfv ‘I already went twice.’ Thai c. wan níi phŏm tham-ŋaam dây. day this I do-work can ‘Today, I can work.’

Mismatch: Grammar distortion and grammaticalization 123

d. phŏm dây pay sɔ˘ɔŋ khráŋ lέεw. I pst go two time pfv ‘I already went twice.’ Vietnamese e. tôi học tiếng Việt nam được. I learn language Vietnam able ‘I am able to learn Vietnamese.’ f. tôi được đi Việt nam. I be.allowed go Vietnam ‘I am allowed to go to Vietnam.’ A synchronic mismatch of a reverse type is provided by Vietnamese, in which the exploitation of the morphological parameter of right-headedness occurs because of contact with Chinese. As shown by (37), the head parameter behaves differently for Vietnamese and Chinese. (37) a.

Vietnamese b. xe mới [Head–Modifier] car new ‘a new car’

Chinese xīn chē [Modifier–Head] new car ‘a new car’

As a result of contact with Chinese, Vietnamese borrowed right-heading, which is used in word-formation processes which employ morphemes from Chinese, as illustrated in (38) below by the Sino-Vietnamese class noun gia (from Chinese jiā ‘house’) in contrast with the Vietnamese class noun nhà. (38) a.

Sino-Vietnamese b. khoa học gia science clf ‘scientist’

Vietnamese nhà phát minh clf invent ‘inventor’

Interestingly, this richness of options has been exploited for morphological purposes in Vietnamese. Thus, in some cases, such as chủ nghĩa ‘-ism’ (from Chinese zhŭyì ‘-ism’), the head position is used to distinguish between the nominal and the modifying functions of the abstract concept expressed by this word formative, as illustrated by the comparison of (39a) and (39b). (39) Vietnamese a. chủ nghĩa xã hội b. xã hội chủ nghĩa ism society society ism ‘socialism’ ‘socialistic’ We have analysed, therefore, two mirror-image cases, in which, on the one hand, the possible grammaticalization of a functionally more consistent morpho-

124 Livio Gaeta

logical system does not develop further because of alternatives borrowed from other languages, and, on the other, a morphological principle develops as a consequence of external influence. In both cases, the distorting effect induced by grammaticalization results into a mismatch between theoretically conceivable forms and concrete outcomes.

4. Conclusion This paper has examined the distorting impact of grammaticalization on grammar by focussing on synchronic mismatches resulting from the expansion of a grammaticalization process. These mismatches are caused by an unchanged residue of grammaticalization, in turn the result of inertial resistance, a superordinate restriction which prevents grammaticalization from extending to a particular domain, or language contact. However, the typology of mismatches reviewed in this paper is surely inadequate and needs to be enlarged. We have focussed mainly on the role played by forces which restrain the grammaticalization process, but mismatches may also arise from a grammaticalization process which creates new non-canonical constructions, as has been suggested by Creissels (2006) for Basque and Russian. Nonetheless, my hope would be that the issues raised in this study might serve as a stimulus for further independent research undertaken from a purely synchronic perspective. A first question which needs to be addressed concerns the way mismatches are constrained and the theoretical mechanisms which can best account for these constraints. A second, far bolder question asks whether a grammaticalization perspective can (always?) provide answers for mismatches. I hope that this paper has shown how fruitful grammaticalization can be in providing interesting insights regarding the first question. As for the second, I am not sure that there is a positive answer. For instance, grammaticalization may help explain content mismatches involving verbs like seem in sentences such as John seems to like it (cf. Francis and Michaelis 2003: 7), where diachronic development can be traced quite adequately along the path of auxiliarization. More difficult is the case of the prepositions under and at, which differ with respect to their employment as sentence subjects, as the contrast between Under the sofa is dirty vs. ??At the park is dirty illustrates (cf. Francis and Michaelis 2003: 13). The latter case seems to be more closely related to other principles not clearly amenable to grammaticalization. However, even if the second question should be answered in the negative, I am fully convinced that grammaticalization will shed light on many of the mismatch effects observed in synchronic grammars.

Mismatch: Grammar distortion and grammaticalization 125

Abbreviations 1 2 3 abl abs acc adj aux clf dat def erg f gen inf intr

first person second person third person ablative absolutive accusative adjective auxiliary classifier dative definite ergative feminine genitive infinitive intransitive

ipfv m n nom obj pfv pl poss prs pst ptcp sbj sbjv sg tr v

imperfective masculine neuter nominative object perfective plural possessive present past participle subject subjunctive singular transitive verb

References Alber, Birgit & Franz Lanthaler. 2005. Der Silbenonset in den Tiroler Dialekten. In Claudio Di Meola, Antonie Hornung & Lorenza Rega (eds.), Perspektiven Eins. Akten der ersten Tagung deutscher Sprachwissenschaft in Italien, 75–88. Roma: Istituto Italiano di Studi Germanici. Andersen, Henning. 2001. Actualization and the (uni)directionality of change. In Henning Andersen (ed.), Actualization: Linguistic change in progress, 225–248. Amsterdam: John Benjamins. Bisang, Walter. 2001. Areality, grammaticalization and language typology. On the explanatory power of functional criteria and the status of Universal Grammar. In Walter Bisang (ed.), Language typology and universals, 175–223. Berlin: Akademie-Verlag. Braune, Wilhelm. 1981. Gotische Grammatik. 19th edn. by Ernst A. Ebbinghaus. Tübingen: Niemeyer. Braune, Wilhelm. 1987. Althochdeutsche Grammatik. 14th edn. by and Hans Eggers. Tübingen: Niemeyer. Corbett, Greville G. 1987. The morphology/syntax interface: Evidence from possessive adjectives in Slavonic. Language 63(2): 299–345. Creissels, Denis. 2006. Direct and indirect explanations of typological regularities: The case of alignment variations. Paper presented at the 39th Annual Congress of the Societas Linguistica Europaea: Relativism and Universalism. Bremen, August-September 2006. Ebert, Robert, Oskar Reichmann, Hans-Joachim Solms & Klaus-Peter Wegera. 1993. Frühneuhochdeutsche Grammatik. Tübingen: Niemeyer. Eisenberg, Peter. 1999. Grundriss der deutschen Grammatik. Band 2: Der Satz. Stuttgart: Metzler. Eisenberg, Peter, George Smith & Oliver Teuber. 2001. Ersatzinfinitiv und Oberfeld. Ein großes Rätsel der deutschen Syntax. Deutsche Sprache 29(3): 242–260.

126 Livio Gaeta

Francis, Elaine J. & Laura A. Michaelis. 2003. Mismatch: A crucible for linguistic theory. In Elaine J. Francis & Laura A. Michaelis (eds.), Mismatch: Form-function incongruity and the architecture of grammar, 1–27. Stanford CA: CSLI. Gaeta, Livio. 2002. Umlaut extension in German modals as natural change. Diachronica 19(1): 1–41. Gaeta, Livio. 2004. Exploring grammaticalization from below. In Walter Bisang, Nikolaus P. Himmelmann & Björn Wiemer (eds.), What makes grammaticalization? A look from its fringes and its components, 45–75. Berlin: Mouton de Gruyter. Gaeta, Livio. 2005a. Ersatzinfinitiv im Deutschen: Diachrone Überlegungen zu einem synchronen Rätsel. In Claudio Di Meola, Antonie Hornung & Lorenza Rega (eds.), Perspektiven Eins. Akten der ersten Tagung deutscher Sprachwissenschaft in Italien, 149–165. Roma: Istituto Italiano di Studi Germanici. Gaeta, Livio. 2005b. Hilfsverben und Grammatikalisierung: Die fatale Attraktion von geben. In Torsten Leuschner, Tanja Mortelmans & Sarah de Groodt (eds.), Grammatikalisierung im Deutschen, 193–209. Berlin: Mouton de Gruyter. Gaeta, Livio. 2006. The myth of verb cluster languages. Paper presented at the 39th Annual Congress of the Societas Linguistica Europaea: Relativism and Universalism. Bremen, AugustSeptember 2006. Gaeta, Livio. Forthcoming. Does grammaticalization need prototypes? Language Sciences. Grønvik, Ottar. 1986. Über den Ursprung und die Entwicklung der aktiven Perfekt- und Plusquam perfektkonstruktionen des Hochdeutschen. Oslo: Solum Forlag. Haiman, John. 1998. Possible origins of infixation in Khmer. Studies in Language 22(3): 597–617. Härd, John E. 1981. Studien zur Struktur mehrgliedriger deutscher Nebensatzprädikate. Diachronie und Synchronie. Göteborg: Acta Universitatis Gothoburgensis. Harris, Alice C. 2003. Cross-linguistic perspectives on syntactic change. In Brian D. Joseph & Richard D. Janda (eds.), The handbook of historical linguistics, 529–551. Oxford: Blackwell. Haspelmath, Martin. 1993. The diachronic externalization of inflection. Linguistics 31(2): 279–309. Haspelmath, Martin. 1998a. Does grammaticalization need reanalysis? Studies in Language 22(2): 49–85. Haspelmath, Martin. 1998b. The semantic development of old presents: New futures and subjunctives without grammaticalization. Diachronica 15(1): 29–62. Himmelmann, Nikolaus P. 2004. Lexicalization and grammaticization. In Walter Bisang, Nikolaus P. Himmelmann & Björn Wiemer (eds.), What makes grammaticalization? A look from its fringes and its components, 21–42. Berlin: Mouton de Gruyter. Hopper, Paul J. & Elizabeth Closs Traugott. 2003. Grammaticalization. 2nd edn. Cambridge: CUP. Klein, Wolfgang. 2000. An analysis of the German Perfekt. Language 76: 358–382. König, Ekkehard & Johan van der Auwera (eds.). 1994. The Germanic languages. London: Routledge. Langacker, Ronald W. 1991. Foundations of cognitive grammar. Vol. 2: Descriptive application. Stanford CA: Stanford University Press. Lehmann, Christian. 1989. Markedness and grammaticalization. In Olga M. Tomić (ed.), Markedness in synchrony and diachrony, 175–190. Berlin: Mouton de Gruyter. Leiss, Elisabeth. 1992. Die Verbalkategorie des Deutschen: Ein Beitrag zur Theorie der sprachlichen Kategorisierung. Berlin: Mouton de Gruyter.

Mismatch: Grammar distortion and grammaticalization 127

Mortelmans, Tanja. 2004. Grammatikalisierung und Subjektivierung: Traugott und Langacker revisited. Zeitschrift für germanistische Linguistik 32: 188–209. Paul, Hermann. 1920. Deutsche Grammatik. Band IV. Halle: Niemeyer. Paul, Hermann. 1989. Mittelhochdeutsche Grammatik. 23th edn. by Peter Wiehl and Siegfried Grosse. Tübingen: Niemeyer. Plank, Frans. 1984. The modals story retold. Studies in Language 8(3): 305–364. Ramat, Paolo. 1986. Introduzione alla linguistica germanica. Bologna: il Mulino. Schirmunski, Victor M. 1962. Deutsche Mundartkunde. Berlin: Akademie-Verlag. Schmid, Tanja. 2005. Infinitival syntax: Infinitivus Pro Participio as a repair strategy. Amsterdam: John Benjamins. Wurzel, Wolfgang U. 2000. Verläuft Sprachwandel gezielt? In Dieter Herberg & Elke Tellenbach (eds.), Sprachhistorie(n). Hartmut Schimdt zum 65. Geburtstag, 43–56. Mannheim: Institut für deutsche Sprache.

chapter 5

Areal convergence in grammaticalization processes* Anna Giacalone Ramat University of Pavia This paper discusses a number of developments within Standard Average European, which prove that the universal properties of grammaticalization processes can be reconciled with the specificity of situations of language contact. The four changes examined in the article are the development of periphrastic ‘have’-perfects, the extension of the definite article area, the renewal of the relative pronoun paradigm and the development of reflexive markers towards passive and impersonal constructions. The selected case studies illustrate how language contact may play a role in the initial stage of a given grammaticalization process or may reinforce and accelerate grammaticalization. The paper also shows that canonical grammaticalization and contact-induced grammaticalization develop along similar lines and that contactinduced grammaticalization should be distinguished from cases of mere calquing or transfer of grammatical forms.

1. Preliminaries Strong criticism has been raised recently against some of the basic assumptions of grammaticalization studies (cf. Campbell 2001a, 2001b). Attempts have been made at “deconstructing” grammaticalization (cf. Newmeyer 1998, 2001) and especially at questioning one of its basic tenets, the principle of unidirectionality (on unidirectionality, see Giacalone Ramat 2002; Haspelmath 2004). This flow of criticism, however, has not hindered the growth of research into grammaticalization. In the last ten years, new research projects and new topics have been pursued which share a common approach, that of looking at grammaticalization from a broader perspective than the canonical view of change as a chain or path from lexical to grammatical items. The focus of recent research has shifted from lexemes to constructions, that is, strings of words which grammaticalize in certain * I would like to express my gratitude to one anonymous reviewer for insightful comments, as well as to the editors of the volume and organizers of New Reflections on Grammaticalization 3 (Santiago de Compostela, July 2005), the third conference of a series which testifies to the increasing attention that grammaticalization is currently receiving.

130 Anna Giacalone Ramat

morphosyntactic contexts under highly specific pragmatic conditions (cf. Traugott 2003). Growing attention has also been paid to the issue of the limits of grammaticalization and the problem of how to distinguish between grammaticalization and lexicalization (cf. Giacalone Ramat and Hopper 1998). The relevance of language contact to grammaticalization has been another recently debated issue, and indeed this will be the topic of the present paper. In many influential grammaticalization studies, the focus has been on internally motivated changes (cf. Hopper and Traugott 1993; Lehmann 1995) and on the universal character of generalizations based on grammaticalization processes. On the other hand, in research areas such as the study of pidgins and creoles, second language acquisition and bilingualism, it is widely recognized that linguistic meaning and linguistic structures may be transferred from one language to another. Until recently, the debate on the theoretical status and mechanisms of grammaticalization has generally ignored the possible consequences of contact in the development of grammaticalization (cf. Campbell 2001b). This is not the case, however, with Bernini and Ramat (1996), who argue that contact may have influenced grammaticalization patterns in the case of discontinuous negation in the languages of the Mediterranean (see also Ramat and Stolz 2002). Discontinuous negation is rare in the languages of the world; hence, its presence in geographically contiguous languages such as Romance, Germanic, Celtic and Arabic could be the result of contact and areal diffusion. Bernini and Ramat (1996) support the idea that, for example, Welsh and Breton were influenced by contact with English and French respectively. In the case of Arabic, by contrast, there is no historical evidence of long-lasting linguistic contacts between Arabic varieties and French, so that Bernini and Ramat conclude in favour of an independent development, resulting from typological properties, such as the fact that discontinuous negation is often found in SVO languages (both French and Arabic dialects are SVO). Further, it must be noted that (Sicilian) Italian and (Andalusian) Spanish, which did have long-lasting contacts with Arabic, do not show discontinuous negation. This illustrates the fairly complex relationship between contact and internal linguistic developments, and suggests that similar features in neighbouring languages do not necessarily stem from a common source. In what follows I will be concerned with the issue of grammaticalization processes in areal contexts. I will discuss how the universal properties of grammaticalization processes can be reconciled with the necessary specificity of historical contact. Section 2 briefly discusses some notions related to linguistic areas and language contact. Section 3 presents an analysis of ‘have’-perfects, while Section 4 deals with the evolution of definite articles. Section 5 focusses on the replication of grammatical structures in the development of Romance relative clauses and Section 6 goes on to discuss a number of developments concerning reflexive pro-

Areal convergence in grammaticalization processes 131

nouns and evaluates the areal significance of these changes. Finally, Section 7 provides the conclusions of this study.

2. The areality issue or areal grammaticalization Following Dahl (2001: 1456), I will adopt the term areal typology (rather than areal linguistics) to refer to “the study of patterns in the areal distribution of typologically relevant features of languages.” This notion is both descriptive and explanatory in that it involves both synchronic patterns and the processes which brought them about. In other words, it has both synchronic and diachronic significance. As Dahl (2000: 317) points out, “grammaticalization processes tend to cluster not only genetically, but also areally.” The contribution of grammaticalization to geographically defined linguistic areas is discussed in works such as Bisang (1998), Bisang et al. (2004) and Dahl and Koptjevskaja-Tamm (2001).1 A characterization of linguistic area will not be pursued here since, as noted by Campbell (2006), in spite of prolonged efforts no general agreement on its definition has been reached. I will, rather, focus on specific instances of linguistic changes in contact situations and evaluate their areal significance. A necessary precondition for linguistic areas to exist is intensive language contact extending over a long period of time. In such cases, bilingual speakers tend to develop mechanisms of equivalence between the languages they use and to adopt features from their non-native language, as evidenced by an extensive body of research on code-switching and second language acquisition, in particular those studies concerning the imperfect learning of the receiving language (cf. Thomason 2001). The degree of influence of one language upon another in contact situations may vary considerably because of the asymmetrical socio-political relations between the native speakers of the languages involved (cf. Myers-Scotton 2002: 101). Whenever there is contact, it does not follow automatically that a linguistic area will arise: languages may remain in contact for centuries and exchange lexical items as well as structural features, but may not develop truly areal features, that is, features due to areal diffusion. In order for an areal type to exist, the languages involved must share a significant number of non-correlating features (Comrie 2001: 37), that is, features which are independent of each other and rare among lan1. I do not use here the term Sprachbund, commonly used in language contact studies but which has given rise to many controversies (see the discussion in Koptjevskaja-Tamm and Wälchli 2001), especially concerning the distinction between areal features and Sprachbund phenomena. Indeed, the empirical cases discussed in the literature are often the same (Heine and Kuteva 2005).

132 Anna Giacalone Ramat

guages.2 The notion of areal type clearly differs from the notion of type in general linguistic typology, where features are assessed independently of the historical relations which may have existed among them. In studies which view grammaticalization as a factor relevant to the understanding of areal relationships (Dahl 2000: 317; Heine and Kuteva 2003, 2005; Bisang et al. 2004), it is generally assumed that grammaticalization processes can be triggered by the transfer of grammatical features from languages which have the relevant features to languages which lack them. Transfer mechanisms, however, warrant further scrutiny, in terms of the factors which may constrain or enhance the transfer of form-meaning units. Stolz and Stolz (2001: 1549) have used the term grammaticalization area for those languages in historical contact in which certain grammaticalization processes occur with particular frequency, thus providing a cue for the understanding of a given areal patterning. One case in point is the Mesoamerican area, where body-part nouns frequently come to express grammatical or spatial relations, developing into what roughly corresponds to adpositions of the European languages. This change per se is observable in many languages of the world, but is preferentially activated in the Mesoamerican area because of contact. Thus, the frequency of a given feature in genetically non-related or far-related languages may provide a viable criterion for the identification of a grammaticalization area. Under appropriate historical circumstances, frequency may lead to areal convergence. In this connection, Stolz (2003–2004) claims that, if neighbouring languages happen to display a statistically significant number of occurrences of a feature which behaves statistically differently outside the immediate geographical area in question, then this is not by mere chance but because of areal convergence. On the other hand, the universal nature of typical grammaticalization processes would suggest that changes occurring very frequently crosslinguistically are not areally significant. For instance, the relative clause formation strategy which employs an invariable marker, like English that or Spanish que, occurs in many European languages, but is also very common in other world languages, and hence is better attributed to universal typological tendencies rather than to contact (cf. Comrie 2002; Cristofaro and Giacalone Ramat 2002). 2. In certain cases a single feature may be sufficient to define an area. This is the case with the Semitic languages of Ethiopia, which borrowed the SOV order from the neighbouring Cushitic languages, and only subsequently developed a number of patterns consistent with SOV, such as relative-noun, adjective-noun, etc. (cf. Thomason 2001: 1642). The result was a typological change, in that the syntactic typology of the Semitic languages of that region was altered by the situation of contact. Some discussion on the requirement that the features of a Sprachbund should be unique, i.e. attested only in the member languages, can be found in Stolz (2003–2004: 3).

Areal convergence in grammaticalization processes 133

So, the question which arises is: what is then the role of contact? How can we distinguish between contact-induced grammaticalization and similar typological developments independent of contact? In what follows, I will propose that, in some cases at least, contact may trigger a grammaticalization process, that is, may be the source of a specific grammatical form, while the paths or mechanisms of change will follow universal principles of grammaticalization, that is, the same ones formerly followed by the source language. In some of the cases discussed in this paper I believe that there are good reasons to argue that two constructions in different languages are related to and dependent on each other via a contact-induced grammaticalization process. Other cases, by contrast, are problematic and do not provide conclusive evidence. In what follows, I will address the questions outlined above by drawing on cases from a specific grammaticalization area, namely the European area, in which both external features (that is, century-long mutual cultural relations) and structural features support the idea of a linguistic area. In recent typological research, the European area has become an established concept (called Standard Average European). A nucleus of shared features has been recognized in neighbouring languages, including French, German, Dutch and Northern Italian. Surrounding the nucleus is a less central area or core comprising English, Scandinavian languages, Czech, etc., and a peripheral area with fewer shared features, which includes Russian, Hungarian and Celtic languages.3

3. ‘Have’-perfects: Grammaticalization in steps A rather uncontroversial European areal feature is the ‘have’-perfect, that is, the transitive perfect formed by ‘have’ plus the passive participle of a lexical verb. The construction is widespread in all Romance and Germanic languages as well as in some other languages, but did not exist in the oldest Indo-European languages. This means that it is not a case of genetic inheritance, but of innovation. Figure 1 is based on Haspelmath (2001), with some modifications in order to include languages like Polish or Basque, which are developing ‘have’-perfects probably through contact. Predicative possession expressed by a transitive verb with the prototypical meaning ‘hold in one’s hand, possess,’ as in English have and German haben (from Indo-European *kap-) and Latin habeo (from Indo-European *ghabh-; Koch 1999: 284), is rare outside Europe (Benveniste 1960: 121), and in the European area it is 3. See Haspelmath (2001) for an overview and discussion of the distinctive features of the European linguistic area.

134 Anna Giacalone Ramat

Saa Ice

Swe

Fin

Nor

Mnx Iri Wel Eng Bre

Dan Dut Fre

Lit Ger SGer

NIt Gal Por

Bas Cat Spa

Sar

Rus

Est Ltv Blr

Pol Cze

Slk Hun Slo S.–Cr. Ita

Alb Gre

Ukr Rom Bul Tur

Mal Areal diffusion of ‘have’-perfects ‘Have’-perfects as past tenses ‘Have’-perfects as emergent structures Figure 1. Distribution of ‘have’-perfects in Europe (based on Halpelmath 2001)

present in the core, not in the periphery. Consequently, the development of a periphrastic perfect construction involving the verb ‘have’ is a typical Standard Average European feature (cf. Dahl 1990: 7), as confirmed by WALS (2005), which lists only seven perfects based on possessive transitive constructions, all of them in Europe. However, ‘have’-perfects are not found in Slavic languages (with the exception of some South-Slavic varieties and of Polish, as shown in Figure 1 above), where the transitive perfect is a participial construction combined with the copula. Furthermore, ‘have’-perfects are not found in Hungarian, Finnish or the Baltic languages. In the West, they are absent from Celtic languages (with the only exception of Breton), which use a prepositional construction.4 The question therefore arises as to whether it is possible to identify a language which promoted this innovation.

4. Interesting though it is, I will not discuss the complex alternation between ‘be’-perfects and ‘have’-perfects with intransitive verbs in Romance and Germanic languages (and especially in Italian dialects) since this issue is not immediately relevant to the present work.

Areal convergence in grammaticalization processes 135

3.1 Origin and diffusion of the ‘have’-perfects The history of the ‘have’-perfect in Latin has been traced in several works (cf., among others, Pinkster 1987; Ramat 1987). The reanalysis of habere as an auxiliary in association with a past passive participle is frequently quoted as a standard example of grammaticalization (cf. Hopper and Traugott 2003: 62ff). It is wellknown that the periphrastic perfect originated from a possessive construction consisting of transitive habere followed by a direct object and an agreeing past passive participle which formed a constituent of its own, as is illustrated in the following example:

(1) Latin multa bona bene parta habemus. many.pl.n good.pl.n well procure.ptcp.pl.n have.prs.1pl ‘We possess many goods which have been well procured.’ (Pl. Trin. 347)

In this much-quoted excerpt from Plautus, the subject of habemus does not necessarily coincide with the (understood) agent of the participle. It is a resultative construction (Nedjalkov 1988) with present time reference which constitutes the starting point for the development of habeo into an auxiliary (cf. Pinkster 1987). The construction is found with transitive verbs, more frequently with perfective than with imperfective meaning,5 although different contexts can also be observed, such as cognition verbs and verbs of saying, which involve identity of reference 5. Examples of such resultative constructions are not so scarce, although I cannot offer any quantitative data. Other frequently quoted cases are: (i) Latin (Flamines) caput cinctum habebant filo. head.sg.n encircle.ptcp.sg.n have.impf.3pl fillet.abl.sg ‘The flamines had their hair girt with a woolen fillet.’ (Varro L. L.5.84; from Pinkster 1987: 197) (ii)

Latin in ea provincia pecunias magnas conlocatas habent. in that province money.acc.pl.f great.acc.pl.f invest.ptcp.acc.pl.f have.prs.3pl ‘They have great capital invested in that province.’ (Cicero, de imperio Cn. Pompei, 7,18)

A semantic equivalent would be Italian Tengo gli occhi aperti ‘I have (lit. keep) my eyes wideopen,’ in which occhi and aperti belong to different constituents, as is shown by the fact that they can be pronominalized independently: Come tieni gli occhi? (lit. ‘How do you keep your eyes?’) Li tengo aperti (lit. ‘I keep them open’). The same distinction holds for English I have the letter written vs. I have written the letter. In Present-day English the semantic difference is reflected in the word order, but this is not the case in Old English, in which word order was relatively free (cf. Carey 1994: 104).

136 Anna Giacalone Ramat

etween the subject of habere and the agent of the past participle. Such cases show b a closer relation between the participle and the verb habere and suggest that the reanalysis of the bi-clausal pattern to a mono-clausal structure might have taken place, thus anticipating the functional shift of habere to an auxiliary (Pinkster 1987: 204–205).

(2) Latin de Caesare satis hoc tempore dictum habeo. about Caesar enough this time say.ptcp.acc.sg have.prs.1sg ‘I shall regard what I have said of Gaius Caesar as sufficient at present.’ (Cic. Phil. 5,52)

Many centuries later we find the following examples:

(3) Latin haec omnia probatum habemus. this all.acc.pl prove.ptcp.sg have.prs.1pl ‘We have proved all these things.’ (4th c., Oribasius, Syn. 7.48)

(4) Latin Episcopum […] invitatum habes. invite.ptcp.acc have.prs.2sg bishop.acc ‘You have invited the bishop.’ (6th c., Gregory of Tours, Vit.patr. 3,1)

In example (3) the lack of agreement between the participle and the direct object may be interpreted as a syntactic cue that the construction has been reanalysed as a single verbal complex; habeo has taken on an auxiliary function and the participle has become the main verb (Ramat 1987: 144). Germanic languages, with the exception of Gothic, also exhibit the periphrastic perfect with ‘have.’ They originally had a two-form temporal system contrasting present with preterite, both of which were inflectional, as in Gothic. As far back as the oldest documents of the West Germanic and Nordic languages, a periphrastic perfect construction can be found which strongly resembles the Latin one. Therefore, the question has often been asked as to whether Latin and Germanic followed the same path of development independently or whether a model for shaping equivalence relations was provided by one of the two languages. The Latin examples of the construction are older than any conceivable Germanic influence on Latin, while the opposite direction of influence is certainly easier to envisage. The cultural influence of Latin has a long history, since contacts between the Germanic and the Latin peoples had existed at least since Julius Caesar’s time. One might indeed assume that ‘have’-perfects originated in Latin and spread across Europe during the transition period between Antiquity and the Middle

Areal convergence in grammaticalization processes 137

Ages, when many features of the Standard Average European arose (Haspelmath 2001). The reason why Medieval Latin texts contain only sporadic examples of periphrastic perfects is that the construction probably spread in the spoken language, as was the case with other important changes which emerged in the Romance languages, such as the future. In Late spoken Latin a grammaticalization process took place, whereby the old resultative construction attested since Plautus (cf. example (1) above) developed into a new grammatical form. Meillet (1930: 130) suggested that the German(ic) periphrasis was calqued on Late Latin, a view which has been accepted by many scholars (Haase 1994: 289, among others) and can be considered the common opinion (Abraham 2004: 248). However, the alternative view, of an independent development, has also been adopted by some scholars, following Benveniste’s (1960: 132–134) observation that the conditions for the creation of a transitive perfect existed autonomously in the Germanic languages. This, however, is not sufficient grounds for excluding the possibility that Latin provided the model for the bi-clausal possessive construction which German replicated with its own lexical forms and developed following the same direction of change as Latin. In German, the periphrastic perfect with haben has been documented since the early Old German period (ninth century; Behaghel 1923–1932: II, 271), at first only with transitive verbs whose objects agreed with the participial complement, much as in Latin. In word-by-word translations of the Gospels, one can find resultative constructions in which the participle agrees with the direct object, as in (5).

(5) a. Latin arborem fici habebat quidam tree.acc.sg.f fig.gen.sg have.impf.3sg somebody plantatam in vinea sua. (Luke 13)6 plant.ptcp.acc.sg.f adp vineyard.abl.sg.f poss.abl.sg.f b. Old German phigboum habeta sum giflanzo-t-an in fig.tree.acc.sg have.pret.3sg somebody plant-ptcp-acc.sg adp sinemo wingarten. poss.dat.sg vineyard.dat.sg ‘Somebody possessed a figtree [which was] planted [not necessarily by himself] in his vineyard.’ (Tatian 102.2)

6. Note that the Greek text also shows a periphrastic form, which the Latin translation follows word by word: συκῆν εἶχέ τιϚ πεφυτευµένην ἐῃ τῷ ἀµπελῶνι αὐτοῦ (Luke 13).

138 Anna Giacalone Ramat

In authors like Otfrid (ninth century) and Notker (end of the tenth century), another verb meaning ‘own, possess’ is found which could form a periphrastic construction with auxiliary-like function, namely eigan, as in éigut ir gihórit ‘you have heard’ (Otfrid, Evangelienbuch IV, 19.67). Eigan, originally a preterite-present, is recorded in all Germanic languages, but acquired an auxiliary function only in Old High German. The shift from the lexical meaning ‘possess’ to auxiliary status probably reproduced the development of haban and constitutes additional evidence for this grammaticalization path. In Old High German as well as in Old Saxon and Old English texts, the perfect construction shows much variation. In German it was originally used only with transitive verbs, the perfect of intransitives being mostly formed with wesan ‘be.’ By Notker’s time, possibly at different points in different dialects, the bi-clausal structure changed to mono-clausal and could then appear without an object in the main clause. The meaning of possession was replaced by that of perfect, as shown by the co-occurrence with reflexives; cf. (6), from Harris (2003: 543).

(6) Latin si habet sih erretet. she.nom.3 has self.acc saved ‘She has saved herself.’ (Notker, I,64,11; cited by Dieninghoff 1904: 49)

In Old English texts the participle could be inflected, agreeing with the accusative object, but in most cases it was left uninflected (cf. Mitchell 1985:I, 292). As for word order, all possible arrangements can be found (Mitchell 1985:I, 282–283). As Traugott (1992: 191) points out, the fact that in Old English the construction could be used from the earliest times with both intransitive verbs and transitive verbs whose objects are not accusative suggests that habban could be employed with an auxiliary function at this stage. In other words, the grammaticalization of the ‘have’-participle construction must have been fairly well advanced in Old English (cf. Wischer 2004). Still, examples like (7) below testify to the persistence of the resultative meaning of the construction and of the possessive meaning of h abban.

(7) Old English heo hæfdon utamærede þa bigengan. they had expelled.acc.pl the inhabitants.acc.pl ‘They had [kept] expelled the inhabitants.’ (Bede 1 12.54.7; cited by Denison 1993: 346)

The possibility that the participle could, albeit not consistently, agree with the accusative direct object should in any case suggest that a process of reanalysis similar to that of Latin had taken or was taking place (cf. Visser 1963–1973; Denison 1993). As noted by Wischer (2004), however, the development of the present perfect in

Areal convergence in grammaticalization processes 139

English from resultative to perfect meaning is not straightforward. Elsness (1997) has shown that in Old English the preterite is still the predominant verb form for past time reference (it is found in more than 80 per cent of the data in Elsness’s corpus). Furthermore, differently from Present-day English, habban was not the only perfect auxiliary, since beon/wesan ‘be’ was also used with motion and change of state verbs. Alongside formal changes (loss of agreement, word order changes), the periphrastic construction with habban underwent a gradual semantic shift involving a change in time reference from present time to past time (cf. Elsness 1997: 356). Subsequently, ‘have’-perfects rapidly increased in frequency from Old English through Middle English up to Early Modern English.7 A grammaticalization process influenced by the Latin model seems to offer the most likely account for the parallel developments which took place in the Germanic languages, including Scandinavian languages (cf. Haugen 1982). Evidence also exists of a more recent spread of ‘have’-perfects in contact zones around Germanic and Romance languages (Drinka 2003). For instance, Czech has developed a periphrastic perfect with ‘have’ and the passive participle, possibly under the influence of German (cf. Garvin 1949: 84). Colloquial Polish also uses a type of perfect which involves forms of mieć ‘have’ and a perfective passive participle (cf. Rothstein 1993). However, these forms do not replace the original Slavic construction with the auxiliary ‘be,’ but rather introduce a differentiation corresponding to the English construction I’ve got it done (Garvin 1949: 84). All these forms are innovations; they are attested in spoken varieties and are apparently not very old. As for the languages along the western border of the ‘have’-perfect area, Breton is the only Celtic language to have developed a Romance style perfect, while Basque has developed a perfect which uses ‘have’ with transitive verbs and ‘be’ with intransitive verbs (cf. Haase 1994: 290–291).8 Haase (1994: 289) suggests that possibly all the analytical constructions in the Basque verbal system came about in a (pre-Ro7. It might be noted in passing that Present-day American English tends to use the preterite in cases where British English would prefer the present perfect. Thus, the increase in the ratio of present perfects to preterites has not only been arrested, but actually reversed, in the case of American English (cf. Elsness 1997: 341). This contrasts with what is happening in French and German, where the use of periphrastic forms is growing. Elsness (1997: 359) advances a language specific explanation for English, namely the formal reduction of the auxiliary have in the spoken language and the similarity between the past participle and the preterite for the vast majority of verbs. 8. Basque is quite marginal in the European area. However, Heine and Kuteva (2003: 549ff) show that, especially in the variety spoken in France, a number of changes have occurred due to contact and these have brought Basque closer to Standard Average European. Among them are the rise of an indefinite article via replication of the Romance pattern of grammaticalization and the rise of the ‘have’-perfect.

140 Anna Giacalone Ramat

mance) language contact situation, most probably with spoken Latin (though the possibility of a language internal development cannot be excluded; cf. Trask 1998: 319). Be that as it may, the spread of ‘have’-perfects to the periphery raises the question as to whether a fully grammaticalized construction was borrowed from the source language or, rather, a grammaticalization process for resultative constructions along the lines described above took place in the receiving language. A periphrastic ‘have’-perfect area is also found in the Balkan languages, although there are considerable differences with Romance and Germanic both in terms of the distribution of this feature and in terms of its historical development. From a diachronic perspective, a process of auxiliation of ‘have’ can be traced back to Classical Greek, where ékhō could be associated with the active aorist participle and the perfect passive participle. The latter gave rise to the type gramméno ékhō ‘I have written,’ “a wholly unclassical construction which begins to appear […] in the Roman period” (Horrocks 1997: 77) and is still used in Modern Greek. The modern language, however, prefers a periphrastic perfect with ékhō plus the nonfinite form, derived from the aorist infinitive, as in ékhō diabásei ‘I have read.’ According to Moser (2003: 247), in Greek, the participial construction with ékhō, unlike its Latin counterpart and similar constructions in other languages, never expanded beyond a very limited group of transitive verbs, with a locative argument and an overt object, and was consequently never completely grammaticalized as a periphrasis or a perfect.

Indeed, although some calquing cannot be excluded, especially in the translations of Christian literature, there is no evidence of any direct influence of Greek upon the development of the Latin and Romance perfects, contra Bonfante (1960) and Drinka (2003). On the contrary, Latin is likely to have influenced Greek in adopting the perfect passive participle used in an active transitive sense for the perfect construction, as suggested by Horrocks (1997: 77–78), who concludes that “this construction is therefore a very strong candidate for classification as a ‘Latinism’ in the koine.” Periphrastic ‘have’-perfects are used in various Balkan languages with different distribution and functions. In Albanian kam ‘have’ from Indo-European *kap‘seize, grasp,’ is the perfect auxiliary both for transitive and intransitive verbs. In Macedonian, alongside with the Common Slavic ‘be’ plus l-participle forms (sumperfect), ‘have’-perfects (ima-perfects) have been created and are used, more in Western than in Eastern Macedonia, possibly as a result of contact with Greek or Albanian. They express resultant states and can take on a preterite meaning, which is also found in Albanian (cf. Friedman 2002). In sum, Macedonian and Albanian build up a geographical continuum characterized by periphrastic ‘have’-perfects, albeit exhibiting varying degrees of grammaticalization. This points to areal

Areal convergence in grammaticalization processes 141

diffusion rather than independent, language-internal developments (cf. Friedman 1994; Heine and Kuteva 2005: 226–227).

3.2 A further step: Perfects as pasts ‘Have’-perfects have undergone further semantic development. In a rather restricted area of Western Europe they have become past tenses and have (at least partially) replaced simple preterites. This change involves the passé composé in French, the passato prossimo in (Northern) Italian and the Perfekt in German (Oberdeutsch). In these languages, as already mentioned, the periphrastic forms have more or less completely replaced the corresponding simple past forms. In Northern Italian the simple past is mainly a written form occurring in historical narratives (cf. Squartini and Bertinetto 2000). In turn, the so-called Präteritumschwund in German is linked to the disappearance of the preterite in dialects (cf. Ebert et al. 1993: 389) and the same holds true for Italian.9 In replacing the preterite, French and Italian ‘have’-perfects have inherited the perfective aspect meaning, since there exists in Romance, unlike German, a separate imperfective past. By contrast, in Spanish, English and Scandinavian the ‘have’-perfect has maintained a clear presentanterior meaning and the simple past has been preserved. As Thierhoff points out (2000: 285), the present-day distribution of ‘have’-perfects in Europe shows that languages with stable ‘have’-perfects, such as Spanish, English and Scandinavian, are located on the fringes of the European continent, surrounding the area where ‘have’-perfects are used as past tenses. Standard Italian and Standard German represent intermediate stages. Crucially, such developments are more recent than the emergence of ‘have’perfects discussed above and are still ongoing (cf. Ebert et al. 1993: 389). Drinka (2003) suggests that the centre of diffusion of the use of perfects for past tenses was twelfth century Parisian French and points to documentary evidence showing the strong influence of French on Southern German cultural centres.10 As for 9. The preterite has also been lost in Afrikaans (cf. Abraham 2004). Nevertheless, this development is not a counterexample to the hypothesis of areal diffusion of ‘have’-perfects in Europe. In Afrikaans there is nowadays only one past form, the ‘have’-perfect, as the result of processes of simplification in the whole verbal system under peculiar circumstances of contact. This change is probably independent of the European ones. In Dutch the preterite is preserved, although the perfect may have both an anterior and a past tense use. 10. According to Drinka (2003), Parisian French had a strong influence on German through courts and chanceries in territories near the French–German border, such as Cologne and Strasbourg, but also in important cultural centres in Southern Germany, such as Augsburg, Nürnberg. In Augsburg, the demise of the preterite is well-documented in texts from the imperial chanceries.

142 Anna Giacalone Ramat

Northern Italian, there are no thorough studies on temporal and aspectual systems of older varieties. In Present-day Italian empirical data collected by Squartini and Bertinetto (2000: 422–426) show that the present perfect (passato prossimo) continues to gain ground at the expense of the preterite and is spreading towards Southern Italy and also into Sardinian. To sum up, on the basis of the data discussed so far it can be concluded that the development of periphrastic ‘have’-perfects in Europe is the result of a process of areal diffusion which took place in two stages or waves. Late Latin was possibly the centre of innovation and the vehicle language for the spread of ‘have’-perfects in Western Europe (it is no coincidence that in Eastern Europe, where Latin was used less as a language of culture and as a written language, be-perfects are dominant). The wealth of literary works translated from Latin since the earliest written stages of German and English clearly justifies such an influence. A further argument in favour of a Latin/Romance innovation is that the process of reanalysis of periphrastic constructions which took place in Late Latin is not likely to have been independently replicated in Germanic languages, given that it is typologically rare for a verb of possession to become an auxiliary. Historical evidence points rather to the assumption that the Late Latin/Romance resultative construction with ‘have’ was adopted by speakers of Germanic languages at a stage of incipient grammaticalization, when the auxiliation process was under way and multiple analyses were available (cf. Harris 2003: 545). Indigenous lexical material, like Old High German habban or eigan, was employed. As was shown above, the oldest German and English texts belong to a period in which the grammaticalization process was not complete: ‘have’ plus participle was not yet (or, at least, not always) a fully-fledged perfect, that is, a periphrastic verbal group with an auxiliary carrying tense and person specifications. As Haase (1994: 291) has suggested for Basque, a verb with a concrete semantic meaning like ‘have’ could be translated without difficulty and might have been the triggering force for a structural transposition felt to be a useful tool. Indeed, the expansion of the periphrasis to neighbouring languages, such as Breton, Basque, Polish and Czech, seems to confirm that it provided an attractive model for speakers. ‘Have’-perfects have been a strong areal feature since their diffusion in Late Latin/Early Romance, in terms of both form and function. Functionally, ‘have’-perfects have restored the tripartite Indo-European verbal system, which had been reduced to the bipartite imperfectum/perfectum system in Latin because of the loss of the aorist (cf. Benveniste 1968). The perfectum in Latin had become a past tense denoting an action or event which happened in the past and did not make any reference to the current state of affairs resulting from that action or event. The propensity to have at one’s disposal a specific category marking anteriority, that is, linking the past with the present, was the motivation for change behind the rise of the peri-

Areal convergence in grammaticalization processes 143

phrastic perfect, for which the resultative construction was an obvious candidate.11 The same kind of motivation also applies to Germanic languages, which had a bipartite tense system contrasting present and preterite, as in Old High German, Old English, Old Norse and Gothic. Both the Latin and Germanic developments clearly represent instances of the mechanisms which are said to play a role in grammaticalization, such as semantic bleaching, loss of categorial properties and reanalysis. As for the further development of the periphrastic perfects into past tenses, it should be noted that this process is well attested crosslinguistically (cf. Bybee et al. 1994). Nonetheless, the areal diffusion hypothesis should not be discarded, since, as Drinka (2003: 27) suggests, spread from neighbouring languages in a localized area of Western Europe is more plausible than the hypothesis of unrelated instantiations of the universal tendency of perfects to become pasts.12 In conclusion, we have dealt with a two-step grammaticalization change. The first of these, the rise and spread of periphrastic ‘have’-perfects, seems to be a specific, typologically marked European development, while the second change, the use of periphrastic perfects as past tenses, follows universal patterns of grammaticalization. However, in support of an areal interpretation of the observed distribution, one might note that the area where the preterite is lost and the perfect is used as a past is limited and made up of contiguous sub-areas which have had intense cross-cultural contacts for centuries. This makes the hypothesis of independent changes improbable.13 11. The development of the present perfect in Romance has also been viewed as part of a more general tendency characterizing Romance languages whereby synthetic forms tend to be replaced by analytic forms (cf. Schwegler 1990; Hinrichs 2004, among others). However, the notion of cyclicity and the question of a drift discussed by Schwegler (1990) reflect theoretical frameworks and terminologies different from the ones adopted here, and are not immediately relevant to the issues addressed in this paper. 12. An alternative view with respect to the areal diffusion hypothesis is defended by Abraham (2004), who argues that there is no reason to assume that similarities among the European languages are rooted in contact-induced convergences. In his view, the rise of the periphrastic perfect would be due to the autonomous spread of a form — the periphrasis — which “facilitates oral and aural coding and is — in its emergence — restricted to this specific type of coding” (2004: 263). On the basis of my findings, Abraham’s view can be reconciled with the idea of areal convergence, since the importance of the spoken language has been recognized as a relevant factor for language change in general and for the spread of the ‘have’-perfects in Romance and Germanic in particular. 13. It may be noted that around the core area in which ‘have’-perfects have become past tenses and preterites have been lost there are languages in which the perfect has partially evolved into a past tense, but the preterite is preserved. Besides Standard Italian and Standard German, this is the case of Dutch (see footnote 9) and of northern dialects of Albanian, in which the ‘have’perfect may be used as a past tense along with the aorist (cf. Bucholz and Fiedler 1987: 130). In the latter case influence from Italian cannot be ruled out.

144 Anna Giacalone Ramat

4. Some reflections on articles in Slavic languages As mentioned above, hypotheses of contact-induced grammaticalization are often confronted with the alternative possibility of universal typological processes. In order to reach a better understanding of the process leading to the gradual acquisition of new grammatical structures, I will discuss some data collected from Slavic languages. The development of articles from demonstratives is, crosslinguistically, quite a common grammaticalization process which can take place in different languages without any visible influence from neighbouring languages (cf. Greenberg 1978; Heine and Kuteva 2003). Thus, the areal significance of such a feature is in principle weak. Still, it is reasonable to claim that intense areal contact is a relevant factor in creating the conditions for a likely typological change to occur. After all, one might wonder why a process such as the grammaticalization of demonstratives did not occur everywhere in Europe, in Russian and Polish, for example. As suggested by Dahl (2001: 1469), “while the chance that a certain morpheme or construction in a language will undergo a particular kind of grammaticalization is on the whole rather small, the probability increases dramatically if a neighbouring language undergoes the process in question.” Definite and indefinite articles are a peculiar feature of the core European languages (Haspelmath 2001). In large parts of Europe outside the Standard Average European there are no articles. This is the case of East Slavic, Finnish (but see below), Nakh-Daghestanian and Kartvelian languages, while Celtic, Basque and Semitic languages only have definite articles. The geographical distribution of articles on a worldwide basis (WALS 2005) confirms the Western European peculiarity: while languages lacking articles are common (out of 566 languages, only 188 have no definite or indefinite articles and 41 have no definite, but only indefinite articles), one area in which lack of definite article is infrequent is Western Europe (cf. Dryer 2005). Bearing this in mind, let us now consider the case of some Slavic languages where the development of demonstratives into definite articles seems to be under way. In particular, I will focus on spoken varieties of Czech, (Upper) Sorbian and Slovenian, drawing on data from a recent study of the uses of demonstratives in the spoken varieties of these languages, namely Trovesi 2004.14 Trovesi’s results show that demonstratives (ten, tón, ta) are undergoing a grammaticalization process and are becoming definite articles. This innovative process is more wide14. I will leave Bulgarian aside, where the development of the definite article from the demonstrative stem *to- (Scatton 1993) has to be framed in the context of changes which define the Balkan area. As in Macedonian, Albanian and Rumanian, the article is also postponed in Bulgarian.

Areal convergence in grammaticalization processes 145

spread in spoken than in written varieties. Since German played a significant political and cultural role in all the areas where Czech, Sorbian and Slovenian are spoken, it would seem plausible to suggest, as does Trovesi, that these developments are taking place under the influence of German.15 The German definite article der derives from the same demonstrative stem Indo-European *to- as the Slavic forms and it is used both as an article (if unstressed) and as a demonstrative (if stressed).16

(8) German Mit dem Kerl will ich nichts mehr zu tun haben. with that guy want I not more to do have.inf ‘With that guy I do not want to have anything to do anymore.’ (DUDEN 2005: 289)

If we assume that German has played some role in the Slavic developments, we should look for a pattern of evolution in order to understand how the German article influenced the neighbouring Slavic languages. I would argue that the model offered by German as a source language was transparent to speakers of the receiving languages because of the ambiguity of the demonstrative/article interpretation, and that this was the triggering factor in the incipient grammaticalization. As for Sorbian, Heine and Kuteva (2003: 537) claim that “what Sorbians did was to replicate a category by drawing on a universal strategy, the same strategy that speakers of German had used earlier.” However, the idea of replicating a category, or of copying it, seems inadequate in this case, because Sorbian speakers could not conceptualize the advantage or the usefulness of elaborating this category. If Sorbian speakers had replicated the category as a whole, we would expect them to have copied and used it as a fully-fledged construction, which is not the case. At present, it cannot be claimed that Sorbian has developed a new word class; at least it has not done so yet, since the grammaticalization process is still at an incipient stage. The category of definiteness is gradually arising in the minds of bilingual speakers whose use of German fosters the gradual extension of the Sorbian demonstrative. Typological research has pointed out a universal path along which demonstratives become articles. There are crucial disambiguating contexts where articles may be used but demonstratives cannot, such as unique referents, as in Italian oggi splende il sole or German heute scheint die Sonne ‘the sun is shining today,’ 15. The influence of German may be traced back to seventeenth century Upper Sorbian translations of religious texts in which the German definite article is mostly rendered with the corresponding form of ten (cf. Lötzsch 1996: 52). 16. In German der is not associated with any proximal/distal distinctions, i.e. it is deixis-neutral, while dieser and jener have a deictic function (Trovesi 2004: 170; DUDEN 2005).

146 Anna Giacalone Ramat

and “associative-anaphoric uses” (Himmelmann 2001: 833).17 Trovesi has compared such parameters against the Slavic data and, regarding the potential article in Sorbian, Czech and Slovenian, concludes that demonstratives have not yet been grammaticalized into articles; rather, they are frequently used with a weakened deictic force, that is, they may point to an item which is textually known (anaphoric use) or known from shared knowledge. In some cases they may be compatible with a generic use and may also occur with proper nouns, as in ten Karel (‘the Karel’). However, the variability of use is very high and depends on speakers and elicitation contexts: nowhere has the use of the demonstrative become either regular or obligatory (Trovesi 2004: 167). These findings confirm the conclusion that grammaticalization is a process by which the function of certain elements, the demonstrative in our case, is reinterpreted. It originates mainly in specific contexts, that is, in pragmatically conditioned constructions which become grammaticalized to varying degrees. Different languages occupy different positions along a continuum of uses which tends towards obligatoriness in well-defined contexts. Contexts for the origin of definite articles are to be found in the communicative uses of language, particularly conversations where the need to strengthen the deictic force of nominal elements exists. Keeping such conditions in mind, we are then in a better position to understand the “attractiveness” (Heine and Kuteva 2005: 265) of the category under discussion in the case of languages which lack articles but which are in contact with languages that possess them. Articles at the Western periphery of the Slavic area have no fully grammatical status; rather, they can be described as incipient categories. Actually, their use is generally recognized by neither speakers nor grammarians and is discouraged in formal education (cf. Heine and Kuteva 2005: 71). Their marginality is also reflected in the recently published World Atlas (WALS 2005), which lists Czech, Sorbian and Polish among the languages lacking articles, either definite or indefinite. I will add here two reflections on a language in which the existence of the definite article category is beyond doubt. Contrary to what is generally assumed, in Old Italian texts the definite article had not been completely generalized and cases can be found where the article, which would be obligatory in Modern Italian, is lacking. This shows that it takes some time for this category to develop. The second observation concerns the cyclicity of the development of demonstratives into articles and is meant to show that the process of article generation can start again in a language which already has a definite article. This is illustrated by the case of the demonstratives questo/quello, which may replace the definite article in spo17. In associative-anaphoric uses the first mention of an entity does not imply uniqueness per se, but uniqueness with respect to a previously mentioned referent, as in The man drove past our house in a car. The exhaust fumes were terrible (cf. Himmelmann 2001: 833).

Areal convergence in grammaticalization processes 147

ken Italian (sometimes, but not necessarily, with emphatic overtones), thus becoming definiteness markers with no demonstrative value (cf. Berretta 1993: 235). This weakly grammaticalized use resembles, at least partially, the one observed in Slavic languages and confirms, as might be expected, that the process sketched above is quite common. The diffusion of quello with article-like functions implies a renewal of forms in the category of articles (in Italian the link between the demonstrative quello from Vulgar Latin *(ec)cum-illum and the article il from Latin ille is no longer perceived). In conclusion, there are some Slavic languages which depart from the traditional Slavic pattern and are undergoing quite similar changes, leading to the emergence of definite articles. Areal closeness to German and long-standing contact with this language suggest that German must have had a significant impact on such (still) ongoing changes. To assume that these developments are merely a matter of chance does not seem plausible. A common trend towards the development of articles seems to be under way in many languages of Europe which surround the European core, such as Finnish. Laury (1997) has shown that the demonstrative se in spoken Finnish appears to be developing into a definite article. The grammaticalization process is taking place as expected: se is being used more and more frequently with textually known referents which have already been mentioned in discourse. Furthermore, se can be found in associative anaphora contexts, such as I bought a car but the motor didn’t work (Laury 1997: 256). Laury also observes that “the major difference in the distribution of articles between Finnish and languages like English is that the Finnish se has not been generalized to non-referential uses such as play the piano” (1997: 260). Language contact with English through the high number of bilingual speakers is not ruled out. The grammaticalization of articles has also affected the Balkan area (Albanian, Rumanian, Bulgarian and Macedonian): postposed articles are regarded as a strong Balkan feature, while Greek does not share this areal property, though using a full-fledged preposed article since Classical Greek. The centre of diffusion for articles in the Balkans might have been Greek, but in reality there exists no reliable evidence that the model was provided by this language.18 18. The rise of indefinite articles from the numeral ‘one,’ which is taking place at the periphery of the Slavic area, is also a process of contact-induced grammaticalization. Lötzsch (1996: 53) notes that, under the influence of German, Sorbian tends to use the numeral jedyn, in the weakened form jen, as an indefinite article, although not systematically. Weiss (2004) discusses the rise of an indefinite article in Macedonian out of the numeral eden ‘one,’ which is also used as an indefinite pronoun. His analysis supports the hypothesis of contact-induced grammaticalization, with Greek heading the process. Note that indefinite articles exist in other Balkan languages, such as Rumanian, Albanian and Greek.

148 Anna Giacalone Ramat

We can therefore speak of an extension of the definite article area in Europe. As mentioned above, articles are also contagious: languages can dispense with them, but, if a neighbouring language has them and if a bilingual context exists, then there is a high chance of similar developments expanding. The onset of a grammaticalization process can be triggered from another language (as was the case for German der, which served as a model for similar processes in other languages), but the process will follow universal principles and will be internal to the language in question.19 In the case of the definite article, it is the discourse-pragmatic use of demonstratives that constitutes the input of a grammaticalization process which will proceed at its own pace in different languages. Furthermore, the process is unidirectional: I am not aware of any languages where definite articles have developed into demonstratives.

5. The renewal of the relative pronoun paradigm in European languages I now turn to another grammaticalization change involving the development of Latin into Romance, that is, the rise and spread of relative pronouns coding number, gender and case distinctions and reproducing the properties of Latin relative pronouns, instantiated by Italian il quale, French lequel, Spanish el cual, Portuguese el cual, Rumanian care and also by German welcher, English who/which and Swedish vilken. It is important to underline that in Romance languages this construction is not inherited from Latin, although the lexical items involved are of Latin origin (ille + qualis). Haspelmath (2001), following Comrie (1989), notes that the relative pronoun strategy is a typical European feature, as confirmed by the distribution of relativization strategies in the languages of the world (WALS 2005). He further observes that, in most Standard Average European languages, the relative pronoun is based on an interrogative pronoun: “this is true,” he says, “of all Romance, all Slavic and some Germanic languages, Modern Greek, as well as Hungarian and Georgian” (Haspelmath 2001: 1494–1495). It should be added, however, that this was already the case in Latin, where both the interrogative/indefinite pronoun quis and the relative qui derive from the same stem, Indo-European *kwi-/kwo-. This stem functions as a relative pronoun also in Italic, Tocharian, Hittite and Celtic (while other Indo-European languages, such as Indo-Iranian and Greek, derive their relative pronouns from a different stem; cf. Szemerényi 1970). Thus, the evolution from inter19. Molise Croatian, the Croatian variety spoken by a minority group in Molise (Southern Italy), has been strongly influenced by Italian (cf. Breu 1996; Marra 2000). However, no development of a definite article is under way, despite five centuries of contact with Italian. This comes to show that speakers in contact with another language do not necessarily draw on it to develop a new category.

Areal convergence in grammaticalization processes 149

rogative into relative clause markers is a change shared by many Indo-European languages, which does not prove any innovation; for this reason, I claim that no truly areal feature in Standard Average European languages is at work here. By contrast, the change I want to describe as a case of areal diffusion is more limited and more recent, and concerns only Romance languages as the centre of innovation and Germanic languages as possible receiving languages. Greek and some Balkan languages have undergone a process partially similar to the one observed in Latin. On the one hand, the locative adverbial pou ‘where’ has been generalized into an invariable relative marker. On the other hand, in the written, formal language, the relative pronoun o opoíos, which is made up of the article and the interrogative adjective (hopoîos in Classical Greek), has developed. This form is similar to the Albanian relative i cili, which is also made up of the article and the interrogative element cili ‘what kind,’ and is used alongside invariable që and interrogative kush. The Balkan constructions are structurally isomorphic. However, the data available do not allow for the reconstruction of the grammaticalization process in more detail, and it is difficult to decide whether we are dealing with a calque from Romance, with an independent development or with a Balkan feature. Romance languages have complex paradigms for relative pronouns, which combine forms with different origins. On the one hand, there are forms which continue the Latin pronoun but have lost case marking, gender and number distinctions, thus becoming invariable markers, such as Italian che, Spanish and Portuguese que and French qui/que. On the other hand, new relative pronouns have been formed using Latin material, which do not appear as such in Latin texts. Thus, the demonstrative ille plus the adjective qualis appear to form a construction coding gender, number and syntactic role of the relativized item. In Present-day Romance languages the use of this new pronoun is usually restricted to cases where a preposition is present. Thus, in terms of Keenan and Comrie’s (1977) analysis of relative clauses, this pronoun realizes a [+case] strategy. As a means of illustration, Table 1 provides the relative pronoun system of Present-day Italian, in which case-coding and non-case-coding strategies are in complementary distribution. Table 1. Relative markers in Present-day Standard Italian Relativized positions in the relative clause

Restrictive relative clauses

Non-restrictive relative clauses

Subject Object Indirect object Oblique Possessor (genitive)

che che prep. + il quale/cui prep. + il quale/cui prep. + il quale/cui article + cui

che, il quale che prep. + il quale/cui prep. + il quale/cui prep. + il quale/cui article + cui

150 Anna Giacalone Ramat

It should also be noted in passing that in non-standard Romance varieties the invariable relative marker tends to be extended to lower positions in the Accessibility Hierarchy. Along with che, anaphoric personal pronouns may be used, encoding case, gender and number of the relativized noun phrase. Beside Italian, other Romance languages show a similar move toward the use of invariable markers to introduce relative clauses (cf. Gadet 1995). As an illustration, compare the Standard Italian examples in (9) and (10) with that from colloquial Italian in (11).

(9) Standard Italian La ragazza che ho visto. the girl rel.obj I.have seen ‘the girl that I have seen’

(10) Standard Italian La ragazza alla quale ho dato il libro. the girl to rel.indr.obj I.have given the book ‘the girl to whom I have given the book’ (11) Colloquial Italian La ragazza che le ho dato il libro. the girl rel cl.indr.obj I.have given the book ‘the girl to whom I have given the book’ (lit. ‘the girl that I have given the book to her’) Considering both morphosyntactic restrictions (in subject and object positions only che is allowed in restrictive clauses) and sociolinguistic conditions of use (Bernini 1991; Fiorentino 1999), we may conclude that the il quale type has a limited use in spoken varieties of Italian as well as in other Romance languages (see Fiorentino 1999 for data concerning Present-day Italian). It may probably be considered a recessive characteristic among the Standard Average European features, in stark contrast with the other properties discussed so far. With reference to the distribution of the Standard Average European features and the contact issue, it might be interesting to look at the grammaticalization process which gave rise to il quale and cognate forms in Romance languages. The introduction of relative pronouns of the Latin type is of special interest for at least two reasons. Firstly, relative pronouns are typologically quite rare in the languages of the world (cf. Comrie 1989: 149; 2002). Secondly, such pronouns entered languages in which relative pronouns already existed, thus raising the problem of functional distribution between old and new forms within the relative paradigm. Although we are discussing changes which took place before the earliest written records, a closer inspection of the oldest documents allows us to reconstruct these changes with some plausibility. On the basis of the evidence which has become

Areal convergence in grammaticalization processes 151

available, the il quale type was first used as a textual cohesion device to link a clause to a preceding one on the model of a well-known Latin pattern, the “connecting relative” (Giacalone Ramat 2005). In keeping with its origin, the il quale construction first coded appositive rather than restrictive relatives, a property which partially continued into modern times (in Present-day Italian il quale as subject is possible only in appositive relative clauses; cf. Cinque 1988). Although il quale originally served a cohesive function, its usage later expanded and its paradigm very soon merged with that of che, cui. Although semantically distinct, appositive and restrictive clauses are perceived as similar by speakers and are expressed through the same coding devices in many languages (cf. Keenan 1985), because both involve a shared participant in the main and in the relative clause. The analysis of a corpus of Old Italian texts (thirteenth and fourteenth century texts including Novellino, Dino Compagni and Boccaccio) revealed that il quale entered the relative pronoun paradigm starting from the most accessible roles, that is, subject and object. Indeed, these are the functions most likely to play a connecting role (cf. Giacalone Ramat 2005). Il quale was used in non-restrictive clauses which provide either additional information about a main participant or information crucial to the unfolding of a story. Typical uses are given in (12) and (13). (12) Old Italian Cominciorono avere i Donati grand odio contro di loro. begin.pst.3pl have.inf det.pl Donati great hate against them Il quale crebbe assai perché Messer Corso Donati [. . .] rel increase.pst.3sg considerably because Messer Corso Donati ‘The Donatis began to feel great hate against them [the Cerchis]. Which increased considerably because Messer Corso Donati…’ (Dino Compagni, Cronica 1.20.1) (13) Old Italian La reina [. . .] che essa l’ ordine seguitasse le that she det suit follow.sbjv.3sg 3sg.dat det queen domandò. La quale vezzosamente e con lieto viso incominciò. ask.pst.3sg rel graciously and with joyful face begin.pst.3sg ‘The queen ordered her to follow suit. Who graciously and with a joyful face began (to say).’ (Boccaccio, Decameron 1,5,1) According to the developmental path sketched above, the use of il quale was promoted by the needs of the written language, which requires conscious and careful planning and transparent relations between form and meaning. The change was, then, motivated by iconic forces. The use of Latin throughout Europe in Late Antiquity and the Middle Ages can be taken as the socio-cultural factor behind this and other similar changes which occurred very early in Romance languages.

152 Anna Giacalone Ramat

Faced with the task of using Romance vernaculars for written purposes, literate people may have tried to reproduce Latin grammatical models. A similar development appears to have taken place in Germanic; the pronoun welcher was present in the West Germanic area as early as the eighth century as an interrogative, cf. Old High German (h)welīh, and also Old Norse hvílikr ‘which, of what kind.’ Etymologically, it is a compound form from the interrogative pronoun *kwi- plus *leika- ‘form, body,’ that is, it is similar in form and function to Latin qualis (cf. Kluge-Seebold 1995). The relative meaning is attested considerably later, from the beginning of the fifteenth century in Low German and some decades later in Middle German (cf. Paul 1998). The form with the article, de welke, occurs only in Low German and Low Franconian texts and might have been modelled on French lequel (cf. Behaghel 1923–1932: I, 376). The late use of welcher as an appositive relative makes the hypothesis of Romance influence quite plausible: the interrogative welcher would have been further grammaticalized by extending its use to new contexts on the model of Romance languages. In the history of English, the renovation of the relative paradigm took place during the Early Middle English period, when the interrogative pronoun who/ whom and the adjective which entered the relative paradigm (cf. Mustanoja 1960: 222ff). According to Romaine (1980, 1984), the wh-forms entered the relative paradigm from the less accessible roles, which would require a more explicit coding of syntactic relations, and subsequently were extended to the least complex and more frequent positions of subject and object. This path of development has proceeded differently from Romance languages and is also chronologically later. Nevertheless, areal diffusion might have played a role, since the wh-forms may have spread in formal and written language under French and Latin influence (cf. Mustanoja 1960; Romaine 1980). To summarize, relative clauses introduced by relative pronouns are a feature of Indo-European languages, as is commonly assumed. However, in Romance languages a renewal of the relative paradigm took place, since, over time, the old Latin relative pronouns were reduced to invariable markers and new relative pronouns emerged. The spread of inflecting relative pronouns into Romance and Germanic languages appears to be the result of language contact and contact-induced grammaticalization for the following reasons: a. Relative pronouns do not belong to those typological features which are common in the world’s languages. b. The construction under scrutiny is not a direct development from Latin; however, Latin provided the model to restore in written Romance varieties the relative paradigm reduced by the loss of inflections. Thus, Latin acted as the source language for the grammaticalization process.

Areal convergence in grammaticalization processes 153

c. The innovation spread to German and Scandinavian languages, as well as to English. In all these languages, its use tends to be limited to the written language and a formal style. The grammaticalization process consisting in the extension of interrogative pronouns into relative clause markers is rather common both within and outside Indo-European languages (cf. Heine and Kuteva 2002, 2005). Nonetheless, the specific pattern examined here suggests that areal diffusion was at work.

6. Reflexives and middle forms from an areal perspective As a final case study, I will illustrate a grammaticalization process which took place in the European area in genetically related languages, namely the development of reflexive markers towards passive and impersonal constructions. In particular, I will be concerned with the role of animacy and control in promoting the semantic process of grammaticalization. The roots of the developments to be discussed here are very old, since reflexive pronouns can be traced back to Indo-European. The process in question does not show any clear influence of one language on the others and leads to grammaticalization at different times and up to different stages. Apparently, the grammaticalization process followed separate, parallel paths which replicate universal semantic mechanisms of change. The forms I will discuss are often grouped into what has been termed middle voice (Kemmer 1993). The middle voice is widely attested crosslinguistically and its realizations exhibit a considerable amount of variation, which is not possible to explore here in detail. For practical reasons, I will use a semantic characterization of the middle voice based on Kemmer’s (1993) work. She defines it as “a semantic area comprising events in which: (a) the Initiator is also an Endpoint, or affected entity and (b) the event is characterized by a low degree of elaboration” (Kemmer 1993: 243). This set of properties can manifest itself in different ways. The actor may not exist at all, as in the case of anticausatives; or it may coincide with the patient, as in reflexives and reciprocals; or it may be conceived as a potential, not actual, participant, as in impersonal constructions (Kazenin 2001: 906). Thus, middle forms in our terminology include direct reflexives and their extensions. The distribution of middle forms in Standard Average European shows considerable variation. Firstly, it should be noted that in Indo-European the middle voice existed as a grammatical category, expressed through a set of peculiar inflections, as is well attested in Sanskrit, Classical Greek and sporadically elsewhere. Some modern European languages, such as Albanian, Modern Greek and Irish, have preserved a middle-passive voice expressed through morphological inflections. Reflexive pronouns in these languages are generally coded by distinct morphemes.

154 Anna Giacalone Ramat

Secondly, in Romance, Germanic and Slavic languages, middle and passive systems are found which originated from reflexive pronouns. Although the verbs which can be used with reflexive markers do not necessarily coincide across these languages, the overall evolutionary process appears to be similar. Romance middle forms historically continue the Latin reflexive pronoun se (sui/sibi), which indicates coreference between the object and the subject of a clause (or coreference between the main participants in an event). An important change which took place in the verbal system between Late Latin and Early Romance was the demise of r-forms which had middle and deponent functions. The gradual expansion of se at the expense of r-forms can be traced in Late Latin documents (cf. Cennamo 1998: 83ff). As an illustration, consider (14) below, where se in (14b) appears as an alternative to the r-form in (14a). (14) Latin a. memoria minuitur. memory diminish.middle.3sg ‘The memory is impaired.’ (Cic. Sen. 7,21) b. minuente se morbo. diminish.ptcp.abl.sg refl desease.abl.sg ‘when the disease is on the decline’ (Pli. Nat. 23,50) A further expansion of reflexive markers, only sporadically attested in Late Latin texts, occurred when they took on an anticausative function, which is usually considered an intermediate step leading to the passive use of se. As pointed out by Haspelmath (1990: 44), the main difference between reflexives and anticausatives is that the latter are mostly non-agentive, i.e. the passive function of the reflexive marker first appears with inanimate subjects, that is, entities lacking control on the process, and extends to situations which cannot occur spontaneously, but necessitate a human causer (cf. Haspelmath 1990: 45; Cennamo 1998: 95–96). Thus, the passive use is the result of a further generalization, or a loss of restrictions on anticausatives. The process described here can be accounted for within the framework of grammaticalization with reference to the parameters of the extension to new contexts and the dropping of restrictions. In a diachronic perspective, the range of functions of reflexive markers has been described by Haspelmath (2003: 235) with the help of the following semantic map, which is claimed to be universal and unidirectional:20 20. Haspelmath’s (2003) universal semantic map for middle functions is exemplified with Russian and French data, but it is true that reflexive-derived middle and passive markers are found in several languages, including Turkic languages, Finnish, Georgian (Geniušienė 1987: 320) and Uto-Aztecan languages (Kemmer 1993: 149). From Geniušienė (1987: 320), one gathers that the

Areal convergence in grammaticalization processes 155

full reflexive → grooming/body motion → anticausative → passive

Kemmer (1993: 202) has also proposed a semantic map which is meant to make diachronic predictions concerning the spread of reflexive situation types. Similarly, Geniušienė (1987: 349) has advanced an implicational hierarchy, claiming that a reflexive marker can retain neither its primary reflexive function nor its passive function if it has lost its decausative (that is, anticausative) function. Some details in the structure of the semantic space may raise certain doubts. Thus, for instance, the identification of ‘body motion’ as a source for anticausatives in Haspelmath’s map is not supported by Romance diachronic evidence. Rather, the data seem to suggest that verbs of body care, of change in body posture and motion verbs derive from reflexives as a semantically distinct path along a continuum of decreasing transitivity (cf. Kemmer 1993: 202). The starting point of the grammaticalization cline would be represented by situations in which one participant performs upon himself/herself an action whose reflexive nature is the expected case, rather than the exception, like Italian vestirsi, French s’habiller and German sich anziehen ‘to dress oneself.’ Later, the reflexive marker is extended to situations where one participant acts upon his/her own body, like Italian sedersi ‘sit down.’ Next, reflexive markers gradually move toward prototypically one-participant situations which surface syntactically as intransitives, as Italian arrabbiarsi, German sich ärgern ‘become angry,’ French s’aperçevoir ‘become aware’ or Dutch zich schamen voor ‘be ashamed of.’ A further development of reflexive markers is their impersonal use with intransitive verbs, which is not included in Haspelmath’s map and is possible only in Italian and Spanish. In such constructions an unspecified or indefinite subject is demoted, as shown in (15) and (16) below. (15) Italian a. Si parla molto qui. refl speak.prs.3sg much here ‘People speak a lot here.’ b. Si vive bene in questa città. refl live.prs.3sg well in this city ‘People live well in this city.’ passive impersonal expressed via a reflexive is rare (Finnish, Turkish and Tatar in Geniušienė’s corpus). In Romance languages, in which another periphrastic passive construction exists (e.g. Italian Sono stato investito da una motocicletta ‘I have been run over by a motor-bike’), middle forms are generally more restrictive, that is, they are often associated with restrictions on person (third person only; cf. Kemmer 1993: 149).

156 Anna Giacalone Ramat

(16) Spanish a. Se habla mucho aquí; refl speak.prs.3sg much here ‘People speak a lot here;’ b. El congreso se inauguró ayer. the conference refl open.pst.3sg yesterday ‘The conference was opened yesterday.’ The discussion of this complex range of phenomena can be summarized as follows: a. A general realignment of the semantic role of the core clausal participant towards the patient role has taken place. This is conveyed by the anticausative and passive developments of Latin se. b. As is pointed out by Haspelmath (1990: 46), the diachronic perspective, which considers the anticausative as an intermediate step, helps us to understand the relation between reflexive and passive, which cannot be explained easily from a synchronic perspective. c. In the processes described above, several mechanisms typical of grammaticalization are involved. The grammaticalization of reflexive markers follows the familiar path of desemanticization (bleaching) and loss of referentiality, from referential meaning in direct reflexives (as in Italian si guarda ‘he looks at himself ’) to intransitivity marker uses and valency reduction (as in Italian vergognarsi ‘be ashamed’ or pentirsi ‘repent,’ which only have one argument). Another typical grammaticalization change is the loss of concrete meaning and increasing abstractedness, as is shown by the considerable number of verbs denoting emotions or feelings, such as Italian commuoversi ‘be moved’ or French s’étonner ‘be astonished.’ In accordance with the areal perspective taken in this paper, I will now discuss whether the grammaticalization of reflexive markers to middle and passive markers involves language contact and areal diffusion. Outside Romance, many modern European languages have reflexive pronouns inherited from the Indo-European pronoun *se-/s(e)we-, which occurs in Slavic, Baltic, Germanic and Albanian. The relations among these languages are genetic, geographical and historical, without necessarily involving language contact and contact-induced grammaticalization. These languages show grammaticalization of reflexive pronouns at varying degrees: for example, Scandinavian languages have developed passive meanings, whereas Standard German has advanced less in the grammaticalization process, since both the passive and the impersonal uses are impossible, with the exception of the potential passive, as in (17).

Areal convergence in grammaticalization processes 157

(17) German Dieses Buch verkauft sich gut. this book sell.prs.3sg refl well ‘This book sells well.’ (Haspelmath 2003: 225). There is no conclusive evidence to determine whether an overall contactinduced change is responsible for the wide range of developments of reflexives in the Indo-European languages of Europe. As a matter of fact, the path of change sketched above seems to constitute the most frequent diachronic source of middle markers, which can be found also in the Nilo-Saharian language family (cf. Kemmer 1993: 151ff). This would suggest that independent semantic changes in the language involved cannot be ruled out. Although no overall grammaticalization area can be identified, the possibility of smaller areas historically more identifiable should be taken into account. From such a perspective, attention might be drawn to a split between stressed and unstressed reflexive forms which occurred in some languages of the Circum-Baltic area, an area of multiple contacts between Indo-European and Finnic languages (cf. Dahl and Koptjevskaja-Tamm 2001). In Russian the unstressed reflexives became verbal affixes with middle and passive functions, while the stressed reflexives remained in use as ordinary reflexive pronouns. Such a change is attested in Russian as late as the sixteenth century (cf. Cennamo 1993: 64). It instantiates a process of grammaticalization where the main verb and the reflexive coalesce with each other (cf. Koptjevskaja-Tamm and Wälchli 2001: 693). As an illustration, consider Russian sebja vs. -sja. (18) Russian Viktor nenavidit sebja. Viktor hate.3sg refl ‘Victor hates himself.’ (Kemmer 1993: 27) (19) Russian mal’čik moet-sja. child wash-refl ‘The child washes [himself].’ (Fici 2004: 207) (20) Russian dver’ otkryla-s’. door open.3sg-refl ‘The door opened.’ (Fici 2004: 207)

[direct reflexive]

[grooming verb]

[anticausative]

158 Anna Giacalone Ramat

(21) Russian inogda učitel’ zaderžival-sja direktorom. [passive] sometimes teacher.nom keep.ipfv-refl headmaster.ins ‘Sometimes the teacher was kept back by the headmaster.’ (Fici 2004: 207) North Germanic languages have also undergone several grammaticalization processes which have resulted in a formal and functional split between the reflexive and the middle markers. The reflexive pronoun sik developed into the polysemous postfix (that is, an affix in the last position of a word, following inflections) -sk, which became -s and spread across the middle paradigm, thus losing person distinctions; cf. Norwegian vaske seg ‘wash,’ ferdes ‘travel’ or minnes ‘remember.’ The -s marker (-st in Present-day Icelandic) also has a passive meaning (Kemmer 1993: 189ff; Kazenin 2001: 900; Koptjevskaja-Tamm and Wälchli 2001: 691), as shown in (22). (22) Present-day Icelandic ég kall-a-st. I call-1sg-pass ‘I am called.’ Likewise, Baltic languages have a postfix -s derived from si with reflexive and middle meanings, while reflexive pronouns are independent forms, as in Lithuanian save and Latvian sevi (Geniušienė 1987: 25). Kemmer (1993: 24ff) calls the languages in which reflexive pronouns are identical to middle markers one-form languages, and the languages in which the reflexive markers are either similar but not identical to the middle markers, or not cognate to middle-markers, two-form languages. Romance languages generally belong to the former type because they do not distinguish between reflexive forms and middle/passive forms,21 while Scandinavian, Slavic and Baltic languages distinguish between two similar and etymologically connected forms: the heavy form for independent reflexive pronouns and the light form, usually cliticized onto the verb, for middle and passive uses.

21. Among Romance languages, the position of Surselvan is special in that it has become a twoform language (Kemmer 1993: 166ff). Se has been extended to the whole paradigm in place of first and second person pronouns and has agglutinated to the verb, as shown in (i). (i) Surselvan jeu selavel / ti selavas / el selava ‘I wash myself ’ / ‘you wash yourself ’ / ‘he washes himself ’ The reduplicated form, sesez, is used as a direct reflexive and as an emphatic.

Areal convergence in grammaticalization processes 159

As shown by Koptjevskaja-Tamm and Wälchli (2001: 691), Baltic, Eastern Slavic and North Germanic languages pattern together. The languages of the Baltic Sea region are characterized by postfixes as markers of reflexive and related meanings (i.e. anticausatives and passives), a development which represents a further step in the grammaticalization of reflexives. This might be a case of areal diffusion, as is suggested by Koptjevskaja-Tamm and Wälchli, perhaps due to contact. Finno-Ugric languages have, however, no directly comparable reflexive structures, although a Finnic influence on Baltic impersonal and passive constructions can be shown (cf. Holvoet 2001). On the basis of the distribution of reflexives and related forms in the languages of Europe, more than one grammaticalization area can be recognized: (i) the area of core Standard Average European languages (Romance languages and German) with predominantly one clitic form for reflexive and related functions;22 (ii) the Baltic Sea area, in which postfixes have been developed because of contactinduced change; (iii) an extensive area of languages outside the Standard Average European core area, which mostly exhibit historically and morphologically distinct forms for reflexive pronouns and middle/passive markers; this area includes Hungarian, Turkish, Albanian, Greek, Celtic languages and Maltese. It should also be noted that, among Germanic languages, German has only one form, sich, and that English is the only Germanic language which does not continue the ProtoGermanic reflexive pronoun *sik-.

7. Conclusions The examples of grammaticalization processes presented and discussed in the preceding sections differ from each other in various respects and exhibit different degrees of contact-induced grammaticalization. On the one hand, ‘have’-perfects and relative pronouns are similar in that they are both renewals of categories which were already present in the languages involved but had been overused 22. Among Germanic languages, Dutch has a peculiar position as a two-form system in that zich is used as a middle marker for grooming verbs and for body motion verbs, but not for anticausatives (Donaldson 1997): (i)

Dutch Jan heeft zich aangekleed. Jan have.3sg refl dress.ptcp ‘Jan has dressed/got dressed.’

The reflexive (and emphatic) marker is zichzelf, where -zelf is cognate with German selbst and English -self (Kemmer 1993: 184).

160 Anna Giacalone Ramat

or had undergone formal reduction. By contrast, articles in Slavic languages exemplify the gradual emergence of a new category for which no equivalent existed, that is, they are an instance of “innovating mutation” in Benveniste’s (1968) words. Thus, contact situations may lead to the development of new grammatical categories through grammaticalization. As for the development of the reflexive pronoun se and its related forms, this follows universal grammaticalization paths which are attested in many European languages as well as in languages outside Europe. As said above, such developments do not define any truly areal features, with the possible exception of reflexive postfixes around the Baltic Sea region, which exhibit an increasing degree of fusion. No spread from a centre can be detected; on the contrary, the presence of these constructions in the various languages can be explained on the basis of inheritance from the Indo-European reflexive stem *se-/s(e)we-. The present distribution then appears to be the result of parallel developments which are irrelevant to the definition of a grammaticalization area. There are a number of other problems which may be relevant to the issue of areal grammaticalization, although they are not exclusive of contact-induced grammaticalization. Among these, the notion of feature strength does not seem to be directly related to areal grammaticalization. ‘Have’-perfects appear to be a strong Standard Average European feature resulting from contact because of the rarity of the development of transitive verbs of predicative possession into auxiliaries. By contrast, articles seem to be a weaker feature, since the process by which articles are created from demonstratives is typologically frequent. In the case of Slavic articles, however, the geographical distribution of the ongoing grammaticalization processes around areas with fully developed articles and the existence of historical contacts among the languages concerned make the hypothesis of contact-induced convergence equally probable. Concerning the issue of motivation, it has been claimed that grammaticalization may involve the exploitation of old means for novel functions. The present study has indeed shown that two Latin items, ille and qualis, have been re-employed to produce a new relative pronoun which is present in all Romance languages. The rise of new pronouns is motivated iconically in terms of the need for clarity and informativeness in the expression of structural relations. Still, an opposing tendency is also at work in relative systems: economicity privileges simpler paradigms and invariable markers. Thus, the evolution of relative pronouns neatly illustrates the issue of competing motivations in linguistic change. As for the role of external factors in contact-induced grammaticalization, it has been argued that both ‘have’-perfects and relative pronouns originated in similar historical conditions in Late Latinity and the early Middle Ages. For several centuries, until at least the eighth century AD, and perhaps later, the written

Areal convergence in grammaticalization processes 161

language in Western Europe was Latin, although people spoke vernaculars in both the Romance and the Germanic speaking areas. In other words, a diglossic situation must be recognized in Western Europe linguistic communities. The varieties of spoken Latin which gave rise to the Romance vernaculars are not attested, or, more accurately, very scarcely attested, though the changes which brought about ‘have’-perfects and other distinctive Romance features must have spread in this period.23 Following Labov and sociolinguistic paradigms, a distinction should be drawn between the origin of linguistic variation and the propagation of language change in a linguistic community (Labov 1972: 1). In the cases under scrutiny, as stated above, the roots of grammaticalization may date back to as early as Plautus in Latin, while the spread of the new grammaticalized perfect probably continued until the Merovingian period and possibly until the date of the earliest Romance texts. The diffusion of the new relative pronouns is also due to the influence of written Latin texts on the written vernaculars because of concerns for clarity, as Romance speaking people began to elaborate a new writing system to represent their speech (Wright 2002). In the light of my findings here, a further conclusion may be suggested, namely that contact plays a role in the initial stage of the grammaticalization process. It triggers the grammaticalization process in the receiving language in those cases where multiple interpretation (or different stages of grammaticalization) and variability are still available in the source language, as in the case of the Latin perfect and the German article. It may well be the case that more grammaticalized constructions are borrowed by the speakers of a given language from a neighbouring language. However, such cases should not count as examples of a real grammaticalization process, but rather as instances of calquing or transfer of grammatical forms, a well-documented phenomenon which constitutes an alternative to the grammaticalization hypothesis (cf. Heine and Kuteva 2005: 100). The areal patterns discussed in this paper support the claim that language contact reinforces grammaticalization (cf. Wiemer and Bisang 2004: 12). It is also noteworthy that contact-induced grammaticalization does not seem to run coun23. Wright (2002: 33ff, 147ff) observes that the conceptual separation of Latin from Romance is a matter of debate. According to him, only in the twelfth century did a definitive distinction obtain. However, probably around 800 AD Romance monolingualism had come to an end (Wright 2002: 154) and Romance languages and Medieval Latin were conceived of as separate languages. Haspelmath (1998, 2001) suggests, however, that the appropriate time frame at least for ‘have’perfects and articles could be the middle of the first millennium AD, that is, the time of the great migrations during the transition between Antiquity and the Middle Ages. The relevant period for the spread of some Standard Average European features must probably be extended to the Merovingian, and possibly Carolingian, periods because written records in Romance vernaculars seem to have played a crucial role in enhancing some changes.

162 Anna Giacalone Ramat

ter to widely accepted grammaticalization principles, like unidirectionality. ‘Have’perfects do not reverse their development once they have become preterites, nor do si/se-middles change directionality. Although the role of contact in grammaticalization processes has been recognized as being more relevant than previously thought, there seems to be no significant difference in the nature of grammaticalization between contact-induced grammaticalization and canonical, internally motivated grammaticalization (cf. Heine and Kuteva 2005: 266). Because of their similarity it is no surprise that the two types have gone unnoticed for a long time. Notwithstanding, the study of grammaticalization processes should be reconsidered in the light of the impact of contact situations and, more generally, in the light of the interaction between both language internal and language external factors.

Abbreviations 1 2 3 abl acc adp cl dat det f gen impf indr inf ins

first person second person third person ablative accusative adposition clitic dative determiner feminine genitive imperfect indirect infinitive instrumental

ipfv n nom obj pass pl poss pret prs pst ptcp refl rel sbjv sg

imperfective neuter nominative object passive plural possessive preterite present past participle reflexive relative subjunctive singular

References Abraham, Werner. 2004. The European demise of the simple past and the emergence of the periphrastic perfect: Areal diffusion or natural autonomous evolution under parsing facilitation? In Werner Abraham (ed.), Focus on Germanic typology [Sprachtypologie und Universalienforschung. Studia Typologica 6], 241–271. Berlin: Akademie Verlag. Behaghel, Otto. 1923–1932. Deutsche Syntax I-IV. Heidelberg: Carl Winter. Benveniste, Émile. 1960. ‘Être’ et ‘avoir’ dans leurs functions linguistiques. Bulletin de la Société de linguistique de Paris 55: 113–134. Benveniste, Émile. 1968. Mutations of linguistic categories. In Winfried P. Lehmann & Yakov Malkiel (eds.), Directions for historical linguistics. A symposium, 85–94. Austin TX: University of Texas Press.

Areal convergence in grammaticalization processes 163

Bernini, Giuliano. 1991. Frasi relative nel parlato colloquiale. In Cristina Lavinio & Alberto A. Sobrero (eds.), La lingua degli studenti universitari, 165–187. Firenze: La Nuova Italia. Bernini, Giuliano & Paolo Ramat. 1996. Negative sentences in the languages of Europe. A typological approach. Berlin: Mouton de Gruyter. Berretta, Monica. 1993. Morfologia. In Alberto A. Sobrero (ed.), Introduzione all’italiano contemporaneo. Le strutture, 192–245. Roma: Laterza. Bisang, Walter. 1998. Grammaticalization and language contact. Constructions and positions. In Anna Giacalone Ramat & Paul J. Hopper (eds.), The limits of grammaticalization, 13–58. Amsterdam: John Benjamins. Bisang, Walter, Nikolaus P. Himmelmann & Björn Wiemer (eds.). 2004. What makes grammaticalization? A look from its fringes and its components. Berlin: Mouton de Gruyter. Bonfante, Giuliano. 1960. Les rapports linguistiques entre la Grèce et l’Italie. Hommages à Léon Herrmann. Collection Latomus XLIV: 171–182. Breu, Walter. 1996. Überlegungen zu einer Klassifizierung des grammatischen Wandels im Sprachkontakt (am Beispel slavischer Kontaktfälle). In Martin Haase & Nicole Nau (eds.), Sprachkontakt und Grammatikalisierung. Sprachtypologie und Universalienforschung (STUF) 49(1): 21–38. Bucholz, Oda & Wilfried Fiedler. 1987. Albanische Grammatik. Leipzig: VEB Verlag Enzyklo pädie. Bybee, Joan, Revere L. Perkins & William Pagliuca. 1994. The evolution of grammar: Tense, aspect and modality in the languages of the world. Chicago IL: University of Chicago Press. Campbell, Lyle (ed.). 2001a. Language Sciences 23(2–3). Campbell, Lyle. 2001b. What’s wrong with grammaticalization. Language Sciences 23(2–3): 113–161. Campbell, Lyle. 2006. Areal linguistics. In Keith Brown (ed.), Encyclopedia of language and linguistics. 2nd edn, 454–460. Amsterdam: Elsevier. Carey, Kathleen. 1994. The grammaticalization of the perfect in Old English. An account based on pragmatics and metaphor. In William Pagliuca (ed.), Perspectives on grammaticalization, 103–117. Amsterdam: John Benjamins. Cennamo, Michela. 1993. The reanalysis of reflexives: A diachronic perspective. Napoli: Liguori Editore. Cennamo, Michela. 1998. The loss of the voice dimension between Late Latin and Early Romance. In Monica S. Schmid, Jennifer R. Austin & Dieter Stein (eds.), Historical linguistics 1997, 77–100. Amsterdam: John Benjamins. Cinque, Guglielmo. 1988. La frase relativa. In Lorenzo Renzi (ed.), Grande grammatica italiana di consultazione. Vol. I, 443–503. Bologna: Il Mulino. Comrie, Bernard. 1989. Language universals and linguistic typology. 2nd edn. Oxford: Blackwell. Comrie, Bernard. 2001. Different views of language typology. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals. Vol. I, 25–39. Berlin: Mouton de Gruyter. Comrie, Bernard. 2002. Rethinking relative clauses types: The Mediterranean area. In Paolo Ramat & Thomas Stolz (eds.), Mediterranean languages, Papers from the MEDTYP workshop, Tirrenia, June 2000, 87–98. Bochum: Brockmeyer. Cristofaro, Sonia & Anna Giacalone Ramat. 2002. Relativization patterns in Mediterranean languages, with particular reference to the relativization of time circumstantials. In Paolo Ramat & Thomas Stolz (eds.), Mediterranean languages. Papers from the MEDTYP workshop, Tirrenia, June 2000, 99–112. Bochum: Brockmeyer.

164 Anna Giacalone Ramat

Dahl, Östen. 1990. Standard Average European as an exotic language. In Johannes Bechert, Giuliano Bernini & Claude Buridant (eds.), Toward a typology of European languages [Empirical Approaches to Language Typology 8], 3–8. Berlin: Mouton de Gruyter. Dahl, Östen. 2000. The grammar of future time reference in European languages. In Östen Dahl (ed.), Tense and aspect in the languages of Europe, 309–328. Berlin: Mouton de Gruyter. Dahl, Östen. 2001. Principles of areal typology. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals, Vol. II, 1456–1470. Berlin: Mouton de Gruyter. Dahl, Östen & Maria Koptjevskaja-Tamm (eds.). 2001. The Circum-Baltic languages: Typology and contacts, 2 vols. Amsterdam: John Benjamins. Denison, David. 1993. English historical syntax. London: Longman. Dieninghoff, J. 1904. Die Umschreibungen aktiver Vergangenheit mit dem, Participium Praeteriti im Althochdeutschem. Bonn: Carl Georgi, Universitäts-Buchdruckerei. Donaldson, Bruce. 1997. Dutch. A comprehensive grammar. London: Routledge. Drinka, Bridget. 2003. Areal factors in the development of European periphrastic perfect. Word 54(1): 1–38. Dryer, Matthews S. 2005. Definite articles. In Martin Haspelmath, Matthews S. Dryer, David Gil & Bernard Comrie (eds.), The world atlas of language structures (WALS): 154–155. DUDEN. 2005. Die Grammatik. 7. Auflage. Mannheim: Duden Verlag. Ebert, Robert, Oskar Reichmann, Hans-Joachim Solms & Klaus-Peter Wegera. 1993. Früneuhochdeusche Grammatik. Tübingen: Niemeyer. Elsness, Johan. 1997. The perfect and the preterite in contemporary and earlier English. Berlin: Mouton de Gruyter. Fici, Francesca. 2004. Costrutti passivi, decausativi, medi e pseudo-passivi. Osservazioni sulla marca riflessiva del verbo in russo. Studi Italiani di Linguistica Teorica e Applicata 23: 205–219. Fiorentino, Giuliana. 1999. Relativa debole. Sintassi, uso, storia in italiano [Materiali LinguisticiUniversità di Pavia 24]. Milano: Franco Angeli. Friedman, Victor A. 1994. Variation and grammaticalization in the development of Balkanisms. In Katharine Beals, Jeannette Denton, Robert Knippen, Lynette Melnar, Hisami Suzuki & Erica Zeinfeld (eds.), Papers from the 30th Regional Meeting of the Chicago Linguistic Society. Vol. II: The Parasession on Linguistic Theory, 101–115. Chicago IL: Chicago Linguistic Society. Friedman, Victor A. 2002. Macedonian. Munich: Lincom. Gadet, Françoise. 1995. Les relatives non standard en français parlé: Le système et l’usage. In Hanne Leth Andersen & Gunver Skytte (eds.), La subordination dans les langues romanes, 141–162. Copenhagen: Munksgaard. Garvin, Paul L. 1949. Standard Average European and Czech. Studia Linguistica 3: 65–85. Geniušienė, Emma. 1987. The typology of reflexives. Berlin: Mouton de Gruyter. Giacalone Ramat, Anna. 2002. Review of Bernd Heine & Tania Kuteva, World lexicon of grammaticalization. Cambridge: CUP. Archivio Glottologico Italiano LXXXVII: 231–239. Giacalone Ramat, Anna. 2005. Persistence and renewal in the relative pronoun paradigm: The case of Italian. Folia Linguistica Historica XXVI(1–2): 115–138. Giacalone Ramat, Anna & Paul J. Hopper. 1998. The limits of grammaticalization. Amsterdam: John Benjamins. Greenberg, Joseph H. 1978. How does a language acquire gender markers? In Joseph H. Green-

Areal convergence in grammaticalization processes 165

berg, Charles A. Ferguson & Edith Moravcsik (eds.), Universals of human language, Vol. III, 47–82. Stanford CA: Stanford University Press. Haase, Martin. 1994. Tense and aspect in Basque. In Rolf Thieroff & Joachim Ballweg (eds.), Tense systems in European languages, 279–292. Tübingen: Niemeyer. Harris, Alice C. 2003. Cross-linguistic perspectives on linguistic change. In Brian D. Joseph & Richard D. Janda (eds.), The handbook of historical linguistics, 529–551. Oxford: Blackwell. Haspelmath, Martin. 1990. The grammaticalization of passive morphology. Studies in Language 14: 25–72. Haspelmath, Martin. 1998. How young is Standard Average European? Language Sciences 20: 271–287. Haspelmath, Martin. 2001. The European linguistic area: Standard Average European. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals, Vol. II, 1492–1510. Berlin: Mouton de Gruyter. Haspelmath, Martin. 2003. The geometry of grammatical meaning: Semantic maps and crosslinguistic comparison. In Michael Tomasello (ed.), The new psychology of language, Vol. II, 211–242. Mahwah NJ: Lawrence Erlbaum. Haspelmath, Martin. 2004. On directionality in language change with particular reference to grammaticalization. In Olga Fischer, Muriel Norde & Harry Perridon (eds.), Up and down the cline: The nature of grammaticalization [Typological Studies in Language 59], 17–44. Amsterdam: John Benjamins. Haugen, Einar. 1982. Scandinavian language structures. A comparative historical survey. Tübingen: Niemeyer. Heine, Bernd & Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge: CUP. Heine, Bernd & Tania Kuteva. 2003. On contact-induced grammaticalization. Studies in Language 27: 529–572. Heine, Bernd & Tania Kuteva. 2005. Language contact and grammatical change. Cambridge: CUP. Himmelmann, Nikolaus P. 2001. Articles. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals. Vol. I, 831–841. Berlin: Mouton de Gruyter. Hinrichs, Uwe (ed.). 2004. Die europäischen Sprachen auf dem Wege zum analytischen Sprachtyp. Wiesbaden: Otto Harrassowitz. Holvoet, Axel. 2001. Impersonals and passives in Baltic and Finnic. In Östen Dahl & Maria Koptjevskaja-Tamm (eds.), Circum-Baltic languages: Typology and contact. Vol. II, 363–389. Amsterdam: John Benjamins. Hopper, Paul J. & Elizabeth Closs Traugott. 2003. Grammaticalization. 2nd edn. Cambridge: CUP. Horrocks, Geoffrey. 1997. Greek. A history of the language and its speakers. London: Longman. Kazenin, Konstantin I. 2001. The passive voice. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals. Vol. II, 899–916. Berlin: Mouton de Gruyter. Keenan, Edward. 1985 . Relative clauses. In Timothy Shopen (ed.), Language typology and syntactic description. Vol. II. Complex constructions, 141–170. Cambridge: CUP. Keenan, Edward & Bernard Comrie. 1977. Noun phrase accessibility and universal grammar. Linguistic Inquiry 8: 63–99. Kemmer, Suzanne. 1993. The middle voice [Typological Studies in Language 23]. Amsterdam: John Benjamins.

166 Anna Giacalone Ramat

Kluge-Seebold. 1995. = F. Kluge. Etymologisches Wörterbuch der deutschen Sprache. Bearbeitet von E. Seebold. Berlin: Mouton de Gruyter. Koch, Peter. 1999. Cognitive aspects of semantic change and polysemy: The semantic space HAVE/BE. In Andreas Blank & Peter Koch (eds.), Historical semantics and cognition, 279–305. Berlin: Mouton de Gruyter. Koptjevskaja-Tamm, Maria & Bernhard Wälchli. 2001. The Circum-Baltic languages: An arealtypological approach. In Östen Dahl & Maria Koptjevskaja-Tamm (eds.), Circum-Baltic languages. Vol. II, 615–750. Amsterdam: John Benjamins. Labov, William. 1972. Language in the inner city: Studies in the Black English Vernacular. Philadelphia PA: University of Pennsylvania. Laury, Ritva. 1997. Demonstratives in interaction. The emergence of a definite article in Finnish [Studies in Discourse and Grammar 7]. Amsterdam: John Benjamins. Lehmann, Christian. 1995. Thoughts on grammaticalization. A programmatic sketch. 2nd edn. Munich: Lincom. Lötzsch, Ronald. 1996. Interferenzbedingte grammatische Konvergenzen und Divergenzen zwischen Sorbisch und Jiddisch. In Martin Haase & Nicole Nau (eds.), Sprachkontakt und Grammatikalisierung, Sprachtypologie und Universalienforschung (STUF) 49(1): 50–59. Marra, Antonietta. 2000. Il sistema dei casi nel nanaš dei croati molisani: Processi di mutamento, decadenza e morte di una lingua. PhD dissertation, Università di Pavia. Meillet, Antoine. [1917] 1930. Caractères généraux des langues germaniques. 4th edn. Paris: Hachette. Mitchell, Bruce. 1985. Old English syntax. 2 vols. Oxford: Clarendon Press. Moser, Amalia. 2003. Tense, aspect and the Greek perfect. In Artemis Alexiadou, Monika Rathert & Arnim von Stechow (eds.), Perfect explorations, 235–252. Berlin: Mouton de Gruyter. Mustanoja, Tauno F. 1960. A Middle English syntax. Part I: Parts of speech. [Mémoires de la Société Néophilologique de Helsinki XXIII]. Helsinki: Société Néophilologique. Myers-Scotton, Carol. 2002. Contact linguistics. Bilingual encounters and grammatical outcomes. Oxford: OUP. Nedjalkov, Vladimir P. (ed.). 1988. Typology of resultative constructions. Amsterdam: John Benjamins. Newmeyer, Frederick J. 1998. Language form and language function. Cambridge MA: The MIT Press. Newmeyer, Frederick J. 2001. Deconstructing grammaticalization. Language Sciences 23(2–3): 187–229. Paul, Hermann. 1998. Mittelhochdeutsche Grammatik. 24. Auflage überarbeitet von Peter Wiehl & Siegfried Grosse. Tübingen: Niemeyer. Pinkster, Harm. 1987. The strategy and chronology of the development of future and perfect tense auxiliaries in Latin. In Martin Harris & Paolo Ramat (eds.), Historical development of auxiliaries, 193–223. Berlin: Mouton de Gruyter. Ramat, Paolo. 1987. An example of reanalysis: Periphrastic forms in the Romance languages’ verb system. In Paolo Ramat (ed.), Linguistic typology, 141–164. Berlin: Mouton de Gruyter. Ramat, Paolo & Thomas Stolz (eds.). 2002. Mediterranean languages. Papers from the MEDTYP workshop. Tirrenia, June 2000, 87–98. Bochum: Brockmeyer. Romaine, Suzanne. 1980. The relative clause marker in Scots English: Diffusion, complexity, and style as dimensions of syntactic change. Language in Society 9: 221–247.

Areal convergence in grammaticalization processes 167

Romaine, Suzanne. 1984. Towards a typology of relative-clause formation strategies in Germanic. In Jacek Fisiak (ed.), Historical syntax [Trends in Linguistics. Studies and Monographs 23], 437–470. Berlin: Mouton de Gruyter. Rothstein, Robert A. 1993. Polish. In Bernard Comrie & Greville G. Corbett (eds.), The Slavonic languages, 686–758. London: Routledge. Scatton, Ernest A. 1993. Bulgarian. In Bernard Comrie & Greville G. Corbett (eds.), The Slavonic languages, 188–248. London: Routledge. Schwegler, Arnim. 1990. Analyticity and syntheticity. A diachronic perspective with special reference to Romance languages. Berlin: Mouton de Gruyter. Squartini, Mario & Pier Marco Bertinetto. 2000. The simple and compound past in Romance languages. In Östen Dahl (ed.), Tense and aspect in the languages of Europe, 403–439. Berlin: Mouton de Guyter. Stolz, Christel & Thomas Stolz. 2001. Mesoamerica as a linguistic area. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals. Vol. II, 1539–1553. Berlin: Mouton de Gruyter. Stolz, Thomas. 2003–2004. A new Mediterraneanism: Word iteration in an areal perspective. A pilot study. Mediterranean Language Review 15: 1–47. Szemerényi, Oswald. 1970. Einführung in die vergleichende Sprachwissenschaft. Darmstadt: Wissenschaftliche Buchgesellschaft. Thierhoff, Rolf. 2000. On the areal distribution of tense-aspect categories in Europe. In Östen Dahl (ed.), Tense and aspect in the languages of Europe, 265–305. Berlin: Mouton de Gruyter. Thomason, Sarah Grey. 2001. Contact induced typological change. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals. Vol. II, 1640–1648. Berlin: Mouton de Gruyter. Trask, R. Larry. 1998. The typological position of Basque. Language Sciences 20: 313–324. Traugott, Elizabeth Closs. 1992. Syntax. In Richard M. Hogg (ed.), The Cambridge history of the English language. Vol. I: The Beginnings to 1066, 168–289. Cambridge: CUP. Traugott, Elizabeth Closs. 2003. Constructions in grammaticalization. In Brian D. Joseph & Richard D. Janda (eds.), The handbook of historical linguistics, 624–647. Oxford: Blackwell. Trovesi, Andrea. 2004. La genesi di articoli determinativi. Modalità di espressione della definitezza in ceco, serbo-lusaziano e sloveno [Materiali Linguistici 46]. Milano: FrancoAngeli. Visser, Fredericus T. 1963–1973. An historical syntax of the English language. Leiden: E. J. Brill. WALS 2005 = The World Atlas of Language Structures, ed. by Martin Haspelmath, Mattew S. Dryer, David Gil & Bernard Comrie. Oxford: OUP. Weiss, Daniel. 2004. Das moderne Russisch als antianalytische Sprache (ein Vergleich mit dem Polnischen). In Uwe Hindrichs (ed.), Die Europäischen Sprachen auf dem Wege zum analytischen Sprachtyp, 263–283. Wiesbaden: Otto Harrassowitz. Wiemer, Björn & Walter Bisang. 2004. Introduction. In Walter Bisang, Nikolaus P. Himmelmann & Björn Wiemer (eds.), What makes grammaticalization? A look from its fringes and its components, 3–20. Berlin: Mouton de Gruyter. Wischer, Ilse. 2004. The HAVE-perfect in Old English. In Christian J. Kay, Simon Horobin & Jeremy Smith (eds.), New perspectives on English historical linguistics, 243–255. Amsterdam: John Benjamins. Wright, Roger. 2002. A sociophilological study of late Latin. [Utrecht Studies in Medieval Literacy 10]. Turnhout: Brepols.

chapter 6

The grammaticalization of nominalizers in Japanese and Korean A contrastive study* Kaoru Horie Tohoku University The paper examines the grammaticalization patterns of sentential nominalizers in Japanese and compares them with those found in Korean. The analysis of the nominalization system of Japanese illustrates the multifunctional character of such forms and the existence of differential degrees of grammaticalization among the various nominalizers. Moreover, the comparison of the behaviour of the Japanese nominalizers no and tokoro with their Korean counterparts kes and tey reveals that the former are far more versatile and have evolved to a more advanced stage of grammaticalization. The paper also suggests that the tendency for Korean to exhibit a lesser degree of grammaticalization than Japanese manifests itself in domains other than that of sentential nominalizers, such as the development of periphrastic aspectual constructions.

1. Introduction Nominalization in Japanese, as in many other languages, is an intricate phenomenon encompassing morphosyntactic, semantic and pragmatic dimensions. What makes nominalization in Japanese even more intriguing as an object of inquiry are (i) its diachrony, particularly in relation to its grammaticalization process (Horie 1997, 1998a; Simpson and Wu 2001; Simpson 2003), and (ii) its typological and areal linguistic properties (cf. Noonan 1997; Horie 1998b, 2000).

* This study was partly supported by a grant from the Japan Society for the Promotion of Science (#15520242). Special thanks are due to the two anonymous reviewers and the editors of this volume for their penetrating and very helpful comments. Thanks also go to the participants of the workshop on the grammaticalization of nominalizers in East Asian languages at New Reflections on Grammaticalization 3 (July 2005, Santiago de Compostela, Spain), particularly Foong Ha Yap and Mickey Noonan, and the participants of the 4th Meeting of the Japanese– Korean Contrastive Study Seminar (November 2005, University of Tokyo, Japan), particularly Naoki Ogoshi, and Joung-Min Kim, for constructive criticism and comments.

170 Kaoru Horie

The present study analyses nominalization in Japanese as a prominent grammatical feature of the language which has evolved since ancient times into its current complex system, which it shares with other languages in East Asia. Specifically, this study addresses the following two questions: a. What kind of grammaticalization patterns do Japanese nominalizers exhibit? b. How are the grammaticalization patterns observed in Japanese different from those found in Korean, an East Asian language with similar typological characteristics to Japanese? The organization of the paper is as follows. Section 2 offers a brief historical background of nominalization in Japanese. Section 3, then, presents an analysis of the grammaticalization of Japanese nominalizers as contrasted with that of Korean nominalizers and explores the theoretical implications of the crosslinguistic findings. Finally, Section 4 presents the conclusions.

2. Nominalization in Japanese: Historical background One of the most remarkable features of Japanese grammar, arguably on a par with, or even more prominent than, well-known grammatical phenomena like topicalization and honorification, is the extensive use of sentential nominalizers. Sentential nominalizers in Japanese appear not only in intra-sentential position, such as argument positions, where they mark complement clauses, but also in sentencefinal position, either independently or together with the copulas da/dearu/desu ‘to be’ (plain style/formal style/polite style), where they convey a variety of modal and aspectual meanings. Maynard (1997: 113), in reference to a Japanese sentence from Abe (1968: 6) involving nominalization (cf. (1) below) and its English non-nominalized counterpart (1′) from Saunders’ (1969: 6) translation, makes the following insightful observation: “[t]here is a distinct preference for nominalized expressions, at least in some Japanese discourse. When original Japanese and English translations are compared, there are many cases where the Japanese writer uses nominal clauses, but the English translator does not” [emphasis added].

(1) Modern Japanese Kono ame-ga agare-ba moo sugu natu na no daroo. this rain-nom let.up-cond already soon summer be nmlz would.be ‘(lit.) It would be that, when the rains let up, it will be summer soon.’

(1′) Soon, when the rains let up, it would be summer.

The grammaticalization of nominalizers in Japanese and Korean

In fact, the non-nominalized counterpart to (1) is also possible in Japanese, as shown in (1′′). (1′′) Modern Japanese Kono ame-ga agareba moo sugu natu daroo. As has been discussed at length by a number of Japanese grammarians (for example, Kuno 1973, among many others), there are several semantico-pragmatic differences between (1) and (1′′), such as presence (nominalized version) versus absence (non-nominalized version) of a preceding linguistic or non-linguistic context which would lead the speaker to conclude that summer will come soon. Although I will not go into details here concerning the semantico-pragmatic differences between sentences featuring nominalization or not, it is important to note that nominalization by overt sentential nominalizers like no is not a marked rhetorical strategy; on the contrary, it is observed extensively across genres, both in speech and writing. The preference for nominalization, noted by Maynard (1997), is not an innovation in Modern Japanese, but has its roots in the history of the language. This is discussed briefly in Section 2.1 below.

2.1 Nominalization in Classical Japanese Some morphosyntactic characteristics of Classical Japanese have been lost completely. Notable among them is the distinction between attributive (nominal) predicate form and conclusive (sentence-final) predicate form, as shown in (2a) and (2b), respectively.

(2) Classical Japanese a. Oturu tori. fall.attr bird ‘a falling bird’ b. Tori otu. bird fall.concl ‘A bird falls.’

As well as modifying nouns, as in (2a), attributive (nominal) forms in Classical Japanese, were also employed as a means of forming a nominalized clause in argument position (complement clause) or adjunct position (adverbial clause) without any overt nominalizer, as in (3).

171

172 Kaoru Horie

(3) Classical Japanese [Tori oturu]-wo mi-tari. [bird fall.attr-acc see-prf ‘I saw a bird falling.’

Attributive forms were also employed in sentence-final position when the sentence presented the emphatic or focus particles zo or namu, or the interrogative particles ya or ka in pre-final position, as in (4).1

(4) Classical Japanese Ware-nomi-zo kimi-ni-wa kouru. I-only-focprt you-dat-top yearn-attr ‘It is I alone who yearn for thee.’

These sentences ending in attributive forms in Classical Japanese served some rhetorical or pragmatic functions and are referred to as kakari musubi bun (‘focus-concord sentence’) in traditional Japanese grammar. Attributive forms could, moreover, be employed in sentence-final position even when no focus or interrogative particle was present, as in (5).

(5) Classical Japanese Tiru-to mite / aru-beki mono-wo / ume-no-hana // utate scatter-quot see.gerv exist-should thing-prt plum-gen-flower very nioi-no / sode-ni tomare-ru. scent-gen sleeve-loc lay.prf-attr ‘It would have been best simply to watch them scatter—now, alas, the scent from the blossoms of the plum still lingers upon my sleeve.’ (Iwasaki 2000: 246–247)

Sentences of this kind ending in attributive forms without any overt particle served to convey the writer’s emotive or affective stance toward the proposition expressed in the sentence, such as vivid recollection of a past event experienced by the writer. These are referred to as rentai syuusi bun (‘attributive-final sentences’) in traditional Japanese grammar. The existence of examples like (4) and (5) ending in attributive forms suggests that, even in Classical Japanese, it was not uncommon for sentences to end with some kind of nominalization, similar to nominalized sentences in Modern Japanese, as in (1) above. In fact, sentences ending in attributive forms in Classical Japanese became the norm rather than the exception because of a morphosyntactic 1. Examples (4) and (5) are from Iwasaki (2000); glosses are partially modified.

The grammaticalization of nominalizers in Japanese and Korean 173

change whereby conclusive (sentence-final) forms (e.g. (2b)) were replaced by attributive forms (e.g. (2a)). Iwasaki (2000: 243) notes that [t]he distinction between Conclusive and Attributive forms gradually disappeared, and by the 14th century, the old Conclusive form of some verbs had been replaced by the Attributive form. In other words, the same (Attributive) form now functions both as the Conclusive and as the Attributive. [emphasis added]

Attributive forms in Modern Japanese have thus become indistinguishable from conclusive forms, as shown in (2a’) and (2b’).

(2) Modern Japanese a′. otiru tori fall.attr bird ‘a falling bird’ b′. Tori-ga otiru. bird-nom fall ‘A bird falls.’

From the perspective of grammaticalization, the replacement of conclusive forms by attributive forms suggests that attributive forms extended their functional domain over time. Some of the functions served by Classical Japanese attributive forms, particularly their complement-marking function (cf. example (3) above) and pragmatic or rhetorical functions (cf. examples (4) and (5)) have thus come to require more overt nominalization marking, that is, overt nominalizers such as no. Section 3 below presents a contrastive analysis of the grammaticalization of overt nominalizers in Japanese and Korean in an attempt to answer the questions posed in Section 1 above.

3. Grammaticalization of nominalizers in Japanese and Korean The sentential nominalizing function of Classical Japanese attributive forms, which replaced conclusive forms (cf. Section 2.1 above), has its direct descendant in Modern Japanese, in a non-overt nominalization process referred to as “direct nominalization” (cf. Martin 1975), as in (6).

(6) Modern Japanese [Tometa]-ni mo kakawarazu, dete it-ta. [stop.pst.attr-dat also concern.neg leave.gerv go-pst ‘Though I stopped, he departed.’

174 Kaoru Horie

This nominalization process is, however, no longer productive in Modern Japanese and is limited to “relics” (Harris and Campbell 1995) or conventionalized idiomatic expressions, such as -ni mo kakawarazu ‘notwithstanding’ in example (6) above. Barring such relics, Modern Japanese needs to employ overt sentential nominalizers like no, as in (3′), the Modern Japanese equivalent of (3). (3′) Modern Japanese [Tori-ga otiru no]-wo mi-ta. [bird-nom fall.attr nmlz-acc see-pst ‘I saw a bird falling.’

3.1 Grammaticalization of overt nominalizers in Japanese Overt sentential nominalizers in Modern Japanese, referred to as keisiki meisi (lit. ‘formal noun’) in traditional Japanese grammar, either have no confirmed lexical origin, as is the case with no (its earliest documented function being that of genitive), or originated from lexical nouns, as is the case with koto (< ‘matter, event’), tokoro (< ‘place’), mono (< ‘thing’) or wake (< ‘reason’). All these overt nominalizers constitute a finite set of forms which are stratified in three groups according to their different degrees of grammaticalization, that is, their degrees of functional extension and semantic generalization: a. Lexical meaning virtually absent; functional extension widest of all the nominalizers: no (cf. also Horie 1998b; Yap et al. 2004; Nishi 2006). b. Very general or abstract lexical meaning; rather wide functional extension: mono (< ‘thing’), koto (< ‘matter, event’), tokoro (< ‘place’). c. More specific lexical meaning; limited functional extension: wake (< ‘reason’), yoo (< ‘appearance’), moyoo (< ‘design, likelihood’), kanzi (< ‘feeling’), etc. What seems to be rather special about the grammaticalization of overt nominalizers in Japanese is their multiple grammaticalization pathways. As shown below, these nominalizers have developed a variety of grammatical uses, namely (i) complementizer; (ii) cleft construction marker; (iii) marker of internally headed relative clauses; (iv) conjunction; (v) modal and aspectual constructions marker and (vi) sentence-final particle. In what follows, I will provide representative examples for each grammatical use, and specify which of the three groups of overt nominalizers above have developed the uses in question, since, as already mentioned, not all nominalizers behave alike in terms of their functional extension. (i) Complementizer. Three nominalizers, no in group (a) and koto and tokoro in (b), have evolved into complementizers, as shown in (7a–b) below.

The grammaticalization of nominalizers in Japanese and Korean 175

(7) Modern Japanese a. [Kodomo-ga nai-te i-ru no]-o mi-ta. [child-nom cry-gerv exist-prs nmlz-acc see-pst ‘I saw a child crying.’ b. [Kodomo-ga nai-te i-ru tokoro]-o mi-ta. [child-nom cry-gerv exist-prs nmlz-acc see-pst ‘I saw a child crying.’

(ii) Cleft construction marker. Only one nominalizer, namely no in group (a), has evolved into a cleft construction marker. Consider example (8).

(8) Modern Japanese [Souru-ni tootyakusi-ta] no-wa sanzi desi-ta. [Seoul-to arrive-pst nmlz-top three.o’clock cop-pst ‘It was at three o’clock that I arrived in Seoul.’

(iii) Marker of internally headed relative clauses. In this particular type of relative clause, in contrast to regular externally headed relative clauses, the clause head occurs internally. Only one nominalizer, no in group (a), has fully developed into a marker of internally headed relative clauses. Another nominalizer, tokoro in group (b), can serve this function with only a limited number of matrix verbs, such as tukamaeru (‘catch, arrest’). Thus, for instance, it would be unacceptable in an example such as (9a). The co-occurrence restriction is absent for no, which could appear in (9b). This suggests that the lexical meaning of tokoro, though fairly abstract, has not been bleached completely.

(9) Modern Japanese a. [Tukue-no ue-ni ringo-ga aru no/*tokoro]-o tabe-ta. [desk-gen top-loc apple-nom exist nmlz-acc eat-pst ‘I ate the apple, which/as it was on the desk.’ (lit. ‘I ate that an apple was on the desk.’) b. [Doroboo-ga heya-kara dete ki-ta tokoro/no]-o tukamae-ta. [thief-nom room-from come out-pst nmlz-acc catch-pst ‘I caught a thief who/as he was coming out of the room.’

(iv) Conjunction. The nominalizers no, mono, koto and tokoro have evolved into conjunctions either by merging with co-occurring case particles, such as nominative ga, accusative o, dative/locative ni and instrumental/locative de, or independently, which is the case for tokoro only. The list of conjunctions includes the following forms: in group (a), no-de (‘because’), no-ni (‘though’), etc.; in group (b), mono-o (‘though’), mono-no (‘though’), koto-de (‘because’), tokoro-de (‘no matter how, incidentally’), tokoro-ga (‘contrary to one’s expectation, however’), tokoro (‘in a situation where, as it happens’), etc. Examples are given in (10a–b).

176 Kaoru Horie

(10) Modern Japanese a. [Yohukasi si-ta no-de], neboosi-te simat-ta. [sit.up late-pst nmzl-ins oversleep-gerv end.up-pst ‘Because I sat up late, I overslept [and I could not help it].’ b. [Kyoositu-ni it-ta tokoro], kyuukoo dat-ta. [classroom-to go-pst nmlz cancellation cop-pst ‘When I went to a classroom, I found [to my surprise] that the class was cancelled.’ (v) Marker of modal and aspectual constructions. Nominalizers in all three groups (a)–(c) have developed this use by merging with the copula da: in group (a), no-da (causality, ‘it is that, because’); in group (b), mono-da (moral obligation, past habit), koto-da (moral obligation), tokoro-da (progressive, proximative or immediate past), etc.; finally, in group (c), wake-da (explanation), yoo-da (inferential), moyoo-da (inferential), and so on. Consider the examples in (11a–c) below. (11) Modern Japanese a. A, ame-ga hut-ta n da. oh rain-nom fall-past nmlz cop ‘[I infer from some evidence that] it rained.’ (lit. ‘It is that it rained.’) b. Yoku mukasi-wa umi-ni oyogi-ni it-ta mono-da. well past-top sea-to swim.conj-to go-pst nmlz-cop ‘In the past I would go to sea for swimming.’ c. Ame-ga hut-ta yoo-da. rain-nom fall-pst nmlz-cop ‘It appears to have rained.’ (vi) Sentence-final particle (with discourse-pragmatic meaning). Nominalizers in all three groups (a)-(c) have developed this use: (a) no (offering explanation or confirmation); (b) mono (conveying opinion), koto (strong advice); (c) wake (offering explanation or justification). (12) Modern Japanese a. Hayaku iku no. quickly go nmlz ‘Will you leave early?’ (lit. Is it that you leave early?) or ‘Please leave soon.’ (lit. It is that you leave quickly.) b. Sira-nai mon. know-neg nmlz ‘I don’t know [I assert this no matter what you say].’

The grammaticalization of nominalizers in Japanese and Korean 177

c. Asoko-ni ikoo tte itta wake. that.place-to go-intt quot said nmlz ‘I said why not go there, you know.’ An interesting question which arises here is whether the grammaticalization patterns (i)–(vi) found in Japanese nominalizers are typologically common or, on the contrary, language-specific. As noted by Heine and Kuteva (2002: 326) and other researchers in grammaticalization studies, it is not uncommon for languages to evolve complementizers from lexical nouns with generic meanings like ‘thing.’ However, this grammaticalization pathway does not apply straightforwardly to Japanese, as the most relevant nominalizer mono (< ‘thing’) has not developed the complementizer function. Instead, Japanese presents some grammaticalization pathways not widely documented in other languages, such as ‘place’ > complementizer (see 7b) and ‘place’ > conjunction (see 10b), which apply to the nominalizer tokoro, or ‘thing’ > sentence-final particle (see 12b), which applies to the nominalizer mono (cf. also Horie and Sassa 2000). Moreover, it is not uncommon for some languages, particularly those in East Asia, to exhibit a grammaticalization pathway from nominalizers to pragmatic markers (cf. Yap et al. 2004). In fact, as shown in the overview of nominalizers in East Asian languages by Yap and Matthews (this volume), many of the pathways documented in Japanese have also been observed in other languages in the same region, which points to the areal nature of versatile nominalization. However, it appears that the multiple grammaticalization pathways (i)-(vi) observed in Japanese are not necessarily common even among East Asian languages where such phenomena could be expected. In order to assess correctly the grammaticalization pathways of Japanese nominalizers, I will compare them with their counterparts in Korean, a language known to exhibit remarkable structural similarity to Japanese relative to other languages (cf. Horie 1998a, 1998b, 2000, 2002a, 2002b, 2003; Horie and Taira 2002).

3.2 Grammaticalization of overt nominalizers in Korean Similarly to Japanese, Korean has a group of nominalizers (or dependent nouns) historically derived from lexical nouns (cf. Rhee, this volume). As with the Japanese nominalizers presented in Section 3.1 above, the classification of Korean nominalizers is stratified in terms of differential degrees of grammaticalization, that is, degrees of functional extension and semantic generalization. One noticeable difference between the overt nominalizers in Korean and their Japanese counterparts is that Korean does not appear to have any nominalizer which has reached the same stage

178 Kaoru Horie

of grammaticalization as Japanese no (i.e. lexical meaning virtually absent, functional extension widest of all the nominalizers). The stratification of overt Korean nominalizers is therefore restricted to groups (b) and (c): (b) very general or abstract lexical meaning and rather wide functional extension: kes (< ‘thing’); (c) more specific lexical meaning and/or limited functional extension: tey (< ‘place’), il (< ‘event, matter’), cek (< ‘event’), moyang (< ‘appearance’), ttaymwun (< ‘reason’), etc. Let us consider now the different grammatical uses of overt nominalizers in Korean. (i) Complementizer. Only one Korean nominalizer, namely kes, from group (b), has evolved into a complementizer (cf. example (13) below). (13) Korean [Phiano-lul chi-nun kes]-ul tul-ess-ta. [piano-acc play-attr.prs nmlz-acc hear-pst-decl ‘I heard him playing the piano.’ (ii) Cleft construction marker. Once again, only the nominalizer kes has evolved into the marker of cleft constructions, as shown in (14). (14) Korean [Chayksang wi-ey iss-nun kes]-un sakwa i-ta. [desk top-loc exist-attr.prs nmlz-top apple be-decl ‘It is an apple that is on the desk.’ However, kes is not felicitous in a sentence equivalent to (8) above in Japanese. (8′) Korean ?[Sewul-ey tochakha-n kes]-un seysi i-ess-ta. ?[Seoul-to arrive-attr.pst nmlz-top three.o’clock be-pst-decl ‘It was three o’clock that I arrived in Seoul.’ [Intended meaning: ‘It was at three o’clock that I arrived in Seoul.’] This indicates that Korean kes has not reached the same stage of semantic generalization as its closest counterpart no in Japanese. In other words, its remaining lexical meaning prevents kes from serving as a generalized cleft construction marker, in contrast to Japanese no or English that. (iii) Marker of internally headed relative clauses. Unlike its Japanese counterpart no in example (9a) above, kes is not fully acceptable in this construction (9a’), as was the case with the low acceptability of its use in cleft constructions in (8′).2 2. Cf. Horie (1993) for a more extensive analysis.

The grammaticalization of nominalizers in Japanese and Korean 179

(9a′) Korean ?[Chayksang wi-ey sakwa-ka iss-nun kes]-ul mek-ess-ta. ?[desk top-loc apple-nom exist-attr.prs nmlz-acc eat-pst-decl ‘I ate the apple, which/as it was on the desk.’ (lit. ‘I ate that an apple was on the desk.’) (iv) Conjunction. Unlike their Japanese counterparts, few Korean overt nominalizers with lexical origin have evolved into conjunctions, either independently or by merging with co-occurring morphemes. A possible exception is tey ( < ‘place’), from group (c), which has formed conjunctions by coalescing with the present attributive ending -nun, as in nuntey (‘but, because, as, etc.’), or with the attributive form of the stative predicate kuleta (‘to be so’), to give kulentey (‘however’), illustrated in example (15) below. (15) Korean Kulentey, imi syethe-ka tathye iss-ess-ta. nmlz already shutter-nom be.closed exist-pst-decl ‘But the shutter (of the bank) was already closed.’ (v) Marker of modal and aspectual constructions. Like their Japanese counterparts, many Korean nominalizers have evolved this use by merging with the copula -ita, such as kes-ita (explanation, inference, suggestion; cf. Kim and Horie, in press), from group (b), or moyang-ita (inferential), cham-ita (proximative), etc., from group (c). Consider the examples in (16a–b) below. (16) Korean a. Hoyuy-ey chamsekha-l kes-ita. meeting-to attend-attr.fut nmlz-cop.decl ‘He will attend the meeting.’ b. Kim sensayngnim-kkey cenhwaha-l Kim professor-dat.hon telephone-attr.fut cham-i-ess-e-yo. verge-cop-past-conj-sfprt.pol ‘I was just about to call Professor Kim.’ (vi) Sentence-final particle (with discourse-pragmatic meaning). Some Korean nominalizers with lexical origin have evolved into sentence-final particles, among them the following: kel (< kes + lul ‘nmlz-acc’) and nuntey (< nun + tey ‘Attributive Present ending + place’), from groups (b) and (c), respectively, as illustrated in (17a–b).

180 Kaoru Horie

(17) Korean a. Ne-to kathi ka-l kel. you-also together go-attr.fut nmlz ‘You should have gone with them.’ b. Ku chayk-un yeki eps-nuntey-yo. that book-top here not.exist-nmlz-pol ‘We don’t carry that book here [I am sorry to say].’ The crosslinguistic comparison between Japanese nominalizers and their Korean counterparts presented above reveals that some pathways found in Japanese are not attested in Korean or are attested but to a lesser extent. Conversely, no grammaticalization pathway has been observed only with Korean nominalizers. More specifically, the crosslinguistic comparison between the Japanese nominalizer no and its Korean counterpart kes presented in this section has shown that, in spite of the extensive parallelism between the two forms, kes has not been as fully grammaticalized as no, since some pathways found with Japanese no are not attested at all for Korean kes, or are attested only to a lesser degree. Though less noteworthy than the contrast between no and kes, another pair of nominalizers, tokoro and tey, both originating as lexical nouns meaning ‘place,’ exhibit a similar contrast to that observed between no and kes in terms of their degree of grammaticalization.3 Japanese tokoro has developed more grammatical uses (complementizer, internally headed relative clause marker, conjunction and marker of aspectual constructions) than Korean tey, which can only function as a conjunction and a sentence-final particle. Table 1 summarizes the nominalizers found in Japanese and Korean, with specification of their functions.

Table 1. Japanese and Korean nominalizers (forms in brackets have not been fully grammaticalized in that function) Complementizer Cleft construction marker Marker of internally headed relative clauses Conjunction Modal and aspectual construction marker Sentence-final particle

Japanese

Korean

no, koto, tokoro no no, (tokoro) no, mono, koto, tokoro no, mono, koto, tokoro, etc. no, mono, koto, wake, etc.

kes (kes) (kes) tey kes, etc. kes, tey, etc.

3. See Horie and Sassa (2000) for a more extensive analysis of the grammaticalization pathways of tokoro and tey.

The grammaticalization of nominalizers in Japanese and Korean 181

3.3 Theoretical implications of the Japanese–Korean contrast The grammaticalization patterns of overt nominalizers presented in Sections 3.1 and 3.2, particularly those exhibited by the Japanese nominalizers no and tokoro and their Korean counterparts kes and tey, suggest a higher degree of grammaticalization in the former than in the latter. This contrast leads to an interesting theoretical question regarding the degree of grammaticalization in languages of a similar morphological type. Bybee et al. (1994: 118) propose the following hypothesis: The existence of (morphological) types gives evidence for typological constraints on grammaticization: in some languages grammaticization does not proceed as far as it does in others. In particular, isolating languages do not carry grammaticization as far as fusional or agglutinating languages do. Not only do they not affix, they also do not have grams with meaning as abstract as synthetic languages do. The stability of certain isolating languages, such as Chinese, over time further attests to typological constraints on grammaticization.

Bybee et al.’s hypothesis makes it possible to predict that grammaticalization in an isolating language like Chinese is unlikely to proceed as far as in an agglutinating language like Japanese. The contrast between the grammaticalization patterns of Japanese and Korean nominalizers shown here suggests that the rate at which grammaticalization proceeds may vary even between languages of the same morphological type. Crucially, the tendency for Korean to exhibit a lesser degree of grammaticalization than Japanese is manifested not only in the grammaticalization of overt nominalizers, as seen in Sections 3.1 and 3.2, but also in the grammaticalization of verbal elements, such as periphrastic aspectual constructions. According to Bybee et al. (1994), grammatical morphemes (or grams in their terminology) encoding tense, aspect and modality do not evolve randomly from any lexical item, but manifest common grammaticalization pathways crosslinguistically from lexical items with specific lexical meaning, among them the following: (i) ‘be/have’ > resultative > anterior (= perfect) > perfective/simple past; (ii) ‘finish’ > completive > anterior (= perfect) > perfective/simple past. The pathway in (i) is attested in both Japanese and Korean:4 a. Japanese: -te ari (conjunctive affix-‘be’ (existential) verb) > -tari (affix: resultative/ perfective) > -ta (affix: perfective/simple past). b. Korean: -e is-/-e isi- (conjunctive affix-‘be’ (existential) verb) > -eys-/-eysi- (affix: resultative/perfective) > -ess- (affix: perfective/simple past). 4. Constituent lexical item indicated by bold type.

182 Kaoru Horie

The affixes -ta and -ess- have already reached the final stage of the pathway since they can encode simple past tense. Japanese and Korean also have less grammaticalized periphrastic aspectual constructions whose constituent lexical items (in this case verbs) have not yet been semantically bleached completely. This is the case with -te iru in Japanese and its Korean counterparts -ko issta and -a/-e issta, as well as with -te simau in Japanese and its Korean counterpart -a/-e pelita:5 a. Japanese: -te iru (progressive/resultative/perfect), -te simau (completive).6 b. Korean: -ko issta (progressive), -a/-e issta (resultative), -a/-e pelita (completive). These periphrastic aspectual constructions still have formally identifiable lexical constituents, namely the ‘be’-verbs iru and issta and the ‘put away/finish’-verbs simau and pelita, and have not reached the final stage of grammaticalization, unlike -ta and -ess- above. What is interesting from a comparative perspective is that, in addition to the contrast in grammaticalization patterns between Japanese no/tokoro and Korean kes/tey, Japanese aspectual constructions attest a more advanced stage of grammaticalization than their Korean counterparts. Japanese -te iru is known to have developed multiple aspectual meanings including progressive (18), resultative (19a) and perfect (‘anterior’ in Bybee et al.’s 1994 terminology) (19b), consistent in part with the aforementioned grammaticalization pathway: ‘be/have’ > resultative > anterior (= perfect) > perfective/simple past. (18) Modern Japanese Dareka-ga nokku si-te ru yo. someone-nom knock-prog sfprt ‘Someone is knocking on the door.’ (19) Modern Japanese a. Titi-wa sono T syatu-o ki-te iru. father-top that T shirt-acc wear-res ‘My father wears that T-shirt.’

5. The verbs included in the periphrastic aspectual constructions, in bold type, are given in citation form. 6. The term completive aspect refers to the aspectual meaning “to do something thoroughly and to completion” (Bybee et al. 1994: 318).

The grammaticalization of nominalizers in Japanese and Korean 183

b. Titi-wa nando ka sono T syatu-o ki-te iru. father-top several times that T shirt-acc wear-prf ‘My father has worn that T-shirt several times.’ In contrast, Korean does not have a single periphrastic aspectual construction to cover the multiple meanings expressible by -te iru. A dynamic progressive meaning is encoded by -ko issta (20) and a stative resultative meaning by -a/-e issta (21), the latter co-occurring only with “a limited number of intransitive verbs” (Chang 1996: 124). (20) Korean Bongsik-i cikum talli-ko iss-ta. Bongsik-nom now run-prog-decl ‘Bongsik is now running.’ (21) Korean Bongsik-i uyca-ey anc-a iss-ta. Bongsik-nom chair-loc sit-res-decl ‘Bongsik is seated in a chair.’ However, neither of these periphrastic constructions in Korean has developed a general perfect meaning similar to that expressed by Japanese -te iru, which suggests that -te iru has reached a more advanced stage of grammaticalization than its Korean counterparts. Interestingly, as discussed at length by Wako et al. (2003), Korean -ko issta has started to take on perfective aspectual meaning when it cooccurs with reporting verbs, as in (22), and attainment verbs, as in (23), particularly in newspaper articles and editorials. (22) Korean Kath-un kwunin chwulsin-ulo amithici pwucangkwan-kwa same-attr soldier background-as Armitage vice-secretary-with twul-to eps-nun sai-i-n khollin phawel two-also not.exist-attr relation-cop-attr Collin Powell kwukmwucangkwan-un pimanglok-eyse ku-ey tayha-n State.Secretary-top memoire-loc he-regarding-attr chesinsang-ul ilehkey cek-ko iss-ta. first.impression-acc like.this note-prf-decl ‘In his memoir, State Secretary Collin Powell has noted his first impression of Vice-Secretary Armitage, also an ex-soldier whose presence is unlike any other to him, like this.’ (from Wako et al. 2003: 69, partially modified)

184 Kaoru Horie

(23) Korean Yecapaksa-uy yenphyengkyun cungkayul-nun namcapaksa-uy female.PhD-gen annual.average increase.rate-top male.PhD-gen twupay-ey talha-ko iss-ta. twice-to reach-prf-decl ‘The annual average increase in the number of female Ph.D. recipients has almost doubled [lit. ‘has reached the double size of ’] that of male Ph.D. recipients.’ (from Wako et al. 2003: 69, partially modified) A grammaticalization pathway from progressive to perfect has not been reported in grammaticalization literature and deserves further research. For our present concern, however, it suffices to note that such genre-sensitive perfective meaning of -ko issta is at best in its incipient stage of grammaticalization and has not yet generalized to other semantic classes of verbs and genres, unlike -te iru in Japanese. Finally, completive aspectual constructions in Japanese and Korean, -te simau and -a/-e pelita respectively, show a similar contrast in terms of their degree of grammaticalization. As shown in Table 2 (from Strauss and Sohn 1998: 221, partially modified), -te simau has reached a more advanced stage of grammaticalization than -a/-e pelita, having evolved from a lexical verb meaning ‘finish, put away’ through the stage of a (completive) aspectual marker and into that of an emphatic or affective marker. The latter stage, characterized by formal reduction (-te simau > -tyau), may further lead to the loss of such emphatic or affective meaning. In contrast, Korean -a/-e pelita has not advanced to these posterior developmental stages. The two case studies of grammaticalization of periphrastic aspectual constructions in Japanese and Korean presented in this section demonstrate that, as in the case of overt nominalizers, discussed in Sections 3.1 and 3.2 above, periphrastic aspectual constructions also show a tendency for grammaticalization to proceed more extensively in Japanese than in Korean. This suggests that the degree or rate of grammaticalization can differ between languages of a similar morphological type and that the differing grammaticalization patterns are manifested across different grammatical domains. Table 2. Grammaticalization paths for Japanese -te simau and Korean -a/-e pelita I

II

III

IV

Physical domain

Aspectual marker

simau ‘put away’ >

-te simau >

pelita ‘throw away’ >

-a/-e pelita >

Emphatic or affective Light or no emphasis marker -te simau/-tyau > -tyau (semi-productive) -a/-e pelita (rare) > limited cases (not productive)

The grammaticalization of nominalizers in Japanese and Korean 185

4. Conclusion This paper has presented an analysis of the grammaticalization patterns of sentential nominalizers in Japanese as contrasted with those of Korean sentential nominalizers. In spite of the morphosyntactic typological similarities between the two languages, a striking crosslinguistic contrast emerges in that Japanese nominalizers exhibit a more advanced stage of grammaticalization than their Korean counterparts. This suggests that the degree of grammaticalization may vary even between languages of a similar morphological type. Furthermore, this paper also relates the differential grammaticalization patterns observed in the nominal domain to a similar contrast in the verbal domain, namely the grammaticalization patterns of periphrastic aspectual constructions in Japanese and Korean. It has thus shown that the contrastive patterns of grammaticalization between these two languages are observable across different grammatical domains. Finding out where such a crosslinguistic contrast in grammaticalization patterns comes from is next on our agenda.

Abbreviations acc attr concl cond conj cop dat decl focprt fut gen gerv hon ins intt

accusative attributive conclusive conditional conjunctive copula dative declarative focus particle future genitive gerundive honorific instrumental intention

loc neg nmlz nom pol prf prog prs prt pst quot res sfprt top

locative negative nominalizer nominative polite perfect progressive present particle past quotative resultative sentence final particle topic

References Abe, Kooboo. 1968. Tanin no Kao (Face of another). Tokyo: Sintyoosya. Bybee, Joan L., Revere Perkins & William Pagliuca. 1994. The evolution of grammar: Tense, aspect and modality in the languages of the world. Chicago IL: University of Chicago Press. Chang, Suk-Jin. 1996. Korean. Amsterdam: John Benjamins.

186 Kaoru Horie

Harris, Alice C. & Lyle Campbell. 1995. Historical syntax in cross-linguistic perspective. Cambridge: CUP. Heine, Bernd & Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge: CUP. Horie, Kaoru. 1993. Internally headed relative clauses in Korean and Japanese: Where do the differences come from? In Susumu Kuno, John Whitman & Young-Se Kang (eds.), Harvard studies in Korean linguistics 5, 449–458. Seoul: Hanshin Publishers. Horie, Kaoru. 1997. Three types of nominalization in Modern Japanese: No, koto and zero. Linguistics 35(5): 879–894. Horie, Kaoru. 1998a. Functional duality of case-marking particles in Japanese and its implications for grammaticalization: A contrastive study with Korean. In David Silva (ed.), Japanese/Korean linguistics 8, 147–159. Stanford CA: CSLI. Horie, Kaoru. 1998b. On the polyfunctionality of the Japanese particle no: From the perspectives of ontology and grammaticalization. In Toshio Ohori (ed.), Studies in Japanese grammaticalization: Cognitive and discourse perspectives, 169–192. Tokyo: Kurosio Publishers. Horie, Kaoru. 2000. Complementation in Japanese and Korean: A contrastive and cognitive linguistic approach. In Kaoru Horie (ed.), Complementation: Cognitive and functional perspectives, 11–31. Amsterdam: John Benjamins. Horie, Kaoru. 2002a. Verbal nouns in Japanese and Korean: Cognitive typological implications. In Kuniyoshi Kataoka & Sachi Ide (eds.), Culture, interaction, language, 77–101. Tokio: Hituzi Syobo. Horie, Kaoru. 2002b. A comparative typological account of Japanese and Korean morpho-syntactic contrasts. Eonehag 32: 9–32. Horie, Kaoru. 2003. What cognitive linguistics can reveal about complementation in non-IE languages: Case studies from Japanese and Korean. In Eugene H. Casad & Gary B. Palmer (eds.), Cognitive linguistics and non-Indo European languages, 363–388. Berlin: Mouton de Gruyter. Horie, Kaoru and Yuko Sassa. 2000. From place to space to discourse: A contrastive linguistic analysis of tokoro and tey. In Mineharu Nakayama & Charles J. Quinn (eds.), Japanese/ Korean linguistics 9, 181–194. Stanford CA: CSLI. Horie, Kaoru & Kaori Taira. 2002. Where Korean and Japanese differ: Modality vs. discourse modality. In Noriko Akatsuka & Susan Strauss (eds.), Japanese/Korean linguistics 10, 178–191. Stanford CA: CSLI. Iwasaki, Shoichi. 2000. Suppressed assertion and the functions of the final-attributive in prose and poetry of Heian Japanese. In Susan C. Herring, Pieter van Reenen & Lene Schøsler (eds.), Textual parameters in older languages, 237–272. Amsterdam: John Benjamins. Kim, Joung-Min & Kaoru Horie. In press. Intersubjectification and textual functions of Japanese noda and Korean kes-ita. In Yukinori Tabuko (ed.), Japanese/Korean Linguistics 16. Stanford CA: CSLI. Kuno, Susumu. 1973. The structure of the Japanese language. Cambridge MA: The MIT Press. Martin, Samuel E. 1975. A reference grammar of Japanese. New Haven CT: Yale University Press. Maynard, Senko K. 1997. Japanese communication: Language and thought in context. Honolulu HI: University of Hawaii Press. Nishi, Yumiko. 2006. The emergence of the complementizer no in Japanese revisited: The dual path approach. In Timothy Vance (ed.), Japanese/Korean linguistics 14, 127–137. Stanford CA: CSLI.

The grammaticalization of nominalizers in Japanese and Korean 187

Noonan, Michael. 1997. Versatile nominalizations. In Joan L. Bybee, John Haiman & Sandra A. Thompson (eds.), Essays on language function and language type. In Honor of T. Givón, 373–394. Amsterdam: John Benjamins. Saunders, Ernest Dale. 1969. Face of another. New York NY: Weidenfeld. Simpson, Andrew. 2003. On the re-analysis of nominalizers in Chinese, Japanese and Korean. In Yen-hui Audrey Li & Andrew Simpson (eds.), Functional structure(s), form and interpretation, 131–160. London: RoutledgeCurzon. Simpson, Andrew & Zoe Wu. 2001. The grammaticalization of formal nouns and nominalizers in Chinese, Japanese and Korean. In Thomas E. McAuley (ed.), Language change in East Asia, 250–283. London: RoutledgeCurzon. Strauss, Susan & Sung-Ock Sohn. 1998. Grammaticalization, aspect and emotion: The case of Japanese -te shimau and Korean -a/e pelita. In David Silva (ed.), Japanese/Korean linguistics 8, 217–230. Stanford, CA: CSLI. Wako, Masakazu, Shigeru Sato and Kaoru Horie. 2003. From progressive to perfect: A corpusbased study of the perfect meaning of the Korean progressive form -ko iss. In William McClure (ed.), Japanese/Korean linguistics 12, 64–74. Stanford CA: CSLI. Yap, Foong Ha, Stephen Matthews & Kaoru Horie. 2004. From pronominalizer to pragmatic marker: Implications for unidirectionality from a crosslinguistic perspective. In Olga Fischer, Muriel Norde & Harry Perridon (eds.), Up and down the cline: The nature of grammaticalization [Typological Studies in Language 59], 137–168. Amsterdam: John Benjamins.

chapter 7

On the frills of grammaticalization* Tania Kuteva University of Dusseldorf The paper shows that contact-induced grammaticalization is frequently accompanied by frills, that is, by linguistic accretion instead of the commonly expected attrition. This fact adds weight to the important insights gained in recent grammaticalization studies that both language-internal and contact-induced grammaticalization can be analysed in terms of the same set of definitional criteria and that contact-induced grammaticalization always entails language-internal grammaticalization and, hence, the same factors operate in both kinds of change. The explanation I propose for linguistic accretion in contact-induced grammaticalization capitalizes on the notion of overlap acknowledged in processes of language-internal grammaticalization. In contact-induced grammaticalization, I show that overlap is manifested both as a diachronic stage of the grammaticalization development and as a synchronic buffer zone in geographical space.

1. Introduction Sometimes language manifests a spectacular number of frills in marking a single category. In a survey of relativization strategies employed in the languages of Africa, Kuteva and Comrie (2001) describe a Mbam-Nkam language of Cameroon, Ngemba, which marks relative clauses by means of no less than five morphosyntactic segments: a. A relative conjunction/determiner (varying for number and nominal class). b. A complementizer marker -bah. c. Pronoun-retention. d. A verbal suffix -ne (a multi-purpose marker for topicalization, nominalization and relativization). e. A sentential definitizer -la (related to the determiner system) (cf. Chumbow 1977: 296–297, 302). Of these, the complementizer -bah is optional, while the other four relativization markers are obligatory. This abundance of marking is illustrated in example (1). * I would like to express my gratitude to John Haiman and the anonymous reviewers of this paper for their insightful suggestions and comments.

190 Tania Kuteva

(1) Ngemba nyung wá bah a-keshung-ne mung wa la a kung atsang. man rel bah he-tns.beat-ne child det la he enter into.prison ‘The man who beat the child went to prison.’ (Chumbow 1977: 290)

The question which arises here is: for what purpose does so much morphemic substance accrue in a case where most of the world languages employ exactly one — no more and no less! — marker (cf. Comrie and Kuteva 2005)? Several proposals have been articulated in the literature on why linguistic material accrues. One of these is the so-called residual quirks theory: a novel structure appears but the old one continues to exist, so that the two are used together within the same expression. The resulting bulking of material in such cases is due to historical residue or, rather, fossilizations of earlier uses; that is, there is an overlap between a historically earlier and a newly emerged marking. Thus, in Present-day Swahili (Bantu, Niger-Congo) relative clauses can take two main forms, namely the amba-form, as in (2a), and the infix form (-o- in the example below), related to the so-called relative pronoun, as in (2b); both examples are from Liner (1977: 269).

(2) Swahili a. Wa-kikuyu amba-o wa-na-fanya kazi kwa bidii c2-kikuyu rel-relpron sbjv-prs-do work with diligence wa-na-fanikiwa. sbjv-prs-successful ‘The Kikuyus who work hard are successful.’ b. Wa-kikuyu wa-na-o-fanya kazi kwa bidii c2-kikuyu sbjv-prs-relpron-do work with diligence wa-na-fanikiwa. sbjv-prs-successful ‘The Kikuyus who work hard are successful.’

Diachronic investigations show that it is as late as the 1950s that the first occurrences of amba-markers are attested. Liner (1977: 272) reports that, in these first attestations from the middle of the twentieth century, there are also double relative forms, i.e. relative clauses containing both amba- and the verb infix, as in (2a) above. Liner (1977: 272–273) proposes that in these cases amba- is used with the infix relative to emphasize the restrictive meaning of the clause, and later it specializes as the marker for restrictive (but not for non-restrictive) clauses. What is relevant to us here is that, by combining the historically earlier, residual relativization marker and the historically later one, these double forms present a case of accumulating linguistic material.

On the frills of grammaticalization 191

Second, accretion may also result from phonological bulking (Matisoff 1978), whereby an old form, subject to heavy phonological attrition, is replaced by a phonologically longer, novel form. Heath’s (1998) account is an argumentation in support of this line of reasoning. He sees grammatical change — or at least much of it — as a process with a conservative nature, as something which restores linguistic structure at a time when the language system “is in distress.” As is well known, a major force leading to the destruction of language is phonological attrition; the reason why language does not deteriorate irrevocably is the constant renewal of linguistic material, the adding of fresh linguistic substance to old, worn-out and dying out forms, a mechanism which perpetuates the existence of grammatical categories. Heath (1998: 733) makes use of a biological metaphor: he compares the linguistic accretion processes which maintain the grammatical categories of language to the behaviour of the well-described biological species Pagurus and Eupagurus, also known as hermit crabs. The latter are soft-bodied crustaceans which survive only by locating and occupying empty shells left on the beach by defunct molluscs; during its lifetime, a single hermit crab discards several shells as they wear out, replacing them with more robust ones, which may be from distinct mollusc species. Hence the term hermit crab processes coined by Heath for the formal renewal of a grammatical affix that has undergone phonetic attrition to the point of being inadequately perceptible or risking complete disappearance. An independent lexical stem with a more robust shape (i.e. more phonetic material), and sharing one or more conspicuous segments such as an initial consonant with the ailing affix is seized on (Heath 1998: 733). As an illustration of a hermit crab process, Heath points out the appearance of the Common Germanic dental preterite, not as the result of a syntax-tomorphology compression of the noun.instrumental + ‘do’ construction, as one of the most influential hypotheses (Lühr 1984) would have it, but as a hermit crab process where the old past passive *-to- participle (Common Germanic *-da) was simply replaced by the inflected, phonologically longer (and similar) forms of the ‘do’ verb (for the details of this proposal, see Heath 1998: 744–750).1 Both the residual quirks theory and the hermit crab theory assume that phonetic substance gets accumulated for some discernible reason, that is, as a result of inert morphosyntactic behaviour in the former case and as a crucial restorative linguistic mechanism in the latter. A totally different and absolutely unexpected account is proposed in Haiman (forthcoming): accretion may happen for no discernible reason at all, just for the hell of it. Language may sometimes bulk linguistic substance, via repeating an element or adding an infix, for instance, as a reflection of what Haiman refers to as decorative imagery (as contrasted to representative 1. The plausibility of Heath’s (1998) explanation per se is of no interest to us here.

192 Tania Kuteva

imagery), that is, for no referential reason, but whereby language users are exercising a natural aesthetic impulse to make their speech more stylish and elegant. Here Haiman draws a parallel with music, where almost all the imagery is presumably of the decorative type, that is, music does not represent or imitate reality, it creates an imagined reality of its own. This playfulness in language can, by no big stretch of the imagination, be traced back to the activity of playing as a characteristic of the behaviour of all vertebrates; this same feature may even be the motivation for the most fundamental property of language, namely displacement (something like ‘remove involuntary symptoms out of context as a playful activity, and what you end up with is signs of language’). One of the most convincing examples analysed by Haiman concerns what he calls a case of “purely aesthetic repetition” in Khmer. In addition to numerous cases of the familiar referential or iconic repetition (used to encode repetition of action, emphasis, symmetry, plurality, etc.), Khmer exhibits doubling in symmetrical compounding, where a word is doubled by means of a synonymous item for no discernible semantic reason at all; the presence of the repeated form only “adds flavor” and makes the style “elegant.” Consider example (3).

(3) Khmer tumngawn (tokkh wetawnie) tleak sawngkawt maawk leu […] weight (suffering suffering fall press come on ‘the weight of suffering fell upon […]’ (Haiman, forthcoming: 9)

Yet another account of accretion relates to the emblematic function of language as one of the symbols — often a very potent symbol — of a particular social group. As Comrie and Kuteva (2005: 201) point out, although there is a crosslinguistic tendency to prefer simple kinds of phonetic segments and sequences, there are nonetheless languages (and language groups, and dialects of languages) which depart strikingly from these norms in one or more areas of their phonology. Examples are the development of click consonants in the Khoisan languages of Southwestern Africa or that of rich vowel systems in the Austronesian (Micronesian) language Nauruan. While such developments can often be initiated by random sound changes, even by the interaction of a number of quite natural ones, once an unusual phenomenon of this kind arises, it can be seized upon by the speech community as an emblem of that speech community, differentiating it from all other speech communities and likely to be emphasized even further, just as a particular pattern in dress or ritual can. Such a phenomenon can thus be maintained quite tenaciously despite its apparent dysfunctionality. In terms of general thinking on evolution, it might be compared with the notion of costly signalling; in terms of more traditional pre-scientific concepts, with that of shibboleth. Finally, accretion may be one of the results of what Joseph (2005) terms lateral shifts. These are changes whereby a grammatical form is altered but not so as to be-

On the frills of grammaticalization 193

come more or less grammatical. In fact, there seems to be no change with respect to degree of grammaticalization, that is, there is no movement either “up or down a cline;” rather, the shift is “at the side,” hence the term lateral. It may happen that such a lateral shift results in bulking of linguistic material, as the following example from Greek illustrates. The person number tense grammatical ending for the third person plural middle past -ondustan in dialectal Modern Greek derived from a shorter, earlier form -ondusan. Joseph (2005) shows that the linguistic accretion, that is, the addition of the -t, brought about by this lateral shift was based on other endings in the paradigm, namely first person plural -mastan and second person plural -sastan. The above accounts emphasize five important factors underlying the phenomenon of linguistic accretion. In the present work, I will identify another, sixth factor which has remained largely unnoticed in the literature so far, namely contactinduced grammaticalization.2 Furthermore, it is my goal to account for linguistic accretion in contact-induced grammaticalization in terms of grammaticalization theory.

2. Contact-induced grammaticalization The traditional assumption has been that grammaticalization constitutes a sub-part of language-internal change, and that a particular grammatical language change can either be attributed to universal, cognitive, language-internal processes — that is, to grammaticalization — or else to language contact.3 This traditional wisdom culminates in the view that grammaticalization, or language-internal change, is natural, whereas contact-induced change, or language external change, is nonnatural: [. . .] linguistic changes may come in two rather different types. Some forms of linguistic change may be relatively ‘natural’, in the sense that they are liable to occur in all linguistic systems, at all times, without external stimulus, because of the inherent nature of linguistic systems themselves — and it is here of course that the stability of the nature of human beings is relevant. Other types of linguistic change, on the other hand, may be relatively ‘non-natural’, in the sense that they take place mainly as the 2. An anonymous reviewer points out that there may be other factors further enhancing contact as responsible for elaborateness accretion; otherwise, contact-induced accretions may disappear as soon as they are developed. One factor may be parsing (cf. n. 8); another may be redundancy in terms of hearer-friendliness. 3. Apart from the few linguists for whom grammaticalization and language contact are not mutually exclusive; cf. Heine (1994); Nau (1995); Bisang (1996); Kuteva (1998); Matras (1998); Croft (2000); Dahl (2000); Stolz and Stolz (2001).

194 Tania Kuteva result of language contact. They are, that is, not due to the inherent nature of language systems, but to processes that take place in particular sociolinguistic situations. (Trudgill 1983: 102)

In other words, grammaticalization and language contact have been seen as factors working against each other, as shown in Figure 1. Grammaticalization

Language contact

Figure 1. The traditional view: grammaticalization vs. language contact

Here I accept the position put forward in Heine and Kuteva (2005) that grammaticalization and contact-induced language change are in no way mutually exclusive; rather, and perhaps more often than not, they jointly conspire in triggering grammatical change. Accordingly, there are two kinds of grammaticalization, languageinternal grammaticalization and contact-induced grammaticalization. Following Heine and Kuteva (2003, 2005), contact-induced grammaticalization is defined as a grammaticalization process which is due to the influence of one language on another. The language exerting the influence will be referred to as the model, while the language influenced by the model language is known as the replica.4 The following two observations, made on the basis of a sizable body of empirical data (Heine and Kuteva 2005), are of crucial importance to the present study: a. Both language-internal and contact-induced grammaticalization can be analysed in terms of the same set of definitional criteria: (i) extension, (ii) desemanticization, (iii) decategorialization and (iv) erosion. b. Contact-induced grammaticalization always involves language-internal grammaticalization (but not vice versa).

3. Elaborateness of marking in high-contact varieties Contact-induced grammaticalization operates in high-contact varieties: the higher the degree of contact between the languages, the higher the number of contactinduced grammaticalization developments. It has been the common practice so far to assume that high-contact varieties are characterized by features such as (i) decrease in size of phonetically-realized inflectional affixes on nouns and verbs, (ii) reduction in redundancy, (iii) filtering out of morphological irregularities and (iv) reduction of grammatical systems, such as conjugations, declensions and in4. It has been argued elsewhere (Heine and Kuteva 2005) why it is not enough to speak of polysemy copying or grammatical calquing the way that Weinreich (1964) talked about it; accordingly, I find it feasible to distinguish between contact-induced grammaticalization and polysemy copying.

On the frills of grammaticalization 195

flected forms. What is most relevant to the present discussion are the features in (i) and (ii), both of which involve what I will call elaborateness of marking (cf. also the notion of verbosity in Dahl 2004a). In other words, the traditional assumption has been that high-contact varieties do not exhibit elaborate marking; on the contrary, they have been commonly believed to be characterized by reduced, impoverished marking. An explanation for this poverty of marking in high-contact varieties has also been around for quite some time. Trudgill (2001: 372) puts the blame on “the lousy language-learning abilities of the human adult” trying to cope with second language acquisition processes (cf. also Trudgill 1983: 102–107). Whereas these processes present no problem for “the amazing language-learning abilities of the human child,” they can be too much of a challenge for the adult language learner. As Trudgill (2001: 372) puts it, [a]dult language contact means adult language learning; and adult language learning means simplification, most obviously manifested in a loss of redundancy and irregularity and an increase in transparency. This can indeed be seen at its most extreme in pidgins and hence in creoles (Trudgill 1996). But it is not confined to this type of language. [emphasis added]

Contrary to the traditional assumption, in what follows I will show that contact-induced grammaticalization may just as well lead to redundancy in marking, that is, linguistic accretion in high-contact varieties. More precisely, I will argue that there are cases of contact-induced grammaticalization which manifest more elaborate marking of an existing grammatical category in a replica language due to the influence of a model language.

4. Contact-induced grammaticalization and bulking of material The simplest way to exemplify the fact that language contact may lead to linguistic bulking and, accordingly, to complexity, is to use a straightforward example from lexical compounding in Khmer. Haiman (forthcoming) shows that, in Khmer, one comes across coordinate combinations where a word from Sanskrit is conjoined with its doublet from Pali, with no change in meaning. An illustrative example is given in (4), where the form peel ‘time’ from Sanskrit is conjoined with its cognate weelie ‘time’ from Pali.

(4) Khmer peel weelie time time ‘time, time when’ (Haiman, forthcoming: 11)

196 Tania Kuteva

Another example is illustrated in (5) below, where two pairs of Pali-Sanskrit doublets occur.

(5) Khmer mien (teewawdaa teep) (rak reaksaa) wawt. have (angel angel (guard guard temple ‘There is an angel guarding the temple.’ (Haiman, forthcoming: 11–12)

In what follows I will show that it is not only in the domain of the lexicon that morphemic accretion happens in language-contact situations. In the domain of grammar, language contact may also result in the chunking of material. Here I will be concerned only with the increase in the linguistic material encoding an already existing grammatical category in a given language (the replica) due to the influence of a contact language (the model). At least two major groups of cases involving accretion due to contact-induced grammaticalization can be distinguished, and I will describe each of these in the remainder of this section.

4.1 A more elaborate expression from the model language replaces a less elaborate one in the replica language A straightforward case of increase in linguistic material as a result of contact-induced grammaticalization is when the expression of a grammatical category in the replica language is replaced by its counterpart from the model language, whereby the model expression happens to be more elaborate. Let us illustrate this case by means of the Saraswat Brahmins Konkani language variety spoken in the Indian State of Karnataka (Nadkarni 1975: 680). Konkani is an Indo-Aryan language. The Karnataka Saraswat Konkani variety is in close contact with Kannada, a Dravidian language. In this contact situation, the model language Kannada has two relative clause markers, embedded in the relative clause of a correlative construction. One of these markers is the question word yāva ‘which?,’ as in (6); the other one is the polar question marker ō, as in (7); both examples are from Nadkarni (1975: 674, 676).

(6) Kannada yāva mudukanu pēpar ōdutta iddāne. which old.man paper reading be.3sg.prs ‘Which old man is reading a newspaper?’

(7) Kannada mudukanu pēpar ōdutta iddān-ō old.man paper reading be.3s.prs-q ‘Is the old man reading a newspaper?’

On the frills of grammaticalization 197

Example (8), from Nadkarni (1975: 674, 676), illustrates how these two markers are employed to mark the relative clause construction in Kannada.

(8) Kannada [yāva mudukanu pēpar ōdutta iddān-ō] avanu dāktaranu [which old.man paper reading be.3sg.prs-q that doctor iddāne. be.3sg.prs ‘The old man who is reading a newspaper is a doctor.’

Konkani, on the other hand, has an already existing relative construction which is only marked by one relativization marker, namely the relative pronoun jo (Nadkarni 1975: 678). In the high-contact variety, Karnataka Saraswat Konkani, however, the newly grammaticalized relative construction involves more markers than the old one; in fact, the Karnataka Saraswat Konkani speakers use an isomorphic relative construction whereby, instead of the Kannada question word yāva, they use their own question word, khanco, and instead of the Kannada polar question marker ō, they use the Konkani polar question marker ki (see also Heine and Kuteva 2005: 128–129). Compare in this connection example (9) from Karnataka Saraswat Konkani with its Kannada counterpart in (8) above.

(9) Karnataka Saraswat Konkani [khanco mhāntāro pepar vāccat āssa-ki] to dāktaru āssa. [which old.man paper reading be.3sg.prs-q that doctor be.3sg.prs ‘The old man who is reading a newspaper is a doctor.’ (Nadkarni 1975: 674–675)

Thus, the grammaticalization from a combined interrogative structure to a relative construction was replicated in the Karnataka Saraswat dialect of the Indo-Aryan language Konkani in the course of a history of at least four centuries of contact, leading to total isomorphism in the syntax of this relative construction between these two genetically unrelated languages (see Nadkarni 1975). What is relevant to the present discussion is that this contact-induced grammaticalization process resulted in an increase in the marking of the relative clause construction in the replica language.

4.2 Coexistence of two expressions for the same function, one from the model language and one from the replica language Cases of this kind will be of special interest here since most of the documented examples of linguistic bulking as a result of contact-induced grammaticalization which I have found turn out to involve the coexistence of (parts of) marking from

198 Tania Kuteva

both the model and the replica languages within the new (that is, the more elaborate) structure in the replica language. Note that this need not be a mechanical combination of two form-meaning pairings (the marker from the replica language and the marker from the model language in their original meaning and form) coexisting within the same structure in the replica language. In other words, the marker from the model language need not be transferred as a form-meaning pairing directly or automatically into the replica language. Rather, what happens first is the establishing of a functional equivalence between the model language marker and a linguistic structure from the replica language, something like the translating of a model grammatical structure into the replica language. It is only then that the already existing replica language marker for function X is combined with the new, translated, model language marker for the same function X. The relative clause construction in the Colloquial Singapore variety of English, or Singlish,5 is a useful example here. Singlish results from a high-contact situation involving Mandarin Chinese, Cantonese Chinese, Hokkien Chinese and Malay as the main substratal languages and English as the superstrate language. Singlish is characterized by a variety of relativization strategies. One of them coincides with the major relativization strategy as it occurs in Standard English, using a wh-word as a relative pronoun, as shown in (10). (10) Singlish The man [who sell ice-kachang] gone home already. (Alsagoff and Lick 1998: 131) Another strategy demonstrating the influence which Chinese has on Singlish involves the use of the form one, which Alsagoff and Lick (1998) analyse as a relative clause marker (or relative pronoun in their terminology). An example is given in (11).6 (11) Singlish The man [sell ice-kachang one] gone home already. (Alsagoff and Lick 1998: 131) The function of one as a relative clause marker results from the influence of the Chinese linker particle, which has the form de in Mandarin, ge in Cantonese and e in Hokkien. Here I am following Alsagoff and Lick’s (1998) account, according to which Mandarin de and Cantonese ge can occur with adjectives (12), genitives 5. Alsagoff and Lick (1998: 137) define Colloquial Singapore English, or Singlish, as “the L[anguage] of the English in Singapore spoken by proficient speakers of English” and treat it as a creoloid. 6. Note that one is obligatory here; if it is left out, the sentence is ungrammatical.

On the frills of grammaticalization 199

(13) and relative clauses (both headed, as in (14), and non-headed, as in (15)), whereas Hokkien Chinese e can occur, in addition, with demonstratives, as in (16) below. Note that, in most of these contexts, the Chinese de linker particle can be rendered by means of one in Standard English. (12) Mandarin Chinese hong se de red color de ‘red ones’ (Alsagoff and Lick 1998: 135) (13) Mandarin Chinese Yue-han de John de ‘John’s’ (Alsagoff and Lick 1998: 135) (14) Mandarin Chinese Tamen zhong de shuiguo hen tian. They grow de fruit very sweet ‘The fruit that they grow is very sweet.’ (Alsagoff and Lick 1998: 134) (15) Mandarin Chinese Yue-han mai de John buy de ‘the one that John bought’ (Alsagoff and Lick 1998: 135) (16) Hokkien Chinese zhe e this e ‘this one’ (Alsagoff and Lick 1998: 135) As can be seen from the above examples, there is an overlap of the functions of the Chinese linker particle and the Standard English pronominal form one in contexts involving demonstratives, adjectives and non-headed relative clauses. Given this equivalence of function, speakers of Singlish seem to have extended the use of Standard English pronominal one to the genitive construction and to headed relative clauses on the model of the Chinese linker particle. The result of the latter extension are sentences like the one in (11) above. Other examples (all from Alsagoff and Lick 1998: 129–136) are given in (17) below. (17) Singlish a. Man don’t have car one, I don’t want. ‘I don’t want a man who doesn’t own a car.’

200 Tania Kuteva

b. The radio she buy one got more features than yours. ‘The radio that she bought has more features than the one you bought.’ c. That boy pinch my mother one very naughty. ‘The boy who pinched my mother is very naughty.’ Thus, the juxtaposition between Chinese, Singlish and Standard English below shows that Singlish patterns with Chinese in cases involving demonstratives (18), adjectives (19) and the genitive construction (20), as well as both headed (21) and non-headed relative clauses (22). (18) Hokkien Chinese (gloss): this e Singlish: this one Standard English: this one (19) Mandarin Chinese (gloss): red color de Singlish: red ones Standard English: red ones (20) Mandarin Chinese (gloss): John de Singlish: John’s one Standard English: John’s (21) Mandarin Chinese (gloss): [they grow] de fruit very sweet Singlish: The fruit [they grow] one very sweet Standard English: The fruit that [they grow] is very sweet (22) Mandarin Chinese (gloss): [John buy] de Singlish: [John buy] one Standard English: the one that [John bought] That the use of one in relative clauses in Singlish results from Chinese influence is also suggested by the similarity in syntactic positioning: both in Singlish and in Chinese, the relative construction marker (one in Singlish and the linker particle de/ge/e in Chinese) follows the relative clause, whereas in Standard English it is the other way round. Compare (23), (24) and (25) below. (23) Singlish The fruit [they grow] one very sweet. (Alsagoff and Lick 1998: 134) (24) = (14) Mandarin Chinese [Tamen zhong] de shuiguo hen tian. [they grow de fruit very sweet ‘The fruit that they grow is very sweet.’ (Alsagoff and Lick 1998: 134)

On the frills of grammaticalization 201

(25) Standard English The fruit that [they grow] is very sweet. So far we have considered two major relativization strategies in Singlish, each of which involves one marker of the relative clause construction. There is, however, another relativization strategy in Singlish, which testifies to the fact that language contact can lead to an increase in morphosyntactic complexity and hence to accretion. This third strategy, illustrated in (26) below, is the one most relevant to the present study since it involves the wh-relative pronoun, taken from Standard English, together with the relative pronoun one, which, as shown above, behaves like Chinese de/ge/e. (26) Singlish The man who [sell ice-kachang] one gone home already. (Alsagoff and Lick 1998: 131) The existence of such doubly marked sentences in Singlish shows that structures from the two contact languages can be combined and that there is amalgamation of English and Chinese constructions.7 This results in a highly marked and complex morphemically bulky structure.8 In other words, contrary to the common assumption that the new, contact varieties of English are simplified versions of their superstrate language, Singlish relative clauses exhibit extremely elaborate grammatical structure and linguistic accretion. A number of cases of accretion resulting from contact-induced grammaticalization involve the change from synthetic to analytic expression of a particular category, whereby an existing synthetic expression Y for the category X is replaced by a model analytic expression Z. The first step of this change, that is, the period before the new analytic expression Z becomes established, very often involves the coexistence of the old synthetic marker Y with the new analytic marking Z in a new, doubly marked structure in the replica language. Let us exemplify this by means of the Slavic Croatian minority living in Molise, southern Italy, which has a 500-year history of intense contact with the host language Italian, resulting in massive transfers from Italian to Molisean Croatian, the variety spoken by this 7. Alsagoff and Lick (1998) do not make any statistical ranking of the double-marked relative clause construction with respect to the two other structures described above. The authors do make it clear, however, that these three relative clause constructions are reported by their informants as being the most commonly used in Singlish, out of a variety of relative clause structures. 8. An anonymous reviewer suggests that parsing may be an additional reason why Singlish one is often combined with the relative pronoun who, since, in postnominal relative clauses, clauseinitial markers are usually preferred because otherwise the parser has to wait until the end of the construction to clearly recognize relative clause constructions.

202 Tania Kuteva

minority. One of these transfers involves the marking of the comparative of inequality (cf. Breu 1996: 26). The conventional Slavic construction is synthetic. Thus, in Standard Croatian, the comparative suffix -ši attached to the positive stem of the adjective would be the norm to mark the comparative of inequality, as shown in example (27). (27) Standard Croatian lijep-ši beautiful-cmpr ‘more beautiful’ In Molisean Croatian, however, the conventional synthetic construction has been given up and speakers of this high-contact variety have instead replicated the analytic construction of Italian. In Italian, the comparative of inequality is marked analytically, by means of the degree marker più ‘more,’ placed before the positive form of the adjective. Using the Italian expression as a model, Molisean speakers have grammaticalized their own word for ‘more,’ namely veče, the result of this contactinduced grammaticalization process being a new, analytic expression with the structure [degree marker veče + positive form of the adjective], as shown in (28). (28) Molisean Croatian veče lip more beautiful ‘more beautiful’ The claim that this change in the marking of the comparative of inequality in Molisean can best be accounted for in terms of a contact-induced grammaticalization — and not in terms of, for instance, regular language-internal change which would have happened in the language no matter what — can be strengthened by the following argument. While Italian, like all other Romance languages, has given up the use of synthetic degree markers, there are a few synthetic forms left in it, and it is precisely in these cases that the synthetic forms have survived in Molisean, at least as optional variants. Thus, the way to render the English comparative of inequality better involves either a synthetic or an analytic form in both Italian (29a–b) and Molisean (30a–b). (29) Italian a. migliore good.cmpr ‘better’ b. più buono more good ‘better’

On the frills of grammaticalization 203

(30) Molisean Croatian a. bolij good.cmpr ‘better’ b. veče dobar more good ‘better’ Much more relevant to the present discussion, however, is the existence in Molisean of a third form which represents a case of double marking and, accordingly, linguistic bulking. This third variant consists of the old, synthetic comparative form together with the new degree marker, veče, as shown in (31). (31) Molisean Croatian veče bolij more good.cmpr ‘better’ (lit. ‘more better’) Heine and Kuteva (2006: 76) propose the following four-stage scenario of evolution for this particular contact-induced grammaticalization process from the synthetic to the analytic comparative construction: – Stage I: Synthetic form (e.g. bolij ‘better’). – Stage II: An analytic degree marker is added (e.g. veče bolij). – Stage III: Redundancy is eliminated in that the positive (basic) form of the adjective is used after the degree marker (e.g. veče dobar). – Stage IV: The old synthetic form disappears (this stage has not been reached in our example). Similarly, Standard Croatian displays the Slavic distinction between the comitative and the instrumental: the comitative is marked by the preposition s ‘with,’ whereas the instrumental is encoded by means of a special instrumental case suffix, -em, going back to Old Church Slavic, as in (32). (32) Standard Croatian nož-em knife-ins ‘with a knife’ (Breu 1996: 26) But in the high-contact variety of Molisean Croatian, the use of the comitative preposition has been extended to mark also instrumental participants, on the model of the contact-language Italian (cf. (33) below).

204 Tania Kuteva

(33) Italian con un coltello with a knife ‘with a knife’ (Breu 1996: 26) The result, however, is not an immediate replacement of the old suffixal form by the new, analytic form with the preposition ‘with.’ Rather, there is a gradual transition towards the new analytic construction, and the initial step in this transition is morphemic bulking, that is, increase in the linguistic substance used for encoding the same function. What happens in this particular language-contact situation is that the existing instrumental case suffix -em is first strengthened by the comitative adposition s ‘with’ before it is lost. Consider (34) below. (34) Molisean Croatian s nož-em with knife-ins ‘with a knife’ (Breu 1996: 26) In other words, before the analytic expression becomes well-established in the replica language under the influence of the model language, the existing synthetic expression in the replica language is first strengthened by an analytical form and then it is lost. This particular period in the development of the new analytic structure is thus characterized by linguistic bulking. Yet another group of examples of accretion resulting from double marking in contact-induced grammaticalization represents a clear transitional pattern in both time and space. One of these examples involves a phenomenon investigated in depth and with convincing precision in Dahl (2004b), whose analysis I will follow in the discussion below. Dahl (2004b) offers a detailed study of the overdetermination or double determination in the area of definiteness marking in some varieties of Scandinavian languages (Swedish and most forms of Norwegian) as contrasted to other Scandinavian varieties (Danish and marginally Norwegian). Generally speaking, there are two definite articles in Continental Scandinavian languages (Danish, Norwegian and Swedish): (i) a preposed article, homophonous with the demonstrative den, as in (35), and (ii) an article which is suffixed to the head noun, as in (36), both from Dahl (2004b: 148). (35) Danish and marginally in Norwegian det stor-e hus dem big house ‘the big house’

On the frills of grammaticalization 205

(36) Danish hus-et house-def ‘the house’ The preposed article is only used when the noun is preceded by an attribute, as in (35) above. However, while in Danish and marginally in Norwegian the suffixed article is restrained whenever the prefixed article is used, in Swedish and most forms of Norwegian both articles are used in cases when the head noun is preceded by an attribute. Consider (37), also from Dahl (2004b: 148). (37) Swedish and most forms of Norwegian det stor-a hus-et dem big house-def ‘the big house’ Dahl (2004b: 147) views this double use of the definite article as the result of two different grammaticalization processes and the “competition between them about the same territory.” A closer look at the geographical linguistic situation shows that the double marking of definiteness arises precisely in a transitional zone, that is, in the area where language contact is most intense between distinct Scandinavian varieties, in which the different definite article types represent two separate grammaticalization processes. A brief diachronic survey reveals that the suffixed article is the historically older type, going back to Old Scandinavian. With the agricultural expansion which occurred from 1050 to 1350, Old Scandinavian spread northwards. As there is not so much fluctuation of population in the north, traits of the old forms remained preserved and are still to be found as conservative patterns in the dialects of Northern Sweden and Swedish-speaking Finland and Estonia. As a matter of fact, suffixed-article use in northern Sweden is much higher than in Standard Swedish, while the southern and western dialects tend to use less suffixed articles than Standard Swedish. Regarding the total distribution of the suffixed article in the three Continental Scandinavian languages, the most extensive use is found in the north-east; the rest of Sweden and Norway comprise an intermediate zone, whereas Denmark restricts the suffixed article most strongly. On the other hand, the preposed article is the innovation in the Scandinavian area and exhibits the Western European pattern of preposed article use. Its occurrence is strongest in Danish, which is also geographically closest to the western preposed article languages. With the Danish colonization of South Sweden, the Danish preposed article pattern came into use there. Thus, the use of this pattern is much higher in South Sweden, higher than in Standard Swedish, whereas its use is

206 Tania Kuteva

below average in North Sweden. Dahl (2004b: 176) divides the Continental Scandinavian dialect continuum into five different sub-areas, with respect to the strength of the P[refixed]-article, ordered in decreasing strength from the southwest towards the northeast as follows: (i) S[outh] W[est] Jütland (where the P[refixed]-article is in general used as a definite article); (ii) the rest of Denmark; (iii) Norway and southern and western Sweden; (iv) central Sweden; (v) northern Sweden.

In sum, the two distinct grammaticalization processes described above have different centres of gravity: the preposed article in the south and the suffixed article in the north. It is precisely in the area of high-intensity contact, namely in the geographically transitional zone, that these two grammaticalization patterns overlap, giving rise to double marking and linguistic material accretion.

4.3 A caveat The cases discussed above do not allow us, however, to make sweeping generalizations like ‘every example of material bulking in the expression of a structure in a high-contact language variety is the result of contact-induced grammaticalization.’ For example, on the basis of the diachronic development of the andrelative clause construction in Irish English, Pfaff (2004) shows that a high-contact variety such as Irish English may exhibit a case of accretion in the encoding of its relative clause construction, yet this accretion is to be traced back to an earlier structure in one of the contact varieties (in this particular case, English) rather than to a contact-induced grammaticalization process.

5. Accretion in contact-induced grammaticalization: An explanation One of the basic assumptions in grammaticalization, formulated on the basis of crosslinguistic findings on language-internal evolution, is that this is a process typically accompanied by linguistic attrition. This means that accretion in contactinduced grammaticalization can be seen as a highly unwelcome phenomenon which disturbs our expectations about how a grammaticalization process should take place in time and space. In what follows, however, I will argue that there is nothing puzzling about accretion in contact-induced grammaticalization. On the contrary, it turns out to be a fulfillment of an expectation relevant to a particular aspect of grammatical language change. I will also show that material accretion in contact-induced grammaticalization is nothing else but a specific sub-case of the residual quirks phenomenon mentioned in Section 1.

On the frills of grammaticalization 207

The first group of cases of linguistic accretion due to contact-induced grammaticalization discussed in Section 4.1 above, namely those in which a more elaborate expression from the model language replaces a less elaborate one in the replica language, can be explained away easily: once the new, more elaborate expression becomes attracted into the grammaticalization development in the replica language, it will run the same, ordinary, crosslinguistically recurrent path with the same stages and the same characteristics, one of them being erosion. The more challenging group of cases is the second one, involving coexistence of two expressions for the same function, one from the model language and one from the replica language (cf. Section 4.2 above), and I will discuss this in greater detail. As mentioned above, contact-induced grammaticalization always involves language-internal grammaticalization (but not vice versa). Accordingly, the phenomenon under discussion here can be explained in terms of a particular characteristic of both language-internal and contact-induced grammaticalization. This characteristic involves the notion of overlap. In what follows I will offer two explanations of accretion in contact-induced grammaticalization, both of which are centred on this notion. The first explanation has at its core the notion of a diachronic overlap stage of the grammaticalization development, the second that of a synchronic buffer zone in geographical space.

5.1 Diachronic overlap stage Accretion in contact-induced change has already been documented in cases where the transfer of linguistic material from the donor into the receiving language concerns a borrowing, that is, a meaning-form pairing. Thomason (2001: 152) describes the following case of borrowing of the coordinate conjunction ‘and’ in Siberian Yupic Eskimo. The conventional way of marking phrase coordination in this language is by means of a native comitative suffix -lju ‘with,’ as illustrated in example (38) below. (38) Siberian Yupic Eskimo Nunivaymi kijaxtaqut tiyiyat-lju qawayit-lju. on.the.tundra live beasts-with birds-with ‘Beasts and birds live on the tundra.’ (Thomason 2001: 152–153) Once the conjunction inkam ‘and’ was borrowed from Chukchi, Siberian Yupic Eskimo came to use sentences with both kinds of coordination marker, the native comitative suffix -lju ‘with’ and the new conjunction inkam, as in (39), in variation with sentences where only the new conjunction inkam was employed.

208 Tania Kuteva

(39) Siberian Yupic Eskimo Nunivaymi kijaxtaqut tiyiyat-lju inkam qawayit-lju. on.the.tundra live beasts-with and birds-with ‘Beasts and birds live on the tundra.’ (Thomason 2001: 152–153) This redundant marking turns out to be only an intermediate stage in the development of the language before the native marking of coordination was completely replaced by the borrowing from Chukchi. This paper, however, is concerned not with such cases of borrowing of a meaning-form pairing but with cases where a meaning or concept is borrowed from a contact language and (part of) a grammaticalization process involving this meaning or concept is replicated on the model of the contact language. As I have shown already, there exist a number of cases where such a contact-induced grammaticalization is characterized by linguistic accretion. In what follows, I will argue that this peculiarity, that is, the accretion resulting from contact-induced grammaticalization, is a characteristic of a transitional period, which occurs when one structure (the already existing structure of the replica language) is replaced by another (the new, replicated structure from the model language). After this transitional period, the process of erosion or reduction sets in along with the three other processes mentioned above as mechanisms or parameters of grammaticalization, namely extension, desemanticization and decategorialization. Disturbing as it may seem at first sight, this peculiarity is, in fact, not a problem for an explanation in grammaticalization theory. On the contrary, it may well be identified as one more parallel between language-internal and contact-induced grammaticalization, for the following reason. The transitional, overlapping stage of the coexistence of marking from the model and the replica language within the replica language structure observed in contact-induced grammaticalization is strongly reminiscent of a process also characteristic of language-internal grammaticalization, whereby a historically earlier structure occurs in combination with the newly arisen, historically later structure. This is where the notion of residual quirks springs back to life: the historically earlier structure is our residual quirk. As for the historically later structure, language does not care where it comes from. If it comes from the same language, then we speak of an overlap stage in languageinternal grammaticalization. If it comes, however, from a contact language, then we have a case of an overlap stage in contact-induced grammaticalization. Notice that the expectation to find a diachronic overlap stage in contactinduced grammaticalization falls out naturally from the insight that contactinduced grammaticalization entails language-internal grammaticalization. Thus, it is a standard observation in grammaticalization that the existing expression of a grammatical category may start fading out and that it may eventually be replaced

On the frills of grammaticalization 209

by a new form encoding the same category.9 As is well-known, a major characteristic of this renewal grammaticalization process is an overlap stage. The kind of overlap which has most often been described by students of grammaticalization is what can be termed overlap of function: at a particular point in time in the development of a particular language, two (or more) expressions may compete for the marking of the same grammatical function, that is, they overlap in function.10 The competition usually leads to recession or loss of the older expression. An illustration of this kind of overlap is the competition between the new, periphrastic future and the old synthetic form in Latin: the periphrastic construction existed in Late Latin long before the eventual loss of the future -b- form (Hopper and Traugott 2003: 124). Another kind of overlap frequently discussed in the grammaticalization literature involves overlap of meaning: at the diachronic overlap stage, two meanings, the historically earlier one A and the historically later one B, coexist, the result being ambiguity: A > AB > B. Let us illustrate this by means of the development of the body part term megbé ‘back’ (as in My back is cold) into the adverb megbé ‘back’ (as in He stays back) in Ewe. Heine et al. (1991: 65–68) show that, somewhere along this grammaticalization path, there was an intermediate overlap stage where the form megbé was ambiguous between an earlier meaning (‘back part of an object’) and a later meaning (‘space behind that object’). Consider example (40) from Heine et al. (1991: 66). (40) Ewe dzra xɔ-á pé megbé ɖó. prepare house-def of back ready ‘Prepare the back wall of the house!’/‘Prepare the place behind the house!’ There is, however, another kind of overlap, overlap of form, which is the one relevant to the present discussion (cf. also Kuteva and Heine 2005). It involves the coexistence of the older form a with the newly emerging form b within an existing grammatical structure which remains essentially the same from the point of view of function: Earlier stage a

Overlap stage a+b

Later stage

b

9. Such a renewal of grammatical structures usually involves a free (non-bound) form, which over time may or may not acquire an affixal status again, and the process may repeat itself more than once in the lifetime of a language (Heine et al. 1991). 10. The way in which this is usually described in the literature is that the two (or more) competing expressions are in a paradigmatic relationship (cf. also the notion of layering in Hopper 1991).

210 Tania Kuteva

Another way of articulating this is to state that, from the perspective of a particular point in time in the development of a given structure, accretion is what we observe every time a grammaticalization renewal process allows for a marker to be redundantly buttressed by another marker which is functionally equivalent. This other marker may come from the same language (language-internal grammaticalization) or it may be donated by a contact language (contact-induced grammaticalization). In the latter case, the new marker will be transferred from the model into the replica language as either (i) a (form-plus-meaning) borrowing11 or (ii) the concept or idea underlying a formal expression built out of the material of the replica language. After that point in time, that is, the transitional, overlap stage, the process of erosion will set in just as it typically does in cases of grammaticalization. Note that there also exist cases of accretion in contact-induced grammaticalization for which we have no conclusive evidence indicating a transitional period; instead, what seems to be involved is a permanent structure in the replica language. This, however, does not rule out the possibility — and, in fact, the likelihood — for erosion to take place in the future development of the language. Yet, there is at least one more explanation from a grammaticalization perspective which takes care of such cases and possibly also of others. This involves the notion of a buffer zone, which I discuss below.

5.2 Synchronic buffer zone The notion of a buffer zone in contact-induced grammaticalization can best be described as the geographical area where two distinct grammaticalizing or grammaticalized structures or, rather, two grammaticalizing processes, meet (cf. also Kuteva and Heine 2005). This understanding of the notion of a buffer zone is based on Stilo (1987, 2005, forthcoming). According to him, a buffer zone is an area where two competing patterns converge, whereby the areal overlap leads to a merger of the two patterns. It has already been shown that what typically happens in a buffer zone is alternation between two existing patterns and/or chunking of linguistic material together, that is, accretion. Let us take as example the Araxes Sprachbund (Stilo, forthcoming), encompassing a number of Iranian, Armenian, Turkic, Semitic, Kartvelian and Lezgic language varieties, where two of a number of isoglosses meet. One of these isoglosses involves oblique pronominal enclitics functioning as possessives with nouns, that is, a postposed possessive; the other isogloss involves independent pronominal forms functioning as possessives with 11. This will then be a case of borrowing and not of what has been referred to elsewhere as grammatical replication (cf. Heine and Kuteva 2005).

On the frills of grammaticalization

Table 1. Buffer zone possessives in Armenian and Caucasian Tati (Stilo, forthcoming) Preposed

Postposed

Circumposed

Armenian

im

tun=əs

im

Caucasian Tati

mænæ xunæ my house ‘my house’

xunæ=mæn house=my

mænæ xunæ=mæn my house=my

tun-ə

tun=əs

nouns, that is, a preposed possessive. What is readily observed in the area where these two isoglosses meet, in the buffer zone, is simultaneously (i) alternation between the preposed and the postposed (enclitic) forms and (ii) circumposed, that is, doubly marked and, hence, accrued, possessive pronominal expression. The usage patterns set out in Table 1 come from two of the language varieties situated within the buffer zone, namely Armenian (Armenian) and Caucasian Tati (Southwest Iranian) (Stilo, forthcoming). Another example of a buffer zone phenomenon involves directional adverbs in southern Germany, more precisely in the transitional dialect zone between northern and southern Alemannic dialects (cf. Stevens 1992). Stevens (1992: 90–93, 210ff.) offers a convincing interpretation of the forms to be found in the area between northern Alemannic dialects, which have the prefixed type (e.g. hinab ‘down’) and southern Alemannic dialects, which are of the suffixed type (e.g. abhin ‘down’). The forms which Stevens identifies as geographically transitional ones involve double marking: one and the same element is used as a prefix and as a suffix, e.g. hinabhin ‘down,’ without any change in meaning (cf. also Hinderling 1980: 269). The phenomenon of double determination (or overdetermination) in Scandinavian languages discussed in Section 4.2 above is also a clear manifestation of a buffer zone where two opposing grammaticalization patterns meet. Recall that the areal distribution of the preposed and the postposed definite articles in Continental Scandinavian varieties reveals the existence of two distinct grammaticalization clines which have two opposing geographical orientations: a. Northeast-to-southwest cline for the postposed definite article with a centre of gravity in the north. b. Southwest-to-northeast cline for the preposed definite article with a centre of gravity in the south. A closer look at this geographical linguistic situation shows that the double marking of definiteness arises precisely in the transitional zone, that is, in the buffer zone, where language contact is most intense between the Scandinavian varieties with opposing grammaticalization clines representing two separate grammaticalization processes.

211

212 Tania Kuteva

In cases like the above, accretion in contact-induced grammaticalization turns out to be a typical manifestation of a particular configuration in the geographical distribution of grammaticalizing structures in a language-contact situation.

6. Conclusions The examples discussed in this paper raise a number of important questions. Firstly, what causes accretion in contact-induced grammaticalization? Is it the need to restore language structure? It seems that the answer to this question should be in the negative, simply because nobody is thinking about ‘saving the language from erosion’ and nobody realizes that ‘language is in distress.’ Another possible answer to this question is that accretion in contact-induced grammaticalization is motivated by the existence of a strong mutual co-presence awareness among speakers of two distinct language varieties, the mutual awareness that your interlocutor has other linguistic means for expressing the same thing and, therefore, you make a symbolic bow to him/her by using also the means of the other. While plausible for some cases of contact-induced grammaticalization, such an explanation contradicts a number of other well-documented sociolinguistic situations involving intense language contact. For instance, the Saraswat Brahmins, the speakers of Saraswat Brahmins Konkani who increased the marking of their relative construction due to the contact with Kannada, have always tended to regard themselves in many ways superior to the Kannada-speaking communities among whom they live (Nadkarni 1975: 680). Perhaps the key to the answer lies in the mutual awareness that your interlocutor has other linguistic means for expressing the same thing and in the incapability of keeping the means of the two languages apart. Such an account is not implausible, especially in the face of what has been referred to as the lack of total deactivation hypothesis in bilingualism (cf. Grosjean and Soares 1986). According to this hypothesis, when bilinguals speak one of their languages, the other language cannot become totally deactivated. This is valid even for completely monolingual situations, that is, even when two interlocutors communicate in only one language throughout the whole communicative event. Since bilinguals are not capable of keeping the two systems apart, one can then assume that, due to the cognitive processing load (cf. Silva-Corvalán 1994), bilingual speakers are prone to using two linguistic expressions, one from the language which they are employing in a particular discourse-pragmatic situation and another from the other language, within the same syntagm or construction, either by borrowing or by translating a concept from the other language.

On the frills of grammaticalization 213

Another possible motivation for accretion is that language users, in the situation of language contact, may encode a particular grammatical structure redundantly on purpose, in order to make sure that, when replicating a novel construction on the model of a language of admired neighbours, they will be understood. Could this be the reason why they pair the innovated linguistic expression with the already existing one in their own language? Such an explanation, if at all correct, would then be compatible with a popular account already proposed for redundant marking in the area of contact-induced lexical change. Thus, in the domain of the lexicon, Crystal (1995: 60) identifies at least one possible motivation for bilingual language users to increase the material for expressing a particular notion. It is a cognitive strategy to which he refers as “explain yourself,” that is, to pair a new word with a familiar equivalent. It appears to be the reasoning applied in cases where bilingual speakers introduce or borrow a particular word from one of their languages into the other language and, for fear of not being understood, encode the same notion also by means of the already existing expression in the borrowing language. The result of this attempt to explain one’s new coinage then leads to synonym compounds, such as English animate or gyue courage to (from the Renaissance period) or persist and continue,12 with Latin being the donor language. All these are questions about the cognitive grammars of speakers, and all of them require psychologically realistic answers. My concern here, however, is not with these cognitively and psychologically realistic aspects of language, but rather with what Haspelmath (2004: 555) has termed the phenomenological dimension, which involves language as it functions in communicative use, as contrasted to language as it is mentally represented in individual brains or minds. What we can conclude about the phenomenology of contact-induced grammaticalization and accretion can be summarized as follows. Firstly, it seems to be beyond reasonable doubt that the examples discussed here, numerous as they may be, do not justify a sweeping generalization of the kind ‘language contact always brings about accretion of linguistic material’ (cf. Section 4.3 above). Such a conclusion would be premature, since there are clear cases where exactly the opposite (i.e. loss of material) is true. When two languages come into contact, there are at least two possibilities for the material of the morphosyntactic paradigms which interact. One of these possibilities is decrease (and eventual loss). Speakers become confused and reluctant to learn the differences between the contact languages; due to the difficulties in learning different morphosyntactic paradigms, speakers may give up these para12. This may well be an additional explanation for the phenomenon of synonym compounding in the Pali-Khmer pairs mentioned in Section 4 and discussed by Haiman (forthcoming).

214 Tania Kuteva

digms. Thus, for example, one of the reasons commonly adduced for the loss of grammatical gender in English is contact with French: when French entered the English language in medieval times, the distinction of gender became problematic, because speakers were confronted with two quite different gender systems. Since it is always difficult to learn genders in a second language, the consequence of this conflict was that grammatical gender was given up. It is cases like this that are usually discussed in the specialized literature. There is, however, a second possibility, as I have tried to show in this paper, namely the addition of linguistic material. Speakers may replicate the more elaborate structural pattern of a given grammatical category from the model language into their own language and either (i) stop using the already existing, native expression for that category altogether or (ii) pair the two patterns (from the model language and from their own language) together within the same syntagm. I have shown in this study that contact-induced grammaticalization is frequently accompanied by linguistic accretion instead of the expected attrition. This is nothing spectacular or puzzling about contact-induced grammaticalization, I contend. On the contrary, it only enhances an important insight gained in recent grammaticalization studies: contact-induced grammaticalization always involves language-internal grammaticalization; hence, the factors (criteria) at work in language-internal grammaticalization can also be seen operating in contactinduced grammaticalization. The explanation which I have proposed for linguistic accretion in contact-induced grammaticalization capitalizes on the notion of overlap stage clearly recognized in language-internal grammaticalization processes. I have shown that, in contact-induced grammaticalization, overlap is manifested both as a diachronic stage of the grammaticalization development and as a buffer zone in the synchronic geographical distribution of competing grammaticalization patterns.

Abbreviations 3 cmpr def dem det ins prs

third person comparative definite demonstrative determiner instrumental present

q rel relpron sbjv sg tns

question marker relative relative pronoun subjunctive singular tense

On the frills of grammaticalization 215

References Alsagoff, Lubna & Ho Chee Lick. 1998. The relative clause in colloquial Singapore English. World Englishes 17(2): 127–138. Bisang, Walter. 1996. Areal typology and grammaticalization: Processes of grammaticalization based on nouns and verbs in East and mainland South East Asian languages. Studies in Language 20: 519–597. Breu, Walter. 1996. Überlegungen zu einer Klassifizierung des grammatischen Wandels im Sprachkontakt (am Beispiel slawischer Kontaktfälle). Sprachtypologie und Universalienforschung 49(1): 21–38. Chumbow, Beban. 1977. Relatives as determiners: A case from Ngemba. In Paul F. A. Kotey & Haig Der-Houssikian (eds.), Language and linguistic problems in Africa. Proceedings of the 7th Conference on African Linguistics, 283–302. Columbia SC: Hornbeam Press. Comrie, Bernard & Tania Kuteva. 2005. The evolution of grammatical structures and ‘functional need’ explanations. In Maggie Tallerman (ed.), Language origins: Perspectives on evolution, 185–206. Oxford: OUP. Croft, William. 2000. Explaining language change: An evolutionary approach. Essex: Pearson Education. Crystal, David. 1995. The Cambridge encyclopedia of the English language. Cambridge: CUP. Dahl, Östen. 2000. The grammar of future time reference in European languages. In Östen Dahl (ed.), Tense and aspect in the languages of Europe. [Empirical Approaches to Language Typology, Eurotyp 20–26], 309–328. Berlin: Mouton de Gruyter. Dahl, Östen. 2004a. The growth and maintenance of linguistic complexity. Amsterdam: John Benjamins. Dahl, Östen. 2004b. Definite articles in Scandinavian: Competing grammaticalization processes in standard and non-standard varieties. In Bernd Kortmann (ed.), Dialectology meets typology, 147–180. Berlin: Mouton de Gruyter. Grosjean, François & Carlos Soares. 1986. Processing mixed language: Some preliminary findings. In Jyotsna Vaid (ed.), Language processing in bilinguals: Psycholinguistic and neuropsychological perspectives, 145–171. Hillsdale NJ: Lawrence Erlbaum. Haiman, John. Forthcoming. The esthetic impulse of grammar. Oxford: OUP. Haspelmath, Martin. 2004. Does linguistic explanation presuppose linguistic description? Studies in Language 28(3): 554–579. Heath, Jeffrey. 1998. Hermit crabs: Formal renewal of morphology by phonologically mediated affix substitution. Language 74(4): 728–759. Heine, Bernd. 1994. Areal influence on grammaticalization. In Martin Pütz (ed.), Language contact and language conflict, 56–68. Amsterdam: John Benjamins. Heine, Bernd, Ulrike Claudi & Friederike Hünnemeyer. 1991. Grammaticalization: A conceptual framework. Chicago IL: University of Chicago Press. Heine, Bernd & Tania Kuteva. 2003. On contact-induced grammaticalization. Studies in Language 27(3): 529–572. Heine, Bernd & Tania Kuteva. 2005. Language contact and grammatical change. Cambridge: CUP. Heine, Bernd & Tania Kuteva. 2006. The changing languages of Europe. Oxford: OUP.

216 Tania Kuteva Hinderling, Robert. 1980. Die Richtungsadverbien im Bairischen und im Alemannischen. Versuch einer Synopse. Mit vier Karten. In Anthony Rowley (ed.), Sprachliche Orientierung I. Untersuchungen zur Morphologie und Semantik der Richtungsadverbien in oberdeutschen Mundarten [Bayreuther Beiträge zur Sprachwissenschaft und Dialektologie I], 249–296. Bayreuth: Sprach- und Literaturwissenschaftliche Fakultät. Hopper, Paul J. 1991. On some principles of grammaticization. In Elizabeth Closs Traugott & Bernd Heine (eds.), Approaches to grammaticalization. Vol. I, 17–35. Amsterdam: John Benjamins. Hopper, Paul J. & Elizabeth Closs Traugott. 2003. Grammaticalization. 2nd edn. Cambridge: CUP. Joseph, Brian D. 2005. Greek verb endings: ‘Lateral shifts’ and grammaticalization. Paper presented at New Reflections on Grammaticalization 3. Santiago de Compostela, Spain, July 2005. Kuteva, Tania. 1998. Large linguistic areas in grammaticalization: Auxiliation in Europe. Language Sciences 20(3): 289–311. Kuteva, Tania & Bernard Comrie. 2001. The typology of relative clause formation in African languages. Paper presented at the International Conference in Honour of G. Greenberg. St. Augustin, Germany, May 2001. Kuteva, Tania & Bernd Heine. 2005. The ‘exception’ on one analysis, the ‘rule’ on another. Paper presented at the 27th Annual Conference of the Deutsche Gesellschaft für Sprachwissenschaft (DGfS). Cologne, February 2005. Liner, E. 1977. Restrictive and non-restrictive relative clauses in Swahili. In Paul F. A. Kotey & Haig Der-Houssikian (eds.), Language and linguistic problems in Africa. Proceedings of the 7th Conference on African Linguistics, 269–282. Columbia SC: Hornbeam Press. Lühr, Rosemarie. 1984. Reste der athematischen Konjugation in den Germanischen Sprachen. Das Germanische und die Rekonstruktion der Indogermanischen Grundsprache. In Jürgen Untermann & Bela Brogyanyi (eds.), Das Germanische und die Rekonstruktion der Indogermanischen Grundsprache. [Current Issues in Linguistic Theory 22], 25–90. Amsterdam: John Benjamins. Matisoff, James A. 1978. Variational semantics in Tibeto-Burman: The ‘organic’ approach to linguistic comparison [Occasional Papers of the Wolfenden Society on Tibeto-Burman Linguistics 6]. Philadelphia PA: Institute for the Study of Human Issues. Matras, Yaron. 1998. Convergent development, grammaticalization, and the problem of ‘mutual isomorphism’. In Winfried Boeder, Christoph Schroeder, Karl Heinz Wagner & Wolfgang Wildgen (eds.), Sprache in Raum und Zeit: In memoriam Johannes Bechert. Vol. II: Beiträge zur empirischen Sprachwissenschaft, 89–103. Tübingen: Gunter Narr. Nadkarni, Mangesh V. 1975. Bilingualism and syntactic change in Konkani. Language 51(3): 672–683. Nau, Nicole. 1995. Möglichkeiten und Mechanismen kontaktbewegten Sprachwandels unter besonderer Berücksichtigung des Finnischen [Edition Linguistik 08]. Munich: Lincom. Pfaff, Meike. 2004. A comparative case study in contact-induced change: Irish English relativization vs. Scottish English relativization. MA thesis, University of Dusseldorf. Silva-Corvalán, Carmen. 1994. Language contact and change. The life of Spanish in Los Angeles. Oxford: OUP. Stevens, Christopher M. 1992. Directional adverbs in 14th century southwestern German: A study in historical dialectology [Göppinger Arbeiten zur Germanistik 570]. Göppingen: Kümmerle Verlag.

On the frills of grammaticalization 217

Stilo, Donald. 1987. Ambipositions as an areal response: The case study of the Iranian zone. In Elena Bashir, Madhav P. Deshpande & Peter E. Hook (eds.), Selected papers from SALA-7, 308–335. Bloomington IN: Indiana University Linguistics Club. Stilo, Donald. 2005. Iranian as a buffer zone between the universal typologies of Turkic and Semitic. In Éva Ágnes Csató, Bo Isaksson & Carina Jahani (eds.), Linguistic convergence and areal diffusion: Case studies from Iranian, Semitic and Turkic, 35–63. London: RoutledgeCurzon. Stilo, Donald. Forthcoming. The Araxes Sprachbund. Stolz, Christel & Thomas Stolz. 2001. Mesoamerica as a linguistic area. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals: An international handbook. Vol. II. [Handbücher zur Sprach- und Kommunikationswissenschaft 20(2)], 1542–1553. Berlin: Walter de Gruyter. Thomason, Sarah Grey. 2001. Language contact. Edinburgh: Edinburgh University Press. Trudgill, Peter. 1983. On dialect: Social and geographical perspectives. Oxford: Basil Blackwell. Trudgill, Peter. 1996. Dual source pidgins and reverse creoles: Northern perspectives on language contact. In Ernst H. Jahr & Ingvild Broch (eds.), Language contact in the Arctic: Northern pidgins and contact languages, 5–14. Berlin: Mouton de Gruyter Trudgill, Peter. 2001. Contact and simplification: Historical baggage and directionality in linguistic change. Linguistic Typology 5(2–3): 371–374. Weinreich, Uriel. 1964. Languages in contact. The Hague: Mouton de Gruyter.

chapter 8

Nominalizations in Bodic languages* Michael Noonan University of Wisconsin-Milwaukee Tibeto-Burman languages, and in particular those in the Bodic subgroup, make extensive use of nominalizations, which serve a wide variety of functions (noun and verb complementation, purpose clause, agent and patient nominals, etc.) far beyond their expected use of naming activities and events. My aim in this paper is to address a number of key issues relating to the history of these extended functions of nominalizations in the Bodic languages. Special attention will be paid to the use of nominalizations in the modification of nouns, that is, as relative clauses. Among other topics, I will discuss the relevance of areality to the phenomenon of nominalization–relativization syncretism, the various sources of nominalizing suffixes, the relativization with the genitive, and the innovations in the system of nominalizations.

1. Introduction Central mountain and highlands Asia has been the scene of extensive linguistic contact over a considerable period. Languages of a number of different genetic phyla have been involved, but so have languages within the Sino-Tibetan phylum representing different stocks with differing typological characteristics. Indeed, the long period of contact between speakers of Sino-Tibetan languages of different stocks has resulted in considerable lexical and grammatical borrowing, which has tended to obscure genetic relationships. As a result, there is still a good deal of uncertainty as to how even major groupings of languages should be positioned within the Sino-Tibetan family tree. One postulated genetic grouping within the Tibeto-Burman branch is an agglomeration of languages spoken in the western reaches of the Sino-Tibetan speaking area, a grouping referred to as Bodic.1 While a number of interesting isoglosses * The work reported on in this paper has been supported by the following grants from the National Science Foundation: DBC-9121114, SBR-9600717 and SBR-9728369. 1. The languages considered Bodic for the purposes of this paper are arranged by subfamily in the Appendix. It is far from clear, however, that the three subdivisions of Bodic as given in the Appendix — Central Himalayish, Bodish and rGyalrong — should be grouped exclusively under a single genetic node. Further, it is not clear that Central Himalayish represents a genetic grouping at all as opposed to a geographic assemblage of Tibeto-Burman languages which have been in contact in the sub-Himalayan region of Nepal for a long period. rGyalrong was tradi-

220 Michael Noonan

cut through this group (see Noonan 2003a for some examples), on the whole they exhibit a rather impressive typological consistency. In particular, languages in this group make extensive use of nominalizations, as documented in Noonan (1997). In these languages, constructions headed by nominalizations — or forms morphologically identical to them — are used for more than the expected functions of nominalizations, such as the reification of events and processes and the expression of clauses as arguments of predicates. For simplicity of exposition, I will refer to a morphological form and the construction that it heads as a nominalization if it includes within its uses the naming of activities or states; by contrast, I will use nominalization in the strict sense to refer specifically to the function of naming activities and states. Given this, one of the fullest ranges of uses of nominalizations among the Bodic languages is to be found in Chantyal, a Tamangic language spoken in Nepal (Noonan 2003b): a. Nominalization in the strict sense (i.e. naming activities and states). b. Verb complementation. c. Noun complementation. d. Purpose clause. e. Relative clause. f. Non-relative attributive. g. Agent and patient nominals. h. Attributive nominal. i. Expression of the semantic predicate in verbal periphrasis. j. Main clause. The use of nominalizations for all these functions in Chantyal is discussed in some detail in Noonan (1997); a number of examples illustrating many of these functions can be found in various places in this paper. My aim here is to discuss a number of issues relating to the history of the grammar of nominalizations in the Bodic languages, in particular issues relating to the use of nominalizations in the modification of nouns, that is, as relative clauses. In all, I will discuss briefly the following five points: (i) the areal context (Section 2); (ii) additional features of the nominalization–relativization syncre-

tionally not assigned to Bodic, but LaPolla (2003a) suggests that this group should be classified together with some (but not all) Central Himalayish languages in a newly defined ‘Rung’ family. Further, Bodish and rGyalrong show interesting similarities in their relational morphology, much more than either group does with Central Himalayish — or, indeed, many Central Himalayish subgroups do with each other (see Noonan 2005b). Unfortunately, the basic groundwork which would establish or contradict many of the relationships proposed in the Appendix has simply not been done.

Nominalizations in Bodic languages 221

tism in Bodic (Section 3); (iii) sources of nominalizers (Section 4); (iv) relativization with the genitive (Section 5); and (v) innovations in the system of nominalizations (Section 6).

2. The areal context In discussing the history of the extended uses of nominalizations in Bodic languages, it is well to put the matter in its appropriate areal context. As we will see, the use of nominalizations beyond their core uses is fairly widespread in Asia. Going back at least to Jakobson (1931), scholars have recognized a special relationship between many of the languages along the axis of Asia, ranging from Siberian languages in the north to South Asia, representing a speech area referred to by Masica (1976) as Indo-Altaic. Within this grouping would be included the Turkic and Mongolic languages, some Siberian languages (e.g. Tungusic, Yeniseian, Yukaghir), Korean, Japanese, Tibeto-Burman (with a more marginal presence of the Sino- component of Sino-Tibetan), Uralic, Burushaski and Dravidian.2 The Indo-European languages found in this region opt out of many of its characteristics: Iranian to a considerable degree (Stilo 2005), Indic to a lesser degree, though modern Indic languages have continued to evolve toward more convergence with the other languages in this speech area.3 A number of features characterize IndoAltaic, though, as one would expect in an assemblage this large and covering so much territory, there are also many important isoglosses restricted only to parts of the whole. However, one characteristic which unites many of the languages in this speech area is a special relationship between nominalization and attribution, one that is either a prominent feature of the contemporary syntax or one that can be reconstructed for an earlier stage in the language. I will refer to this relationship as the nominalization–attribution syncretism and mean by this the state of affairs whereby a morphological marker which functions to signal nominalizations is identi2. Cultural features along with linguistic features traversed this vast territory in ancient times. Eliade (1964) describes the characteristics of ‘Inner Asian Shamanism,’ a religious complex which once pervaded Siberia, Central Asia and the Himalayas. Elements of this tradition have been preserved in the Himalayas to varying degrees, as documented by Hitchcock (1967) and Watters (1975). 3. Among the features which distinguish Indic languages from other Indo-Altaic languages are the use of finite clauses in subordination, (secondary) tense distinctions in non-finite verbals and coordination of sequential clauses in narration (as opposed to the exclusive use of converbs for this purpose). For all these features there is, as noted, a trend toward convergence with the Indo-Altaic type.

222 Michael Noonan

cal to one which functions as a marker of the genitive and/or relative clauses.4 In Japanese, for example, the particle no signals both genitives (1a) and nominalizations (1b).

(1) Japanese a. Genitive watakushi no namae I gen name ‘my name’ b. Nominalization in the strict sense haha ga kaet-te kuru no o mat-te ori-mas-u. mother sbj return-cvb come nmlz do wait-cvb exist-pol-prs ‘I am waiting for my mother to return.’

Of more central interest to us here are languages illustrating a special type of nominalization–attribution syncretism, namely the nominalization–relativization syncretism. In Mongolian, for example, nominalized clauses (2a) are also used adnominally, i.e. as relative clauses (2b) (Binnick 1979).

(2) Mongolian a. Nominalization in the strict sense bata-yn türgen sajn bol-x-yg bid bodo-ǰ baj-na. Bata-gen soon well become-nmlz-acc we think-ipfv be-prs ‘We think Bata will be well soon.’ b. Relative clause ene xüü-gijn suu-ǰ baj-x xot this boy-gen live-ipfv be-nmlz town ‘the town in which this boy lives’

Yukaghir, a Siberian language, shows a similar pattern, whereby nominalizations (3a) can also be used to form relative clauses (3b) (Maslova 1999).

(3) Yukaghir a. Nominalization in the strict sense omo-s' tet qamie-d'e–l met-in. good-intr.3sg you help-det-nmlz I-dat ‘It is good that you have helped me.’

4. In claiming that the nominalization–relativization syncretism characterizes this speech area, I am not claiming that it is unique to the area. Examples can be found elsewhere, for example in English, where the subordinator that is used with both relative clauses and finite complement clauses, the latter filling nominal slots. In European languages, however, this syncretism seldom extends to non-finite nominalizations or adnominals, whereas that is what one regularly finds in the Indo-Altaic speech area.

Nominalizations in Bodic languages 223

b.

Relative clause odu-pe modo-l jalhil-pe-gi Yukaghir-pl live-nmlz lake-pl-poss ‘the lake where the Yukaghirs lived’

Burushaski, a language isolate spoken in the Western Himalayas, instantiates this pattern also, with (4a) illustrating a nominalization and (4b) illustrating an adnominal use of a nominalization (Anderson 2002).

(4) Burushaski a. Nominalization in the strict sense buʈ muškíl bilá góo-ltir-as. very difficult be.iv 2sg-show-nmlz ‘It is very difficult to show (it) to you.’ b. Relative clause c há-aʈe oó-ruʈ-as huk post-supess neg-sit-nmlz dog ‘a dog which doesn't sit at its post’

The Tibeto-Burman languages, and in particular the Bodic branch of TibetoBurman, are prime exemplars of nominalization–relativization syncretism. Except for some Western Himalayish languages which have adopted Indo-European-style relative clause constructions replete with relative pronouns and finite verbs, the Bodic languages are fairly consistent in using nominalizations adnominally. Below are some examples from Chantyal, illustrating a nominal (5a) and an adnominal (5b) use of nominalizations.

(5) Chantyal a. Nominalization in the strict sense nɦi-sә reysi thũ-wa a-kham mu. we-erg raksi drink-nmlz neg-be.able be.npst ‘We aren’t able to drink raksi.’ b. Relative clause mәnchi-sә ca-si-wa gay-ye sya person-erg eat-ant-nmlz cow-gen meat ‘the beef that the person ate’

Because adjectives align with verbs in these languages, adjectives are also prototypically found with the nominalizing affix; in the Bodic languages, the genitive, however, is always distinct. The nominalization–relativization syncretism found in the Bodic languages is thus a subspecies of a larger phenomenon characterizing the Indo-Altaic speech

224 Michael Noonan

area, though there are indeed special features of this syncretism in the Bodic languages which deserve special attention and will be discussed later in this paper. The fact that the nominalization–relativization syncretism is found in a number of neighbouring language families strongly suggests that it is among the set of typological features which are relatively easily susceptible to areal influence, implying that its presence, or even its particular manifestations, cannot be used as evidence for genetic relatedness. This feature thus contrasts with certain other grammatical features which, as argued by Nichols (1992), Bickel (2003) and Bickel and Nichols (2003), are much more likely to be genetically stable.

3. Additional features of the nominalization–relativization syncretism in Bodic In the Bodic languages, the nominalization–relativization syncretism prototypically involves a set of additional uses for nominalizations beyond nominalization in the strict sense and relative clauses. One of these, found throughout Bodic, is the use of nominalizations in verbal periphrasis, a complex topic which will not be discussed in the present paper. Of the additional uses of nominalizations, two are common enough to discuss in this context: (i) agent/patient nominals and (ii) the use of nominalized clauses as main clauses. Agent and patient nominals are very frequently encountered as components of the nominalization–relativization syncretism. Chantyal again can be used to illustrate.

(6) Chantyal a. Agent nominal na-sә reysi thũ-wa-ye naku khway-kәy mu. I-erg raksi drink-nmlz-gen dog feed-prog be.npst ‘I am feeding the raksi-drinker’s dog.’ b. Patient nominal cә lәra pari-wa-ma gәtilo lәra a-ta-si-n that strip make.happen-nmlz-pl good strip neg-become-ant-supp tә. fact ‘Those strips that I made might not have become good strips.’ (I110)5

5. For the Chantyal examples, notations like (I110) refer to the clause numbers of the examples in published discourses and can be found either in Noonan (2005a) or Noonan (1999), discourses A-U and V-Z respectively.

Nominalizations in Bodic languages 225

Agent and especially patient nominals are often referred to as internally headed relative clauses in the literature on Bodic languages. In the context of some languages, that might be a reasonable analysis, but for many or perhaps most Bodic languages, these constructions are probably best analysed as agent and patient nominals. For instance, in (6b), the patient nominal cә lәra pari-wa-ma, ‘those strips that I made,’ is treated as a single nominal and the nominalized verb receives the plural suffix -ma.6 In Bodic languages, when nominalizations appear as main clauses, the typical effect is one of mirativity, i.e. the sense that the predication so expressed is in some sense surprising, contrary to expectation, or in some way exasperating. The following examples from Chantyal, all taken from spoken narratives, illustrate this sense.

(7) Chantyal ci-wa dә sit-nmlz fact ‘I’ll stay!’ (U202)

(8) Chantyal aay, kattay talay tha-i nә a-tha-wa tane gosh definitely cut-ant foc neg-cut-nmlz affirmation ‘Gosh, it did not even cut, right!’ (I56)

(9) Chantyal bɦalu nә puli-puli la-wa ro bear foc wiggle-wiggle do-nmlz hearsay ‘Bear wiggled!’ (L21)

(10) Chantyal bәnnu-ye nal tato ta-si-wa gun-gen barrel hot become-ant-nmlz ‘The barrel of the gun had become hot!’ (R29) Ebert (1994) discusses the use of main clause nominalizations for questions and answers to questions, as in the following example from Athpare. (11) Athpare a-nis-u-es-u–e-n-i ni-ni-ŋ-get-ni-ŋ-na. 2-see-3.pat-pfv-3.pat-pst-nmlz-q see-neg-1sg-aux-neg-1sg-nmlz ‘Have you seen it?’ ‘I haven’t seen it.’ 6. The patient nominal in (6b) cannot be analysed as consisting of a head followed by a postmodifier, since postmodifiers do not otherwise occur in Chantyal.

226 Michael Noonan

Bickel (1999) and Watters (2002) present detailed discussions of the phenomenon for the two Bodic groups, Kiranti and Kham, respectively. These two additional uses of nominalizations point to two essential features of nominalizations in Bodic: they express predications, hence their ability to appear as main clauses, yet they are also noun phrases, and so they can appear as agent and patient nominals. The latter feature also points to the way these nominalizations figure in adnominal modification: at least in Bodic, they are probably best viewed as noun phrases juxtaposed to the noun phrases that they are modifying, the two noun phrases constituting, therefore, a sort of appositional structure. Thus, if we take the basic meaning of reysi thũ-wa as the agent nominal ‘drinker of raksi,’ then the expression in (12) below (12) Chantyal reysi thũ-wa mәnchi raksi drink-nmlz person ‘the person who drinks raksi’ can be understood as being at some level simply an appositive: ‘the drinker of raksi,’ ‘the person,’ as shown in (13). (13) [agent/patient nominali]NP [nouni] NP The relative clause interpretation is arrived at inferentially in a manner similar to the way compounds are understood. At least in the context of the Bodic languages, this would account for the range of uses of these constructions.

4. Sources of nominalizers Except for rGyalrong, all branches and sub-branches of Bodic provide evidence for a nominalizing suffix *pa which can be traced back with this function to Proto-Bodic (DeLancey 2005). On the basis of the comparative evidence, we can suppose that the nominalization–relativization syncretism probably existed at the Proto-Bodic stage in more or less its present form and that *pa was the major, and possibly the exclusive, marker of nominalization. At that stage there were likely no distinctions of tense and aspect: where these occur today, they are secondary (DeLancey 2005). The origin of *pa cannot be definitely determined at this stage, but there is some evidence for the kind of entity from which it might have been derived. In several languages in different branches of Bodic is attested a form also reconstructed to *pa which must go back to Proto-Bodish too. This form, which is traditionally distinguished from nominalizing *pa, is found as a gender marking suf-

Nominalizations in Bodic languages 227

fix together with forms reconstructable to *mo, *ma and *pho. All of these forms can be seen in the following set from the Tamangic language Nar-Phu (Noonan 2003c). (14) Ungendered noun rɦo ‘bond friend’ nôkyu ‘dog’ ræ ˆ ‘goat’ tæ [<*rta] ‘horse’

Male Castrated male rɦo-pε nôkyu-pho râ-pho râ-pε pho-tyεn phô-rtε

Female rɦo-mε nôkyu-mo râ-mo mô-rtε

Nar-Phu -pε derives from *pa, -mε from *ma, -mo from *mo and -pho from *pho. Note that the latter two can appear either before or after the ungendered noun. Cognate forms are found in a number of Bodic languages including Classical and Modern Tibetan (Beyer 1992; Denwood 1999) and the Kiranti languages (Ebert 1994). Some of these forms are clearly related to freely occurring monosyllabic words found in Classical Tibetan: *ma to ‘mother,’ *mo to ‘female, she,’ *pho to ‘male.’ In contrast, there is no freely occurring *pa to which either the gendered form or the nominalizer could plausibly be related, but it is certainly likely that the gendered *pa derived from a similar sort of gendered noun. If the source of the gendered *pa was a noun, as indeed it probably was, then there could well be a connection between the two *pa’s. Where other nominalizing suffixes have developed in Bodic, they typically have a source in a generic noun. DeLancey (2005) shows that, of the four common nominalizing suffixes in Lhasa Tibetan (one of which is derived from *pa), two have origins in nouns: -sa, the locative-dative nominalizer, which derives from the widely attested Proto-Bodic etymon *sa ‘earth, soil,’ and the agentive nominalizer -mkhan, which appears to derive from a noun meaning ‘one who knows a thing thoroughly, making a trade or profession of it.’7 Similarly, the widely attested nominalizer reconstructable to *mi (found, for example, in Kiranti languages and Western Himalayish and in nonBodic Qiangic; cf. LaPolla 2003b) clearly derives from *mi ‘person.’ Therefore, it is possible that both *pa’s derived from the same noun, and the gendered noun points to the sort of noun it might have been. Some other Bodic nominalizers may be traced to combinations of older nominalizers with other morphological material. One possibility is the widely attested Bodic sequential converbal suffix *si, which may come to signal past senses in nominalizations and which, over time, may merge phonologically with a nominalizer. In Chantyal, for example, *pa is realized as the suffix -wa, already illustrated in a number of examples. The converbal suffix -si, still found in that role in 7. The stem occurs also in the noun mkhan-po ‘clerical teacher, professor, doctor of divinity, abbot,’ and is etymologically related to the adjective mkhas-pa ‘skilled, skillful.’

228 Michael Noonan

Chantyal, occurs with -wa to produce a tense distinction in nominalizations, as shown in the comparison between (15) and (16) below. (15) = (12) Chantyal reysi thũ-wa mәnchi raksi drink-nmlz person ‘the person who drinks raksi’ (16) Chantyal reysi thũ-si-wa mәnchi raksi drink-ant-nmlz person8 ‘the person who drank raksi’ The combination -si-wa is commonly pronounced [šo], though the fuller pronunciation is also possible. A number of Bodic languages have nominalizers or attributives which seem to derive from this same combination of morphemes, e.g. Sunwar -šo (DeLancey 1992) and Magar -cyo [čo] (Grunow-Harsta, in preparation).

5. Relativization with the genitive In many of the languages classified as Tibetic in the Appendix, a nominalization, when used attributively with a noun, is accompanied by the genitive. It may always be so accompanied, as in the Tamangic language Gurung (Glover 1974): (17) Gurung cá pxra-bá-e mxi jaga that walk-nmlz-gen person pl ‘those walking people (= sentries)’ (18) Gurung bana·-r-bá-e sĩ forest-loc-nmlz-gen wood ‘trees from the forest’ (19) Gurung dxĩ-r-bá-e ax-chyã·-bá-e gara· gadi house-loc-nmlz-gen neg-good-nmlz-gen influences ‘the evil influences in the house’ 8. Here -si is glossed ‘anterior’ since its function in examples like this is to provide a relative past, i.e. anterior, tense.

Nominalizations in Bodic languages 229

or it may only sometimes occur with the genitive, as in Classical Tibetan, where preposed relatives have the genitive (cf. examples (20) and (22) below), while postposed relatives do not (Beyer 1992), as in (21) and (23):9 (20) Classical Tibetan bla-ma-s btul-ba-i bgegs lama-erg tame-nmlz-gen demon ‘the demon which the lama tamed’ (21) Classical Tibetan bgegs bla-ma-s btul-ba demon lama-erg tame-nmlz ‘the demon which the lama tamed’ (22) Classical Tibetan mgyogs-po-i rta fast-nmlz-gen horse ‘fast horse’ (23) Classical Tibetan rta mgyogs-po horse fast-nmlz ‘fast horse’ Finally, in Tibetan some nominalizers, e.g. -mkhan, never occur with the genitive. The addition of the genitive to the nominalization has the effect of subordinating it to the head. However, it is not entirely clear why only preposed — but not postposed — nominalizations should be so marked and therefore formally subordinated, except that in Bodic genitives precede their heads. DeLancey (2005) suggests that the construction with the genitive should be reconstructed to the common ancestor of the Tibetan Complex and Tamangic, a position which I also took in Noonan (1997). Nevertheless, I now believe that there are several pieces of evidence which caution against this conclusion. First, we now have much better data on the Tamangic languages than we did in the mid1990s and the picture now appears to be far more complicated. The data for the languages in the Tamangic group are summarized below: a. Chantyal: it never uses the genitive. b. Thakali: Georg’s (1996) grammar makes no mention of the genitive with rela9. The postposed relatives in (21) and (23) are probably best considered to be patient nominals. The lack of the genitive in such cases is, then, explained by the fact that the construction is not really one of modification.

230 Michael Noonan

tive clauses, while Hari and Maibaum (1970) assert that the genitive is optional. It should be noted, however, that these scholars investigated different dialects of Thakali. c. Seke: Isao Honda (personal communication) reports that the genitive is optional with nominalizations. d. Manange: Hildebrandt (2003) reports that relative clauses are formed with the nominalizer -pʌ [<*pa], but notes that “at times in relativized contexts the vowel quality of /ʌ/ fronts and sounds like [pe] or [pœ].” DeLancey (2005) interprets this difference to reflect the addition of the genitive: -pʌ-i > -pe. One problem with this interpretation is that the genitive in Manange is -lʌ, not -i. It is probable that, at one point, Manange had a genitive in -i, the modern genitive in -lʌ deriving historically from the dative, often -ra in other Tamangic languages. However, the data from closely related Nar-Phu suggest another interpretation of Manange -pe. e. Nar-Phu: in this language, relative clauses with present senses use the nominalizer ‑pε [<*pa], while those with past senses use -pi. This could be the nominalizer and the genitive -ye. However, this could also be the nominalizer and the morpheme -i which produces past tense interpretations in the copula, as in mû-i, the indirect (i.e. non-witnessed) past of the copula. The source of this -i is not clear,10 but it is not likely the genitive. f. Tamang: the examples in Taylor’s (1973) article suggest that the genitive may be used with relative clauses in Western Tamang, while Mazaudon (2003) states that the genitive is not found in Eastern Tamang. In turn, Varenkamp (2003), also discussing Eastern Tamang, maintains that “it is most common to express the relative with the nominalization only,” i.e. not with the genitive, thus implying that the genitive may also be used. g. Gurung: Glover’s (1974) grammar states that the genitive is always used with relative clauses, making Gurung, then, the only Tamangic language to use the genitive consistently. In sum, within the Tamangic group, the genitive seems firmly established only in Gurung; elsewhere it is either optional or is not used. The other branches of Tibetic present a mixed picture. Ghale (Smith 1999), for example, uses no case marking with nominalizations used adnominally, while Tshangla (Andvik 2003) uses the dative-locative, not the genitive. It should be noted that, in the latter language, the dative–locative and the genitive may both be expressed by -ga; the dative-locative, however, has an alternative form -gu, which 10. The best guess is that this past tense -i is a reduction of -ci, the past tense morpheme on main clause verbs. In other Tamangic languages (e.g. Chantyal), the cognate form reduces to -i under certain conditions.

Nominalizations in Bodic languages 231

is not shared with the genitive. The two cases can, therefore, be formally distinguished and it is the dative-locative that marks nominalizations used adnominally. Nonetheless, Tshangla is close to the pattern found in the Tibetan Complex. It is of course possible that the Tamangic languages formerly employed the genitive and subsequently lost it. It is also possible that, as a group, they never had it, that it is an innovation in the Tibetan Complex and that the Tamangic languages which have the feature have acquired it through areal diffusion from the Tibetan Complex, as did Tshangla. It should be noted in this regard that the Tamangic speaking peoples were in times past under the cultural influence of Tibet and many adopted either the Bon religion or Tibetan Buddhism from that source. Therefore, prior to the Gurkhali conquest of Nepal, whatever literacy the Tamangic peoples possessed was likely to be in Tibetan. In fact, these languages have numerous lexical borrowings from Tibetan, especially those whose speakers are still primarily Tibetan Buddhists, such as Manange, Nar-Phu and Seke.

6. Innovations in the system of nominalizations Innovations in nominalizations have primarily been of three types: (i) elaboration of the categories coded by nominalizers themselves; (ii) extension of the nominalizer–relativization syncretism into new functions; and (iii) the elimination of the nominalization–relativization syncretism. These three types of innovations are discussed in turn below. Elaboration of the categories coded by nominalizers may take a number of forms. First, it may involve the increase in the number of nominalizer morphemes, each of them specialized for a particular sort of lexical meaning. As noted in Section 4 above, Lhasa Tibetan, for example, has four common nominalizer suffixes specialized for meaning. Second, a number of languages have innovated by introducing tense–aspect distinctions (e.g. Chantyal, Nar-Phu). Finally, a third type of elaboration involves the creation of person–number agreement systems for nominalizations. This is characteristic of the Central Himalayish languages, which, even if they are not a clear-cut genetic grouping, nonetheless share a similar typological profile, or at least they did until Indo-European Nepali began to exert considerable influence on the development of these languages over the course of the last two or three centuries (Noonan 2003a). Ebert (1993, 1999) has argued that the pattern of complex subordinate structures (nominalizations and converbs) in the Kiranti languages involving complex person–number agreement patterns as well as tense–aspect distinctions are the result of an ancient contact zone linking the Central Himalayish languages with the Munda and North-Central Dravidian languages. This pattern can also be

232 Michael Noonan

seen in Kham, located at the opposite end of the Central Himalayish range, with the Central Himalayish languages in between exhibiting it to a greater or lesser extent.11 Concerning the extension of the nominalizer–relativization syncretism into new functions, a striking example can be found in Chantyal. In this language the nominalizer morpheme -wa may be suffixed onto adverbs, relative words, locative nouns and case-marked nouns when they are used as modifiers of nouns. As these are discussed in some detail in Noonan (1997), I will simply present a few examples here. (24) Chantyal mәŋgәle-ri-wa mәnchi-ma Mangale-loc-nmlz person-pl ‘people from Mangale’ (25) Chantyal syãlkhәrkә-õra-wa mәnchi Syalkharka-circ-nmlz person ‘person from around Syalkharka’ (26) Chantyal tәyla-wa saka yesterday-nmlz ancestor ‘yesterday’s ancestors’ (V101) (27) Chantyal yәwta dyammәr-ma citro-ma-ye ɦә-sәrә-wa phәlphul-ma-ye one dogwood-pl barberry-pl-gen that-manner-nmlz fruit-pl-gen rәksi raksi ‘raksi from some fruits like dogwood and barberry’ (Q329) As examples (28) and (29) show, this use of -wa contrasts with the genitive. (28) Chantyal ram-siŋ-wa photo Ram-com-nmlz photo ‘Ram’s photo’ (i.e. a photo Ram owns) 11. The issue of the dating and origin of Sino-Tibetan person-number systems has been the object of vigorous debate. See, for example, DeLancey (1989), LaPolla (1992, 1994, 2003a) and van Driem (1993).

Nominalizations in Bodic languages 233

(29) Chantyal ram-ye photo Ram-gen photo ‘Ram’s photo’ (i.e. a photo Ram owns/a photo taken of Ram) Suffixation of -wa is recursive, the limitations being those of sense and processability. Example (30) below shows a non-relative attributive formed from a casemarked attributive nominal. (30) Chantyal mәŋgәle-ri-wa-ma-siŋ-wa photo Mangale-loc-nmlz-pl-com-nmlz photo ‘the photo belonging to the people from Mangale’ Example (31) shows that this form may also fill a nominal slot. In other words, an attributive nominal can be built off of another attributive nominal. (31) Chantyal na-sә mәŋgәle-ri-wa-ma-siŋ-wa-ra dekhә-i. I-erg Mangale-loc-nmlz-pl-com-nmlz-dat show-pfv ‘I showed it to the owners from Mangale.’ While the generalization of the nominalizer as a general marker of attribution has proceeded further in Chantyal than in any other language known to me, other examples can be found. Ebert (1994), for example, describes nominalizer affixes in use with deictic and locative terms (i.e. markers of vertical position) in the Kiranti languages. Finally, as regards the elimination of the nominalization–relativization syncretism, this process has happened partly as a result of contact, partly as a result of purely internal evolution. As already noted (cf. Section 2 above), in the Western Himalayish languages relative clauses take the form of correlative constructions (e.g. who believes my argument, that person will be enlightened) and include relative pronouns, some borrowed from Indo-European, some derived from native word stock, mostly from interrogative forms. Western Himalayish aside, not many Bodic languages have lost the nominalization–relativization syncretism. Magar is an example of a language which has innovated a generalized attributive suffix -cyo [čo], used for both simple adjectives and relative clauses, contrasting with a nominalizing suffix -ke, used for both simple and complex nominalizations (Grunow Harsta, in preparation).

234 Michael Noonan

7. Summary The foregoing discussion has shown that the nominalization–relativization syncretism is a characteristic of a wide region — the Indo-Altaic speech area — of Asia. It is a feature which can spread relatively easily by language contact, as can particular manifestations of this feature, such as the addition of the genitive for adnominal use. The nominalization–relativization syncretism is a prominent characteristic of the Bodic languages, which, in general, have either preserved the original system or elaborated it, incorporating new features (agent/patient nominals, main clause nominalizations). In addition, the nominalizers themselves are often elaborated, with new semantic distinctions introduced (tense–aspect, person–number distinctions, specialized nominalizers, etc.). As for the source of new nominalizers, these are generic nouns.

Abbreviations 1 2 3 acc ant aux circ com cvb dat det do erg fact foc gen intr ipfv

first person second person third person accusative anterior auxiliary circumlative comitative converb dative determiner direct object ergative factitive focus genitive intransitive imperfective

iv loc neg nmlz npst pat pfv pl pol poss prog prs pst q sbj sg supess supp

concord class IV locative negation nominalizer non-past patient perfective plural polite possessive progressive present past question marker subject singular superessive suppositional

References Anderson, Gregory. 2002. Case-marked clausal subordination in the Burushaski complex sentence. Studies in Language 26(3): 543–572. Andvik, Eric. 2003. Tshangla. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 439–455. London: Routledge. Beyer, Stephan. 1992. The Classical Tibetan language. Albany NY: State University of New York Press.

Nominalizations in Bodic languages 235

Bickel, Balthasar. 1999. Nominalization and focus in some Kiranti languages. In Yogendra P. Yadava & Warren W. Glover (eds.), Topics in Nepalese linguistics, 271–296. Kathmandu: Royal Nepal Academy. Bickel, Balthasar. 2003. Referential density in discourse and syntactic typology. Language 79(4): 708–736. Bickel, Balthasar & Johanna Nichols. 2003. Typological enclaves. Paper presented at the 5th Biannual Conference of the Association for Linguistic Typology, Cagliari, September, 2003. Binnick, Robert. 1979. Modern Mongolian: A transformational approach. Toronto: University of Toronto Press. DeLancey, Scott. 1989. Verb agreement in Proto-Tibeto-Burman. Bulletin of the School of Oriental and African Studies 52: 315–333. DeLancey, Scott. 1992. Sunwar copulas. Linguistics of the Tibeto-Burman Area 15(1): 31–38. DeLancey, Scott. 2005. Relativization and nominalization in Bodic. Tibeto-Burman linguistics: Proceedings of the 28th Annual Meeting of the Berkeley Linguistics Society, 5–72. Berkeley CA: Berkeley Linguistics Society. Denwood, Philip. 1999. Tibetan. Amsterdam: John Benjamins. Ebert, Karen. 1993. Kiranti subordination in the South Asian areal context. In Karen Ebert (ed.), Studies in clause-linkage, 83–110. Zürich: ASAS-Verlag. Ebert, Karen. 1994. The structure of Kiranti languages. Universität Zürich: Arbeiten des Seminars für Allgemeine Sprachwissenschaft. Ebert, Karen. 1999. Non-finite verbs in Kiranti languages. In Yogendra P. Yadava & Warren W. Glover (eds.), Topics in Nepalese linguistics. Kathmandu: Royal Nepal Academy. Eliade, Mircea. 1964. Shamanism: Archaic techniques of ecstasy. Princeton NJ: Princeton University Press. Georg, Stefan. 1996. Marphatan Thakali. Munich: Lincom. Glover, Warren W. 1974. Sememic and grammatical structures in Gurung (Nepal). Norman OK: Summer Institute of Linguistics. Grunow-Harsta, Karen. In preparation. Magar grammar. PhD dissertation, University of Wisconsin-Milwaukee. Hari, Annemarie & Anita Maibaum. 1970. Thakali texts. In F. K. Lehman (ed.), Occasional papers of the Wolfenden Society on Tibeto-Burman linguistics. Vol. III: Tone systems of Tibeto-Burman languages of Nepal. Part III. Texts I, 165–306. Urbana IL: University of Illinois. Hildebrandt, Kristine. 2003. A grammar and glossary of the Manange language. In Carol Genetti (ed.), Tibeto-Burman languages of Nepal: Manange and Sherpa, 2–189. Canberra: Pacific Linguistics. Hitchcock, John. 1967. A Nepalese shaman and the classic inner Asian tradition. History of Religions IV(2): No. 2. Jakobson, Roman. 1931. K xarakteristike evrazijskogo jazykovogo sojuza. Selected Writings 1: 144–201. The Hague: Mouton de Gruyter. LaPolla, Randy. 1992. On the dating and nature of verb agreement in Tibeto-Burman. Bulletin of the School of Oriental and African Studies 55(2): 298–315. LaPolla, Randy. 1994. Parallel grammaticalizations in Tibeto-Burman: Evidence of Sapir’s drift. Linguistics of the Tibeto-Burman Area 17(1): 61–80. LaPolla, Randy. 2003a. Overview of Sino-Tibetan morphosyntax. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 22–42. London: Routledge LaPolla, Randy. 2003b. Qiang. In Graham Thurgood and Randy LaPolla (eds.), The Sino-Tibetan languages, 573–587. London: Routledge:

236 Michael Noonan Masica, Colin P. 1976. Defining a linguistic area (South Asia). Chicago IL: University of Chicago Press. Maslova, Elena. 1999. A grammar of Kolyma Yukaghir. Habilitationsschrift: Universität Biele feld. Mazaudon, Martine. 2003. Tamang. In Graham Thurgood & Randy LaPolla (eds.), The SinoTibetan languages, 291–314. London: Routledge. Nichols, Johanna. 1992. Language diversity in space and time. Chicago IL: University of Chicago Press. Noonan, Michael. 1997. Versatile nominalizations. In Joan Bybee, John Haiman & Sandra A. Thompson (eds.), Essays on language function and language type. In honor of T. Givón, 373–394. Amsterdam: John Benjamins. Noonan, Michael. 1999. Chantyal dictionary and texts. With Ram Prasad Bhulanja, Jag Man Chhantyal & William Pagliuca. Berlin: Mouton de Gruyter. Noonan, Michael. 2003a. Recent language contact in the Nepal Himalaya. In David Bradley, Randy LaPolla, Boyd Michailovsky & Graham Thurgood (eds.), Language variation: Papers on variation and change in the Sinosphere and in the Indosphere in honour of James A. Matisoff. Canberra: Pacific Linguistics. Noonan, Michael. 2003b. Chantyal. In Randy LaPolla & Graham Thurgood (eds.), The SinoTibetan languages, 315–335. London: Routledge. Noonan, Michael. 2003c. Nar-Phu. In Randy LaPolla & Graham Thurgood (eds.), The SinoTibetan languages, 336–352. London: Routledge. Noonan, Michael. 2005a. Chantyal discourses. With Ram Prasad Bhulanja. Himalayan Linguistics Archive 2: 1–254. Noonan, Michael. 2005b. Patterns of development, patterns of syncretism of relational morphology in the Bodic languages. Paper presented at the 17th International Conference on Historical Linguistics, Madison, July 2005. Smith, Holly. 1999. Ghale grammar. MA thesis, University of Wisconsin-Milwaukee. Stilo, Donald. 2005. Iranian as buffer zone between the universal typologies of Turkic and Semitic. In Éva Ágnes Csató, Bo Isaksson & Carina Jahani (eds.), Linguistic convergence and areal diffusion: Case studies from Iranian, Semitic and Turkic, 35–63. London: RoutledgeCurzon. Taylor, Doreen. 1973. Clause patterns in Tamang. In Austin Hale & David E. Watters (eds.), Clause, sentence and discourse patterns in selected languages of Nepal. Part II: Clause, 81–174. Norman OK: Summer Institute of Linguistics. Van Driem, George. 1993. The Proto-Tibeto-Burman verbal agreement system. Bulletin of the School of Oriental and African Studies 56(2): 292–334. Varenkamp, Bryan. 2003. A look at -ba in Central Eastern Tamang. In Tej Ratna Kansakar and Mark Turin (eds.), Themes in Himalayan linguistics, 219–232. Heidelberg: South Asia Institute. Watters, David E. 1975. Siberian shamanistic traditions among the Kham-Magars of Nepal. Contributions to Nepalese Studies 2(1): 123–168. Watters, David E. 2002. A grammar of Kham. Cambridge: CUP.

Chepang Hayu Sunwar

Classical N. Gam Kham Dolakha N. Maikot Kham Jyapu Newari Nishi Kham Kathmandu N. Sheshi Kham Thakale Kham Kaike Magar Raji

Thangmi Baraam

ThangmiBaraam

West Himalayish

Athpare Byangsi Bantuwa Chaudangsi-Byangsi Belhare Chhitkuli Camling Darmiya Dumi Gahri Khaling Kanashi Limbu Kinnauri Thulung Marchha Pattani Tinnani Old Zhangzhung New Zhangzhung

Kiranti

Ghale

Ghale

Bodish

Appendix. Proposed genetic relationships within the Bodic section of Tibeto-Burman

HayuChepang

Kham-Magar

Newari

Central Himalayish

Bodic

Chantyal Gurung Manange Nar-Phu Tamang Thakali Seke

Tamangic

Tibetic

Bumthang Menba

East Bodish

Balti Central Monpa Classical Tibetan Dura Dzongkha Jad Jirel Ladakhi Leh Lhasa Tibetan Nubra Nyam-Kad Purki Sham/Purik Sherpa Spiti Tod

Central Bodish

Tibetan Complex

Tshangla

Tshangla

Cogtse Gyarong Caodeng rGyalrong

rGyalrong

chapter 9

On the rise and fall of Korean nominalizers* Seongha Rhee Hankuk University of Foreign Studies This paper addresses the origin and development of the system of Korean nominalizers from Old to Present-day Korean, paying special attention to their sources, the semantic changes they underwent over time, the competition between various nominalizers for the same functional domain, and the subsequent specialization of some of the forms. The study shows that, in order to avoid functional overlaps, certain nominalizers have become constrained to appear in particular contexts (e.g. -ki and -ci, restricted to affirmative and negative constructions respectively) or to express different levels of illocutionary force (e.g. -m, -ki, -ci and -kes when used as sentential end-markers). The paper also shows how certain nominalizers which overlapped functionally lost their original nominalizing function and acquired new uses in related functional spaces (e.g. the adnominalizers -n and -l, derived from old nominalizers).

1. Introduction In the history of the Korean language there have been numerous nominalizers, developed from various and sometimes obscure sources. The multiplicity of forms, the diversity of grammatical functions and delicate differences in the morphosyntactic and functional behaviour of these nominalizers have attracted the attention of scholars, who have attempted to present generalizations of these phenomena. However, most authors have addressed these issues from a largely synchronic viewpoint, and as a consequence their studies lack expositions of the emergence of the various nominalizers and the interplay among them. The main objective of the current paper is to fill this gap by tracing the development of these nominalizers. Hence, the paper deals with the rise and fall of Korean nominalizers: how they emerged as grammatical forms, how they were used in historical data, how their meanings changed en route and how some of them stopped functioning as nominalizers. The paper also aims at illustrating the emergence of new grammat* This research was supported by the Hankuk University of Foreign Studies Research Fund. Special thanks go to Professors Elizabeth Closs Traugott, Hyun Jung Koo, Foong Ha Yap and Marilyn Plumlee for providing me with valuable comments and directing my attention to relevant literature, and to my research assistant Sujin Eom. All remaining errors, however, are mine.

240 Seongha Rhee

ical functions closely related to nominalization, due to functional overlap among the nominalizers; it discusses additional functions acquired by some of the former nominalizers, specialized functions of nominalization which the present-day nominalizers began to carry and some of the entirely new functions which they obtained as a result of functional extension.

2. Nominalizers in history A brief historical survey of nominalizers in Korean shows that in the Old Korean period, which extends up to the fourteenth century, six nominalizers are attested, namely -l, -m, -n, -i, -ki and -ti. An areal or genetic relationship is evident from the fact that some of these are also attested in other languages, such as Turkish, Mongolian and Manchu. For example, -m, -l and -n are nominalizers in Turkish (Kang 1976) and Mongolian (Poppe 1954), while -n is the most common nominalizer in Manchu (Möllendorff 1892) and in Old Mongolian (Poppe 1955). Among these six nominalizers, -m was the most widely used formative from Old Korean through Middle Korean (fifteenth and sixteenth centuries), but started to be gradually replaced by -ki in the Early Modern Korean period (seventeenth to nineteenth centuries). The nominalizer -m, in turn, was challenged by the short-lived -ti (and -ci, the later variant of -ti). Coexistence of these forms led to functional specializations, examples of which are the use of -ti and -ci in negative constructions and of -ki elsewhere (cf. Section 4.3 below). Despite gaining supremacy over -ti and -ci, the formerly dominant nominalizer -ki is again being gradually overridden by the periphrastic construction involving the nominal kes ‘thing’ in Modern Korean (twentieth and twenty-first centuries) (cf. Kwon 1995). Chronologically, then, nominalizers followed the order [-m > -ki > -kes] as regards productivity. The current predominance of -kes is well illustrated by the following two facts: on the one hand, in Modern Korean it is among the most frequently used words (Choi 1994) and, on the other hand, it is one of the most versatile morphemes; in addition to its primary role as a nominalizer, it is used in complementation, clausal connection and sentential ending with diverse tense, aspect and modality functions. The present-day dominant position of -kes is the result of a centuries-long competition for supremacy among nominalizers. The remaining three Old Korean nominalizers, -l, -n and -i disappeared early in the history of the language as productive formatives. Thus, -l and -n are now used exclusively as adnominalizers,1 whereas -i is an unproductive derivational 1. An adnominalizer is a grammatical category of verbal suffixes which signal that the host verb modifies a noun. Three forms belong to this category in Present-day Korean: anterior adnominalizer (-n), prospective adnominalizer (-l) and simultaneous adnominalizer (-nun), which,

On the rise and fall of Korean nominalizers 241

Old Korean 14th c.

Middle Korean 15th c.–16th c.

Early Modern Korean 17th c.–19th c.

Modern Korean 20th c.

-l -n -i -m -ki -ti -ci -kes Figure 1. Diachronic development of Korean nominalizers

nominalizer whose final products survive only in fossilized forms in Modern Korean (see Sections 3.2 and 4.1 below for further discussion). The rise and fall of the Korean nominalizers just described is diagrammatically presented in Figure 1.

3. The grammaticalization of Korean nominalizers 3.1 Sources Nominalizers in Korean are old grams developed from diverse sources, and the origins of some of them remain unidentified. For example, the early nominalizers -l and -n, now defunct, do not have identifiable sources. As for the origin of -i, there has been considerable controversy. Some scholars, notably Kim (1978), Ryu (1990) and Kang (1993), among others, have advanced the claim that the nominalizer -i was derived from the proximal demonstrative i, with obvious relation to the demonstrative and personal pronoun i, which indexes either a thing or a person. It has also been suggested that this is the source of the nominative case marker -i though distinct from tense markers, are correlated to past, future and present, respectively. See Section 4.4 for further discussion.

242 Seongha Rhee

and of the copula i-. If we accept the assumptions that grammar has a foundation in cognition and that the development of grammatical functions is motivated by conceptual change (Heine 1997), the hypothetically shared origin of these grammatical markers raises the interesting implication that the pronoun, the demonstrative, the nominalizer, the nominative case marker and the copula all have some conceptual relatedness by virtue of sharing a single source, and thus there should be supporting attestations across languages. We shall return to this issue in Section 3.3 below. Another source, which accounts for the largest number of nominalizers, is the lexeme signifying ‘thing’ or ‘place.’ The nominalizers, -ki, -ti, -ci and kes all have this source meaning, the first three even sharing the source lexeme itself (i.e. tA ‘place’). One peculiarity associated with the source lexeme, in terms of its grammatical category, is found in the nominalizer -m. Hong (1957) and Kang (1993) suggest the verb sAm- ‘regard, deem, do, make’ as its source, which supposedly underwent a series of phonologically reductive changes: [sAm > zAm > Am > -m] (Kang 1993: 49, 65).2 However, recruiting a verb for the development of a nominalizer seems unusual due to its lack of nominal characteristics.3

3.2 Uses of nominalizers The nominalizers -l and -n, though thought to have been actively used in Old Korean (Lee 1974; Hong 1983a), occur very infrequently in Middle Korean literature and completely disappear in Modern Korean. Examples (1a–b) illustrate the use of -l and -n in their nominalizing function as attested in Middle Korean.4

(1) Middle Korean a. olh-i taA-l ep-kenAl this.year-nom exhaust-nmlz not.exist-as ‘as this year doesn’t (seem to) end’ (1577, Palsimswuhayngcang 36)

2. As Marilyn Plumlee points out (personal communication), the change from /s/ to /z/ would imply a fortification rather than a reduction and, therefore, the reductive process should be considered as having involved some fluctuation. Likewise, the deletion of /z/ without intermediate stages may seem unnatural. The Old and Middle Korean phoneme /z/, which disappeared completely in Modern Korean, in fact survives either as /s/ or /w/ in Present-day Korean. 3. Verbs, however, seem to serve indirectly as a source of nominalizers. One of the reviewers points out that verbs have been noted as sources of deictics, which, in turn, become sources of nominalizers. 4. The Extended Yale Romanization (Rhee 1996) was used for transliteration of the Korean data.

On the rise and fall of Korean nominalizers 243

b. alph-Ay hehA-sya-n tai hA-syosye. front-at allow-hon-nmlz according.to do-hon.opt.end ‘Please do (it) according to the previous permission.’ (1463, Pephwakyeng Enhay 2: 138) In (1a), the nominalized clausal subject taAl ‘exhausting, reaching the end’ of an embedded causative sentence occurs without a nominative case marker, an omission which is common in Korean. Since the main verb in this clause is unambiguously ep- ‘not exist,’ the nominal status of taAl is unequivocal and so is the nominalizer status of -l. Likewise, in (1b) the necessity of a nominal host for the prolative case marker tai ‘according to’ leads to the interpretation of hehAsyan as a nominal and, consequently, to the interpretation of -n as a nominalizer.5 The use of -i for nominalizing verbs, adjectives, nouns and even onomatopoeic words was among the most productive derivational processes in Middle Korean (cf. Section 4.1 below). Its application was extensive, used for varied functions. One of these was to designate individuated entities, such as the nomenclature of animals and persons named after the use of their typical sounds or their characteristic shapes, as shown in (2), where the onomatopoeic word is combined with a nominalizer. Other functions of nominalizer -i were to derive eventive nominals, as shown in (3a), or propositional nominals, as in (3b) below.

(2) a. b. c.

Middle Korean kulyek-i ‘wild goose’ (lit. ‘one that makes the sound of ‘kulyek’’) Middle Korean pwuheng-i ‘owl’ (lit. ‘one that makes the sound of ‘pwuheng’’) Modern Korean ttwungttwung-i ‘a fat person’ (lit. ‘one that looks ‘ttwungttwung’’)

(3) Middle Korean a. seng ssa-ø sal-i-lAl sicakhA-nila. castle build-nfin live-nmlz-acc begin-end ‘He built the castle and began a life (there).’ (1458, Welinsekpo 1: 44) b. wuli pap mek-i mAch-atun we meal eat-nmlz finish-if ‘when we finish eating’ (1517, Penyek Nokeltay 1: 55)

5. The prolative marker -tai also originated from tA ‘place’ and, therefore, the nominalizer -n could have been (re)analysed as an adnominalizer. The categorical fluctuation between these two grammatical functions may have been affected by context-induced reinterpretation in such situations (Heine et al. 1991).

244 Seongha Rhee

The nominalizer -m, in turn, was widely used in Old Korean (Hong 1983a), and was in fact the most common nominalizer in this period as attested in Hyangka poetry; cf. example (4) below.

(4) Old Korean ka-n pom kuli-m-ay pass-adnz spring miss-nmlz-at ‘because of missing the past spring’/‘because things miss the spring that went by’ (692–702 A D, Mocwukcilangka)

In Modern Korean -m is still productively used. There has been much research concerning the meanings of the nominals derived from this process, especially in contrast with its competitor -ki. The results, however, are contradictory (cf., among others, Choe 1961; Jang 1966; Im 1974; Chae 1979; Shim 1980; Kwon 1981; Hong 1983b; Song 1992; K. Kim 1996). Hong’s (1983b) generalization, for example, is that the -m-derived nominals tend to highlight the temporary and concrete nature of an event; cf. the examples in (5) below.

(5) Modern Korean a. ka- ‘go’ > kam ‘going’ b. po- ‘see’ > pom ‘seeing’ c. aphu- ‘be painful’ > aphum ‘pain’ d. palk- ‘be bright’ > palkum ‘brightness’

As for the development of the -ti class of nominalizers (i.e. -ti, -ki and -ci), there are contradictory claims as to the order of emergence between -ti and -ki (cf. Lee 1991 vs. Hong 1983a). Scholars generally agree, however, that -ci is definitely a later development. The difficulty in establishing the order is attributable to the fact that both -ti and -ki are attested in Old Korean, albeit infrequently. By Middle Korean they become more widely used and there are many instances in parallel texts which indicate that they are interchangeable in this period (Shim 1990). However, the use of -ki became increasingly frequent from Middle Korean, driving out its rival -ti and even infringing on the domains occupied by its older and more widelyused competitor -m. With -ki gaining power, -ti gradually changed to -ci through a then widespread phonological change of palatalization, which eventually resulted in the total disappearance of -ti by Early Modern Korean times. An interesting aspect of this change is that, with the disappearance of -ti, -ki came into competition with -ci, which was relegated to a peripheral role, that of nominalization in negative sentences (cf. Section 4.3 below). In Modern Korean -ki is still a very productive nominalizer. Illustrative examples of the nominalizers -ki, -ti and -ci as used in early texts are given in (6a–c).

On the rise and fall of Korean nominalizers 245

(6) a. Middle Korean cyemun nah-ay kulsu-ki-wa kal psu-ki-wa pAyho-ni young age-at write-nmlz-and sword use-nmlz-and learn-as ‘as I learned writing and fencing at a young age’ (1481, Twusienhay I.7: 15) b. Middle Korean pephuy-lo kyecip sam-ti hA-si-nila. religious.joy-with wife regard-nmlz do-hon-end ‘He regarded his religious joy as his wife [i.e. thus kept celibacy].’ (1575, Wenkakkyeng 3: 77) c. Early Modern Korean tut-ci kAcang saylop-tota. hear-nmlz very be.new-exclam ‘It is truly a new thing to hear.’/‘To hear it is very refreshing.’ (1632, Twusienhay II.17: 16)

Finally, the grammaticalization of kes has received much attention (cf. notably Yoon 1995; E. Kim 1996; Sohn 1997; Park 1999). Despite its obvious lexical status as a noun, it lacked syntactic autonomy (Wang 1988) and semantic specificity as far back as the earliest extant data, a fact suggesting that it had considerably lost its syntagmatic variability (Lehmann 1995) and that its meaning had been generalized as a result. In fact, Middle Korean kes is generally classified as a member of a class of defective nouns, which includes a large number of substantives which could not appear by themselves and had to occur obligatorily with a modifier. These defective nouns had bleached meanings, such as ‘time,’ ‘place,’ ‘person,’ ‘thing,’ ‘cause,’ and so on, but kes gradually expanded its context of use and replaced other defective nouns (Wang 1988; Jung 1991; E. Kim 1996). Among the earliest examples of kes are those given under (7a) and (7b) below.

(7) Middle Korean a. mul-ey-s kes-i-mye mwuth-uy-s kes-i-mye water-at-gen thing-cop-and land-at-gen thing-cop-and ‘whether it be a thing in the water or a thing on the land [i.e. fish or animal]’ (1459, Welinsekpo 1: 11) b. thayca-s-pep-un kecusmal-Al ani hA-si-non kes-i-ni prince-gen-rule-top lie-acc not do-hon-adnz thing-cop-as ‘since the prince’s rule states that princes should not tell a lie’ (1447, Sekposangcel 6: 25)

246 Seongha Rhee

3.3 Semantic change Given that grammaticalization is a process whereby the grammaticality of a linguistic form increases and that grammaticality involves both meaning and function, semantic change constitutes an important concomitant of grammaticalization and, thus, it is sometimes even equated with grammaticalization. However, as Traugott (personal communication) points out, despite strong correlation, equating semantic change with grammaticalization is erroneous, since there are numerous instances of semantic change which do not involve grammaticalization. In addition, distinguishing between lexical semantic change and grammatical semantic change is useful, the latter being more regular than the former (cf. Lessau 1994: 756). Since it is hypothesized that grammaticalization is actuated by semantic changes (cf. Traugott and Dasher 2002: 283; see also Fleischman 1982; Bybee et al. 1994; Hopper and Traugott 2003), a discussion of the semantic changes occurring in the course of the development of the Korean nominalizers from their known sources is called for. The first semantic and functional change to be considered is that associated with the development of the nominalizer -i, whose emergence, as mentioned in Section 3.1, involved a demonstrative which was also the source of the demonstrative and personal pronoun i, the nominative case marker -i and the copula i-. Let us examine the relatedness of these forms in their semantic and functional aspects from a crosslinguistic viewpoint. The relation between demonstrative and pronoun seems to have a strong conceptual motivation. In its most primitive form, probably accompanied by a gestural sign, a demonstrative functions as a device to index an entity in the real world. A pronoun, in turn, is a more grammaticalized use of the demonstrative to index an entity according to the paradigmatic organization of means of reference. This is an instance of semantico-pragmatic change of a linguistic form from the de re domain to the de dicto domain. As a matter of fact, it is common for a proximal demonstrative ‘this’ and a proximal pronoun signifying ‘this person, this thing’ to share a source, as evidenced by the demonstrative-pronoun relations in English this, French il from the demonstrative ille in Latin, Pharaonic Egyptian pw (Gardiner 1957), Lezgian a/am (Haspelmath 1993), Turkish o (Lewis 1985) and Early Eastern Australian Pidgin English dat (Baker 1995), as discussed in Traugott (1980), Casad (1984), Givón (1984), Heine and Reh (1984), Diessel (1999), Heine and Kuteva (2002) and numerous others. The relation between nominalizer and pronoun is also conceptually motivated in that both are devices which enable speakers to refer to an entity. Horie (1998) and Yap et al. (2004), focussing on the fact that Japanese -no as a sentential nominalizer involves events or propositions, suggest that the pronoun involves

On the rise and fall of Korean nominalizers 247

reference to first-order entities (i.e. things), whereas the sentential nominalizer involves reference to second-order and third-order entities (i.e. events and propositions, respectively). This relation is also evident in Korean. Thus, for instance, ku kes (literally ‘that thing’), involving the nominalizer kes, is the source of the nonhuman third person pronoun kukes. The change is so minimal, and apparent only in the orthographic representation where deletion of the intervening space between the two morphemes occurs, that some Koreans still write it as two separate words, obviously due to the formal transparency of its phrasal origin in the source construction. The evolution of the nominative marker -i from the demonstrative is also very interesting. In Hyangka, an Old Korean literary genre written with Chinese characters, the selection of the Chinese characters for content items was based on their meaning, whereas for functional items, such as case particles and endings, it was based on either their sound or their meaning. In this genre, the nominative marker was often indicated by a Chinese character si, meaning ‘this,’ or by i, which has the phonetic value [i]. Therefore, it is reasonable to hypothesize that a reanalysis of the type illustrated in the change from (8a) to (8b) below was involved in the genesis of the nominative marker, which is immediately reminiscent of the development of the subject from the topic (cf. Li and Thompson 1976).

(8) Middle Korean a. seycon i sangtwusan-ay ka-si-a […] Buddha this Mt. Sangdu-to go-hon-and ‘Buddha, this went to Mt. Sangdu and […]’ b. seycon-i sangtwusan-ay ka-si-a […] Buddha-nom Mt. Sangdu-to go-hon-and ‘Buddha went to Mt. Sangdu and […]’ (1447, Sekposangcel 6: 1)

From the viewpoint of reanalysis, through which semantic and functional change comes about, a similar account can explain the development of the copula i-, the only difference being that the former pronoun and demonstrative follows the complement, preceding the sentential end-marker, thus lending itself to a reanalysis as the marker of a predicate. The development of a copula from a demonstrative following a topic is also attested in ancient Egyptian (cf. Gardiner 1957 and Koelle 1968, as discussed in Lehmann 1995). From a semantic point of view, it was indicated in Section 3.1 that most nominalizers developed from nouns denoting ‘thing’ or ‘place,’ which suggests that, despite their significant level of abstraction in current use, their relation to the grammatical function is intuitively transparent. This direct relationship between the source semantics and the resultant grammatical function lends support to the source determination hypothesis, which states that the actual meaning of the

248 Seongha Rhee

grammaticalizing construction uniquely determines the grammaticalization path and the resulting grammatical meaning (cf. Bybee et al. 1994: 9). The nominalizers -ti, -ki, -ci and kes illustrate this, the first three sharing a common source ti ‘thing,’ further traceable to tA ‘place’ (Hong 1983a). As for the source of kes, Hong (1983a) suggests that it developed from kes/kas/ kach, which formerly meant ‘thing, skin, surface.’ In this regard, the form encoding the concept of ‘skin, surface’ of a thing extended its use to refer to the thing itself, a case of synecdoche. Incidentally, Modern Korean has a form keth meaning ‘skin, fur, surface, appearance,’ whose pronunciation is identical with that of kes in many phonological environments. The semantic change from ‘thing’ to the hardto-define grammatical meaning of nominalizer did not occur in one fell swoop, but involved many intermediate changes (cf. Park 1999). What we witness first is the change from ‘thing’ primarily referring to entities which are tangible, highly individuated and often generic, as in (9a) and (9b), to any conceptual entities which are intangible and abstract, as in (9c), where kes shows phonological reduction into ke.

(9) Modern Korean a. mul-ey-s kes water-at-gen thing ‘thing(s) in the water/fish’ b. mek-ul kes eat-prosp.adnz thing ‘thing(s) to eat/food’ c. insayng-i-la-nun ke-y cham teseps-ta. life-be-comp-adnz thing-nom truly be.ephemeral-decl ‘The thing which is called “life” is truly ephemeral.’/ ‘Life is truly ephemeral.’

Further semantic generalization occurs in order to include events, states or activities. It is at this stage that kes can be used to nominalize propositional content by virtue of its ability to refer to conceptually complex entities, thus acquiring the status of a full-fledged nominalizer, as illustrated in the following contemporary example: (10) Modern Korean ku-ka ecey cwuk-ess-ta-nun kes-i-pnita. he-nom yesterday die-pst-comp-adnz thing-cop-end ‘The thing is that he died yesterday.’/‘He died yesterday.’ The semantic changes associated with kes, however, do not stop here. It further loses whatever little semantic content it still retains, to refer to not only almost any-

On the rise and fall of Korean nominalizers 249

thing, be it an entity, event or proposition, but also something which cannot be conceptually identified. In other words, the semantic content of kes is completely lost, as is shown in the examples under (11) below, where even native speakers cannot identify the referent of kes and are unable to distinguish the propositional meaning of (11a) and (11b) from that of their counterparts without nominalization, (11a’) and (11b’), except that the former sound more formal and carry a sense of assertiveness on the part of the speaker. (11) Modern Korean a. palo kuttay cenhwa-ka o–n kes-i-ta. right then phone-nom come-ant.adnz thing-cop-decl ‘Right then the phone rang.’ b. loma-nun kulehkey myelmangha-n kes-i-ta. Rome-top so perish-ant.adnz thing-cop-decl ‘Rome fell like that.’ a′. palo kuttay cenhwa-ka o-ass-ta. right then phone-nom come-pst-decl ‘Right then the phone rang.’ b′. loma-nun kulehkey myelmangha-ess-ta. Rome-top so perish-pst-decl ‘Rome fell like that.’ The series of semantic changes of kes can be characterized as semantic generalization through metaphor, since its referential domain was expanded from a concrete to an abstract domain. However, as we have seen in the last stage of semantic generalization, the semantic content of kes is completely lost and only its function survives. It may be for this reason that the nominalizer kes has been subject to massive phonological reduction, often resulting in [k]. In other words, kes seems to be unable to resist reductive changes due to the loss of its semantic content.6 On the other hand, the concomitant phonological reduction renders its lexical source opaque and promotes the development of other epistemic functions, marking the speaker’s attitudes toward the proposition. We shall return to this issue in Section 4.4 below. The semantic change undergone by the nominalizer -m, whose alleged origin is the verb sAm- ‘regard, consider’ (cf. Kang 1993; also Section 3.1 above), shows an interesting aspect of human conceptualization. The nominalizer -m can func6. There are grammatical markers which once involved kes in their source constructions, from which kes disappeared due to extensive reduction. For instance, the topic marker -lan developed from -lako hanun kesun, which literally means ‘as for the thing that people call x.’ See Rhee (2004b) for a discussion of extensive paradigmatic change in grammar due to this type of extreme formal reduction.

250 Seongha Rhee

tion with a wide variety of syntactically complex forms, but when it is used with a simple verb, the derived nominals often refer to concrete entities, as shown in the examples listed in (12). (12) Modern Korean a. cwi- ‘grasp’ > cwum ‘fist’ b. ssawu- ‘struggle’ > ssawum ‘fight’ c. el- ‘freeze’ > elum ‘ice’ d. kuli- ‘draw’ > kulim ‘picture’ These examples show that the verb–noun relationship mediated by the nominalizer -m is construed as one where the effected entity (the noun) is regarded as integral to its effecting event (the verb). Thus, for example, in (12a) above, the noun ‘fist’ is an effected entity of the action of ‘grasping.’

4. Specialization of nominalizers in Modern Korean We have seen in the preceding sections that several nominalizers have existed in the history of the Korean language, each experiencing its own rise and fall. Most of them have survived up to Modern Korean, at least in form, therefore constituting an extreme form of layering (Hopper 1991; Hopper and Traugott 2003); in other words, the old forms and the new forms exist side by side carrying the same or similar functions. Multiple forms coexist and compete for primacy of use in the function concerned, leading to a situation where a small number of forms are predominantly used, whereas all others are relegated to secondary options, usually chosen for genre or register-specific sub-functions. This is exactly the state of affairs in Korean nominalization. As mentioned in Section 2 above, the nominalizer kes is most frequently used in Modern Korean. All other forms are less common and less productive than kes, and each of them has its own specialization, resulting in meaning differences or different levels of illocutionary force when used as sentential ending functions. We now turn to a discussion of differences among the nominalizers available in the language in Present-day Korean.

4.1 Differential levels of abstraction The nominalizer -i has the sole function of naming entities derived from onomatopoeic words (see Section 3.2 above). A large lexical set of names of fish, insects and birds has been formed through this process. Moreover, in child language many land-animals and inanimate entities have such names. Some examples of this phenomenon in child language are given in (13).

On the rise and fall of Korean nominalizers 251

(13) Modern Korean ttallang-i ‘rattle’ (lit. ‘one that makes the sound “ttallang”’) kkwulkkwul-i ‘pig’ (lit. ‘one that makes the sound “kkwulkkwul”’) ppangppang-i ‘car’ (lit. ‘one that makes the sound “ppangppang”’) For verbs and adjectives, the nominalizers show distinct patterns of nominalization in order to derive names of concrete or abstract entities. Their division of labour in Modern Korean is such that, of the three competing nominalizers, -i, -m and -ki, only one is allowed for each word in the process of deriving a fully-fledged noun, whereas either one or both of the other two are allowed for gerundivalnominalization. This is illustrated in Table 1, which shows that -i is the least productive and is never utilized to derive a gerund. Table 1 further shows that -m and -ki are both very productive in deriving fully-fledged nouns and gerunds. Considering how productive -i is reported to have been in lexicalization until Middle Korean, when it was applicable to verbs, adjectives, nouns, onomatopoeia, personal names, etc. (cf. Song 1992; Section 3.2 above), we conclude that its function has considerably weakened over time. Table 1. Entity-name derivation by nominalizers in Modern Korean STEM ket- ‘walk’ mwut- ‘ask’ wus- ‘laugh’ mek- ‘eat’ pel- ‘earn’ phwul- ‘solve’ talli- ‘run’ tenci- ‘throw’ teha- ‘add’ twulyep- ‘be afraid’ kippu- ‘be glad’ sulphu- ‘be sad’ kwut- ‘be hard’ kwulk- ‘be thick’ sey- ‘be strong’ noph- ‘be high’ kil- ‘be long’ kiph- ‘be deep’

-i

meki ‘food’ peli ‘income’ phwuli ‘solution’

nophi ‘height’ kili ‘length’ kiphi ‘depth’

-m

-ki

kelum ‘pace’ mwulum ‘question’ wusum ‘laughter’ Gerund Gerund Gerund Gerund Gerund Gerund twulyewum ‘fear’ kippum ‘joy’ sulphum ‘sorrow’ Gerund Gerund Gerund Gerund Gerund Gerund

Gerund Gerund Gerund Gerund Gerund Gerund talliki ‘running’ tenciki ‘throwing’ tehaki ‘addition’ Gerund Gerund Gerund kwutki ‘hardness’ kwulkki ‘thickness’ seyki ‘strength’ Gerund Gerund Gerund

On the other hand, the nominalizer kes exhibits an interesting pattern of behaviour. In terms of its categorical status, kes is still a noun and is syntactically treated as such. Therefore, it requires an adnominal form preceding it, examples of which are -l for prospective, -nun for simultaneous or -n for anterior with respect

252 Seongha Rhee

to reference time,7 as shown in (14), where kes shows ambiguity between a nounreading and a nominalizer-reading. (14) Modern Korean a. ku-ka mek-ul kes he-nom eat-prosp.adnz thing/nmlz ‘thing that he will eat’/‘that he will eat’ b. ku-ka mek-nun kes he-nom eat-sim.adnz thing/nmlz ‘thing that he eats’/‘that he eats’ c. ku-ka mek-un kes he-nom eat-ant.adnz thing/nmlz ‘thing that he ate’/‘that he ate’ In view of the fact that -i, though unproductive, largely participates in lexicalization, and that kes shows the highest level of productivity, even turning a propositional sentence into a nominal, the level of conceptual abstraction is lowest with the -i-nominalization and highest with the kes-nominalization. As for the relative degrees of abstraction with -m and -ki, there seems to be some variation. However, the examples in (15) below suggest that the -m-nominalization has a relatively lower degree of abstraction than the -ki-nominalization, in the sense that the meaning of the latter tends to be more procedural than that of the former.8 (15) Modern Korean sal- ‘live’ salm ‘life’ salki ‘living’ sanun kes ‘to live’

cwuk- ‘die’ cwukum ‘death’ cwukki ‘dying’ cwuknun kes ‘to die’

wus- ‘laugh’ wusum ‘laughter’ wuski ‘laughing’ wusnun kes ‘to laugh’

4.2 Differential levels of morphosyntactic bondedness Differing specialization among the nominalizers entails differences in syntagma. The three nominalizers at issue have considerable freedom in selecting syntactic constituents as objects for nominalizing processes. For instance, -ki and -m can 7. Note, however, that these adnominal markers may accompany an epenthetic vowel /u/. 8. The notion of the relative degree of abstraction is a tricky issue, and the situation here may be more relevant to quantification, as pointed out by Yap (personal communication). The two nominalizers indeed show different behaviour as to quantification; for example, the quantifier is more compatible with -m-nominalization than with -ki-nominalization. This difference itself seems to be due to different levels of abstraction of reified events and of perception of ‘nouniness’ of the resultant nominals.

On the rise and fall of Korean nominalizers 253

take finite clauses marked with the past tense for nominalization, which, in the case of kes, is done by selecting an anterior adnominalizer, as shown in (16). (16) Modern Korean ku-ka hakkyo-ey {ka-ss-{ki, m}, ka-n-kes} he-nom school-to {go-pst-{nmlz} go-ant.adnz-nmlz} ‘that he went to school’ However, despite their apparent freedom, there are differences between them regarding morphosyntactic bondedness. For example, the three competing nominalizers show different levels of acceptability depending on the morphosyntactic complexity of the nominalized constituents. The examples in (17) below show that -ki is the least compatible and kes is the most compatible with clausal nominalization, with -m occupying a middle level of compatibility. (17) Modern Korean a. ku-ka phathi-ey ka-{*ki, ?m, √nun kes}-ul al-a? he-nom party-to go-{nmlz}-acc know-q.end ‘Do you know about his going to the party?’ b. ku-ka keki-ey ka-{*ki, ?m, √nun kes}-un isangha-ta. he-nom there-at go-{nomz}-top be.strange-decl ‘His going there is strange.’

4.3 Affirmative vs. negative specialization One of the interesting phenomena in the division of labour among nominalizers is that -ki and -ci, both originating from the same lexical source (cf. Lee 1991; Kang 1993; also Section 3.1), exhibit specialization in terms of the affirmative vs. negative distinction of the proposition being nominalized. In their historical forms -ki and -ti, these two formatives were interchangeable as nominalizer/complementizer until Middle Korean (cf. Song 1973, 1977), when the specialization gradually took place, with -ti increasingly occurring with syntactic negation constructions, as in (18b), or lexical negation constructions, as in (18c).9 (18) Modern Korean a. na-nun hakkyo-ey ka-ki caymiiss-ta. I-top school-to go-nmlz be.interesting-decl ‘I like going to school.’/‘For me it is fun to go to school.’ b. ku-nun hakkyo-ey ka-ci anh-ass-ta. he-top school-to go-nmlz be.not-pst-decl ‘He didn’t go to school.’ 9. In both (18b) and (18c) -ci is the palatalized modern counterpart of -ti.

254 Seongha Rhee

c. ku-ka hakkyo-ey ka-l-ci molu-n-ta. he-nom school-to go-prosp-nmlz not.know-pres-decl ‘I don’t know if he will go to school.’ Rhee (2004a) suggests that -ci began as a simple nominalizer and then extended its function to a complementizer in Modern Korean. The implication of this extension is that the nominalizer -ki, which was formerly unable to take future-marked finite clauses for nominalization, can now take any type of finite clause as complement with the aid of the anterior adnominalizer -n, the simultaneous adnominalizer -nun and the prospective adnominalizer -l.

4.4 Adnominalizer vs. nominalizer specialization In an exploration of the paths of development of nominalizers in the history of Korean, the most intriguing phenomenon is the one displayed by the former nominalizers -l and -n. Despite their use in the nominalizer function in Middle Korean, they are used exclusively as adnominalizers in Modern Korean, -l being a prospective adnominalizer and -n functioning as an anterior adnominalizer. This functional change from nominalizer to adnominalizer is interesting in that the two functions, though obviously related,10 belong to two distinct grammatical categories, one an adjective (i.e. modifier), the other a noun (i.e. modified). Some of the early examples of -l and -n nominalization are given in (19a–b). (19) a. Middle Korean nolay-lAl nooy-ya sulphu-l–s eps-i song-acc repeat-and be.sad-nmlz-gen not.exist-advr pulu-nAni […] sing-and ‘repeat the song and sing it without being sad [sadness] and […]’ (1481, Twusienhay I.25: 53) b. Early Modern Korean tek-i-ye pok-i-la ho-n-Al naA-la virtue-cop-and happiness-cop-comp do-nmlz-acc advance-purp o-soita. come-hon.hort ‘Please come to offer the gifts named ‘Virtue’ and ‘Happiness’.’ (1610, Tongtong; Akhakkweypem II 5: 8) 10. Cf. Nishi (2005) and Yap and Matthews (this volume) for Japanese -no; Shin (2005) for Mandarin de; Noonan (1997, this volume) for Bodic languages; DeLancey (1986), Noonan (1997) and references therein for Tibeto-Burman.

On the rise and fall of Korean nominalizers 255

In the above examples, -l and -n are unequivocally nominalizers by virtue of their hosting a case marker (genitive and accusative, respectively). What is interesting is that in Middle Korean there were numerous adnominalizers which depended on such modal notions as indicative, subjunctive and purposive as well as tense– aspect distinctions (cf. Lee 1992: 276–282). These adnominalizers invariably contain -n for past or anterior and -l for future, prospective or undetermined tense. This implies that the Modern Korean adnominalizers date back to Middle Korean, and that Middle Korean adnominalizers had nominalizer functions (Lee 1967; Kim 1975). An examination of the attestations of Middle and Early Modern Korean data where -l and -n carry the nominalizer function, as (19a–b) above, reveals that they indeed carry the semantic properties with respect to these tense–aspect distinctions.11 In view of the fact that -l and -n were among the oldest nominalizers in Korean, as in other typologically related languages, it can reasonably be hypothesized that they were nominalizers with tense–aspect distinctions from the beginning, which, with the other competing nominalizers emerging as a new layer in the nominalizer domain, diverted their paths onto the adnominalizers. The ultimate specialization of these former nominalizers was, therefore, semantically motivated under the pressure of their emerging competitors which had freedom with respect to tense–aspect distinctions. There has been some controversy as to the developmental direction between adnominalizers in relative clauses and nominalizers (cf. DeLancey 1989; Genetti 1992, 1994; Noonan 1997; Hennesy and Givón 2002, among others). The direction of the development between these two functions in -l and -n in Korean is by no means conclusive because of the lack of extensive data showing their original functions prior to Middle Korean. For the same reason, neither can we determine whether the development is related to the availability of appositive interpretation of relative clauses, as suggested by DeLancey (1986).12 11. Traugott (personal communication) points out that nominalizers are not often tense-aspect markers. However, some recent work on Guarani and aboriginal languages suggests that nominals can also be marked for tense or aspect. Modern Korean nominalizers allow co-occurrence of tense-aspect markers, though they do not mark tense and aspect by themselves, in clear contrast with the Old and Middle Korean nominalizers -l and -n. As suggested by Traugott, this issue merits further research. 12. Yap (personal communication) points out that the adnominalization function typically involves embedding, which is a more conservative context (Bybee et al. 1994), and may be exposed to a lesser degree of reductive processes of grammaticalization. It is thus possible to hypothesize that there has been no directional change and that the adnominalizer function had been the original function of -l and -n which persisted over a long time. What we know from the limited data available is that both -l and -n were formerly fully-fledged nominalizers and are now exclusively adnominalizers.

256 Seongha Rhee

4.5 Sentential end-marking functions It is common for linguistic forms to acquire new functions in related domains in the course of grammaticalization. In the grammaticalization of Korean nominalizers, there arose other diverse functions which are intricately interrelated, those of complementizer, adverbializer, clausal connector and sentential end-marker, whose boundaries are often not discrete (cf. Rhee 2004a). In the process of the emergence of these functions, different semantic, syntactic and pragmatic mechanisms operate, largely triggered by the functional and conceptual similarities among them. Of special importance here is their sentential end-marking function with special illocutionary and modal forces. The nominalizer -m can be used to mark the end of a sentence, a function almost restricted to posted prohibitive orders, either by explicitly expressing prohibition, as in (20a), by far the most common structure, or by presenting a factual statement which induces prohibitive interpretation, as in (20b). (20) Modern Korean a. cinipha-ci mos ha-m. enter-comp neg do-end ‘Do not enter.’/‘No entering.’ b. ilpangthonghayng-i–m. one.way.thoroughfare-cop-end ‘This road is one-way.’ On the other hand, -ki rarely functions as a sentential end-marker, its use being restricted to those oral contexts where the suggestions are not assertive but only suggestive. This is often used by children in casual play situations, as shown in (21). (21) nolli-ki eps-ki ridicule-nmlz not.exist-end ‘Don’t ridicule me!’/‘No ridiculing.’ The lack of assertive force of -ki-ending sentences contrasts starkly with its derived form -ci in its sentential ending function. While -ci in its nominalizer/complementizer function is restricted to negative contexts in embedded clauses (cf. Section 4.3 above), in its sentential end-marking function it has no restriction as to negative vs. positive distinctions, as shown in the following examples. (22) Modern Korean a. nay mal an tut-ko mos payki-ci. my word neg listen-and neg bear-end ‘You cannot endure without following my instruction.’/‘You cannot help obeying me.’

On the rise and fall of Korean nominalizers 257

b. c.

ku salam cham coh-un salam-i-ci the person very good-adnz person-cop-end ‘He is a nice person indeed!’ ese cip-ey ka-ci quick house-to go-end ‘Why don’t you go home right away!’

The sentential end-marker -ci in the above examples encodes the speaker’s emotion, belief or determinative attitude to the proposition (cf. Rhee 2004a).13 This semantic characteristic seems to have been inherited from its complementizer function. However, there has been a long drawn-out controversy as to the semantic features of -ci, often leading to contradictory characterizations of the marker. For example, Martin (1992: 453) labels the sentence end-marker -ci as suspective, thus claiming absence of assertive force by virtue of its being low on the scale of epistemic certainty. By contrast, Jang (1973), Ko (1976), Suh (1984) and many others, while agreeing that it encodes supposition, maintain that its meaning relates to promises, agreement-seeking, proposal and so on, all involving a certain degree of speaker’s conviction. The apparent lack of confidence in sentences marked with -ci seems to be attributable to the use of the suppositive morpheme -keyss which often co-occurs with it. Of particular importance to the development of the sentential end-marking function among the nominalizers is that of kes, which, as seen in Section 2 above, is one of the most versatile morphemes in Modern Korean, forming diverse grammatical markers, with varying degrees of morphosyntactic transparency. Since it is still a noun in terms of grammatical categorization (cf. Sections 3.2 and 4.1), it requires all the morphosyntactic trappings needed by regular nouns in order to function properly. Thus, when it develops into a sentential end-marker, it recruits a copula and a sentence-type indicator. Since the construction in which kes participates needs the sentence-final morphemes, the construction is, strictly speaking, a constellation of penultimate sentence-final morphemes rather than the sentence end-marker per se. One of the most commonly used grammatical markers involving the nominalizer kes is -lkesi-, whose function is to serve as the marker of the future tense. In turn, the sentential end-markers -lkei-, -nkei- and -nunkei-, in which kes occurs in the eroded form ke, express the speaker’s attitude and conviction about the proposition (cf. Yap et al. 2005 for Chinese; Yap and Matthews, this volume). This development seems to be attributable to the calquing of the English construction ‘the thing is that . . .’ (Choi 1994) through extensive language 13. The determinative force associated with the ending -ci is prominent in the construction x-ci y-ka anita ‘it is x, not y.’

258 Seongha Rhee

c ontact.14 In addition to the most conservative form -lkesi-, there exist diverse variant forms, such as -lkei, -kke, -kel, -key, -lkeya, -lkkeya, -lkkel and -lkkey, among others (and -nkyo, -kki, -kkelo, etc. in dialectal varieties; Kim 1990). The use of kesforms in sentential ending is illustrated in the following examples. (23) Modern Korean a. nayil-un pi-ka o-lkesi-ta. tomorrow-top rain-nom come-fut-decl ‘It will rain tomorrow.’ b. ta cwuki-epeli-lkei-a. all kill-pfv-end-end ‘I will kill you all!’ c. kuttay ku-ka o-nkei-a. then he-nom come-end-end ‘Right then he came.’ Another set of endings developed from kes is that comprising -nkel, -nunkel and -lkel, which often mark the speaker’s subjective feeling of helplessness about a past event, a current state of affairs or a future event, respectively. When a sentence with one of these endings is uttered with a slight rising intonation, it carries a protest overtone to the addressee. A closely related development is that of the sentence ending with a prospective adnominalizer. For instance, one of the functions of -lkel is to mark the speaker’s regret about a past event. The final element in these endings, -l, is an accusative marker which indicates that the sentence is an elliptical structure, and the entire construction is, in fact, an accusative-marked nominalized clause. Development of sentential endings from ellipsis is a common phenomenon in Korean (cf. Rhee 2002), and the meanings associated with the elided elements come to be associated with the meanings of the non-elided element through pragmatic inference. Therefore, the meanings of the original nonelided counterparts of the examples in (24) below are something akin to ‘What should I do with the fact that I ate all the food?’ or ‘I did not study then despite the fact that I should have studied harder.’ (24) Modern Korean a. pap-ul ta meku-nkel. food-acc all eat-end ‘I ate it all! [What can I do since there is nothing left?]’ 14. Marilyn Plumlee (personal communication) raises a question as to this calquing effect, primarily because the construction ‘the thing is that …’ in English belongs to a formal register, and thus shows low frequency of use. This issue certainly deserves further investigation.

On the rise and fall of Korean nominalizers 259

b. kuttay kongpwu-lul yelsimhi ha-lkel. then study-acc diligently do-end ‘Oh, I should have studied harder then.’ Another sentential ending which developed from kes is -lkey, which carries a promissive function, thus being restricted to first person subject, declarative sentences, as in (25a–b). (25) Modern Korean a. nay-ka sathang sa-cwu-lkey. I-nom candy buy-give-end ‘I will get you some candies.’ b. nay-ka kkok o-lkey. I-nom exactly come-end ‘I promise I will come.’ As was the case with other nominalizers (see above), kes also has a sentential ending function in the form -lkes, as illustrated in the following examples, which are commonly found in posted signs of prohibition or instructions. (26) Modern Korean a. tuleo-ci ma-lkes. enter-comp stop-end ‘Don’t enter!’ b. oynccok-ulo tol-lkes. left-to turn-end ‘Turn left.’

5. Conclusion This paper has aimed to show the different kinds of nominalizers which have been used in Korean throughout history, their origins and the way in which they coped with the cramped situation in the functional domain of nominalization. We have seen that, over the period of approximately 1,500 years for which attestations exist, there have been numerous nominalizers in use in Korean. Their origins can be traced to lexemes with such meanings as ‘place,’ ‘thing,’ ‘this’ or ‘regard.’ In the course of time, under pressure from other competitors, some nominalizers reverted to specialization in a function which had previously been only partially associated with them, as shown by the fact that the -n and -l nominalizers developed into adnominalizers. On the other hand, the nominalizer -i, succumbing to paradigmatic pressure, lost productivity and survives only in some fossilized forms.

260 Seongha Rhee

While most nominalizers still available in Modern Korean have a wide variety of uses with relative syntagmatic freedom, the nominalizers involving kes surpass all others, ranging from a nominalizing use to a sentential ending function with diverse subjectified meanings encoding the speaker’s stance. The development of kes into other grammatical domains seems to be triggered by its formal opacity due to reductive phonological changes.

Abbreviations acc adnz advr ant comp cop decl end exclam fut gen hon hort

accusative adnominalizer adverbializer anterior complementizer copula declarative sentential ending exclamative future genitive honorific hortative

neg nfin nmlz nom opt pfv prosp prs pst purp q sim top

negation non-finite nominalizer nominative optative perfective prospective present past purposive question marker simultaneous topic

References Baker, Philip. 1995. Some developmental inferences from the historical studies of pidgins and creoles. In Jacques Arends (ed.), The early stages of creolization, 1–24. Amsterdam: John Benjamins. Bybee, Joan L., William Pagliuca & Revere Perkins. 1994. The evolution of grammar: Tense, aspect and modality in the languages of the world. Chicago IL: University of Chicago Press. Casad, Eugene. 1984. Cora. In Ronald W. Langacker (ed.), Studies in Uto-Aztecan grammar. Vol. IV: Southern Uto-Aztecan grammatical sketches, 151–459. Arlington TX: Summer Institute of Linguistics and University of Texas. Chae, Wan. 1979. Myengsahwaso -ki-ey Tayhaye (On nominalizer -ki). Kwukehak (Korean Studies) 8: 95–107. Choe, Hyon-Bae. 1961. Wulimalpon (Korean Grammar). Seoul: Jungeumsa. Choi, Kyu-Il. 1994. Hankwuke Kes-uy Sayongkwa Uymi Kinung (3) (Uses and semantic functions of kes in Korean (3)). Cheju University Journal of Humanities and Social Sciences 38: 17–58. DeLancey, Scott. 1986. Relativization as nominalization in Tibetan and Newari. Paper presented at the 19th International Conference on Sino-Tibetan Languages and Linguistics. Ohio State University, Columbus, September, 1986. DeLancey, Scott. 1989. Relativization and nominalization in Tibeto-Burman. Ms. University of Oregon.

On the rise and fall of Korean nominalizers 261

Diessel, Holger. 1999. Demonstratives: Form, function and grammaticalization. Amsterdam: John Benjamins. Fleischman, Suzanne. 1982. The future in thought and language: Diachronic evidence from Romance. [Cambridge Studies in Linguistics 36]. Cambridge: CUP. Gardiner, Alan. 1957. Egyptian grammar: Being an introduction to the study of hieroglyphs. 3rd edn. Oxford: OUP. Genetti, Carol. 1992. Semantic and grammatical categories of relative clause morphology in the languages of Nepal. Studies in Language 16(2): 405–427. Genetti, Carol. 1994. A descriptive and historical account of the Dolakha Newari dialect. Tokyo: Institute for the Study of Languages and Cultures of Asia and Africa. Givón, Talmy. 1984. Syntax: A functional-typological introduction. Vol. I. Amsterdam: John Benjamins. Haspelmath, Martin. 1993. A grammar of Lezgian. Berlin: Mouton de Gruyter. Heine, Bernd. 1997. Cognitive foundations of grammar. Oxford: OUP. Heine, Bernd, Ulrike Claudi & Friederike Hünnemeyer. 1991. Grammaticalization: A conceptual framework. Chicago IL: University of Chicago Press. Heine, Bernd & Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge: CUP. Heine, Bernd & Mechthild Reh. 1984. Grammaticalization and reanalysis in African languages. Hamburg: Buske. Hennesy, Pelin & Talmy Givón. 2002. Note on the grammar of Turkish nominalizations. In Joan L. Bybee & Michael Noonan (eds.), Complex sentences in grammar and discourse: Essays in honor of Sandra A. Thompson, 125–144. Amsterdam: John Benjamins. Hong, Jongseon. 1983a. Myengsahwa Emiuy Pyenchen (Historical change of nominalizing particles). Kwukekwukmwunhak (Korean Language and Literature) 89: 31–89. Hong, Jongseon. 1983b. Myengsahwa Emi -um-kwa -ki (The nominalizing particles –um and –ki). Language 8(2): 241–272. Hong, Ki-Moon. 1957. Cosenelyeksamwunpep (Historical linguistics of Korean). Pyongyang: Kwahakwen Publishing. Hopper, Paul J. 1991. On some principles of grammaticization. In Elizabeth Closs Traugott & Bernd Heine (eds.), Approaches to grammaticalization. Vol. I, 17–35. Amsterdam: John Benjamins. Hopper, Paul J. & Elizabeth Closs Traugott. 2003. Grammaticalization. 2nd edn. Cambridge: CUP. Horie, Kaoru. 1998. On the polyfunctionality of the Japanese particle no: From the perspectives of ontology and grammaticalization. In Toshio Ohori (ed.), Studies in Japanese grammaticalization: Cognitive and discourse perspectives, 169–192. Tokyo: Kurosio Publishers. Im, Hong-Pin. 1974. Myengsahwauy Uymithuksengey Tayhaye (On semantic characteristics of nominalization). Kwukehak (Korean Studies) 2: 83–104. Jang, Seok-Jin. 1966. Some remarks on Korean nominalization. Language Research 2(1): 18–31. Jang, Seok-Jin. 1973. A generative study of discourse in Korean: On connecting sentences. Language Research 9(2): 1–145. Jung, Howan. 1991. Mayinilumssi sA-uy Hyengthayloncek Kwuseng (Morphological construction of dependent noun sA). Hankul (Korean) 211: 45–78. Kang, Eun-Kook. 1993. Cosene Cepmisauy Thongsicek Yenkwu (A diachronic study of Korean suffixes). Seoul: Sekwanghakswulcalyo Publishing.

262 Seongha Rhee

Kang, Kil-Won. 1976. Hankwukewa Thoikieuy Myengsahyengsengcepsoksauy Pikyo (A comparison of nominalizing connectives in Korean and Turkish). Chungnam National University Journal of Liberal Arts 3(2): 5–26. Kim, Eon-Ju. 1996. Kes-uy Pwunphowa Kinung (Distribution and function of kes). Wulimal Yenkwu (Korean Studies) 6: 179–216. Kim, Heung-Soo. 1975. Cwungseykwukeuy Myengsahwa Yenkwu (A study of nominalization in Middle Korean). MA thesis, Seoul National University. Kim, Ki-Bok. 1996. Myengsahwa Pomwunso -um-kwa -ki-uy Yenkwu (A study of the nominalizing complementizers -um and -ki). Emwunyenkwu (Studies in Language and Literature) 28: 19–34. Kim, Seung-Gon. 1978. Hankwuke Cosauy Thongsicek Yenkwu (A historical study of Korean particles). Seoul: Tayceykak Publishing. Kim, Tae-Yeop. 1990. Uyconmyengsa kes-uy mwunpephwawa mwunpep pyenhwa (Grammaticalization of the defective noun kes and grammatical change). Daegu Emwun Nonchong (Daegu Liberal Arts Papers) 8: 177–198. Ko, Yong-Kun. 1976. Hyentaykwukeuy Mwuncheypepey Tayhan Yenkwu (A study on the stylistics in Modern Korean). Language Research 12(2): 16–53. Koelle, Sigismund Wilhelm. 1968. Outlines of grammar of the Vei language: Together with a VeiEnglish vocabulary and an account of the discovery of the Vei mode of syllabic writing. Farnborou: Gregg International Publishers. Kwon, Jae-Il. 1981. Hyentaykwukeuy ki-myengsahwa Nayphomwun Yenkwu (A study of kinominalized subordinate sentences in Modern Korean). Hankul (Korean) 167: 73–94. Kwon, Jae-Il. 1995. 20seyki Choki Kwukeuy Myengsahwa Kwumwun Yenkwu (A study of nominalization in early 20th century Korean). Hankul (Korean) 229: 203–232. Lee, Deuk-Choon. 1991. Yeksacek Ihwahyensanguloseuy (ti>ki (t>k)) Pyenhwa (Change of (ti>ki (t>k)) as a historical dissimilation). Enemwunhak Hakswulnonmwuncip (Journal of Language and Literature), 1–25. Yenben: Yenben People’s Publishing Company. Lee, Ki-Moon. 1967. Hankwuke Hyengsengsa (A history of the development of Korean). Seoul: Korea University. Lee, Ki-Moon. 1974. Kwukesa Kaysel (An introduction to the history of Korean). Seoul: Minjungsegwan. Lee, Soong-Nyung. 1992. Cwungseykwuke Mwunpep: 15seykilul Cwulo Haye (Middle Korean grammar: With special reference to the 15th century). 2nd edn, revised. Seoul: Ulyou Munhwasa. Lehmann, Christian. 1995. Thoughts on grammaticalization: A programmatic sketch. 2nd edn. Munich: Lincom. Lessau, Donald A. 1994. A dictionary of grammaticalization. Bochum: Brockmeyer. Lewis, Geoffrey L. 1985. Turkish grammar. Oxford: OUP. Li, Charles N. & Sandra A. Thompson. 1976. Subject and topic: A new typology of language. In Charles N. Li (ed.), Subject and topic, 457–490. New York NY: Academic Press. Martin, Samuel E. 1992. A reference grammar of Korean. Tokyo: Charles E. Tuttle Company. Möllendorff, Paul Georg von. 1892. Manchu grammar with analysed texts. Shanghai: American Presbyterian Mission Press. Nishi, Yumiko. 2005. The decline of adnominals and the rise of a post-predicate pronoun and a complementizer: The case of Classical Japanese no. Ms. Cornell University.

On the rise and fall of Korean nominalizers 263

Noonan, Michael. 1997. Versatile nominalizations. In Joan Bybee, John Haiman & Sandra A. Thompson (eds.), Essays on language function and language type. In honor of T. Givón, 373–394. Amsterdam: John Benjamins. Park, Joo Young. 1999. A study of grammaticalization of kut. MA thesis, Sang Myung University, Seoul. Poppe, Nicholas. 1954. Grammar of written Mongolian. Wiesbaden: Otto Harrassowitz. Poppe, Nicholas. 1955. Introduction to Mongolian comparative studies. Helsinki: Suoma laisugrilainen Seura. Rhee, Seongha. 1996. Semantics of verbs and grammaticalization: The development in Korean from a cross-linguistic perspective. Seoul: Hankuk Publisher. Rhee, Seongha. 2002. From silence to grammar: Grammaticalization of ellipsis in Korean. Paper presented at New Reflections on Grammaticalization 2, University of Amsterdam, April 2002. Rhee, Seongha. 2004a. When stop goes further: From malta ‘stop’ to auxiliary verbs in Korean. Korean Language Research 13: 309–339. Rhee, Seongha. 2004b. On grammatical forms in the intermediate stage in grammaticalization. Paper presented at 2004 Summer Conference of Linguistic Science Society. Catholic University of Daegu, August 2004. Ryu, Ryul. 1990. Cosenmallyeksa (1) (A history of Korean (I)). Pyongyang: Sahoykwahak Publishing. Shim, Jae-Kee. 1980. Myengsahwauy Uymikinung (Semantic functions of nominalization). Language 5: 79–102. Shim, Jae-Kee. 1990. Kwuke Ehwilon (Korean lexicology). 4th edn. Seoul: Jipmoondang. Shin, Mikyong. 2005. Bridging the worlds of adnominal and pronominal: The case of Korean -n geot. Paper presented at New Reflections on Grammaticalization 3, Santiago de Compostela, July 2005. Sohn, Choon-Sup. 1997. Uyconmyengsa kes-ey Tayhan Kochal (Thought on the dependent noun kes). Hankwukenemwunhak (Korean Language and Literature) 38: 47–62. Song, Cheol-Eui. 1992. Kwukeuy Phasaynge Hyengseng Yenkwu (A study of derivation in Korean). Seoul: Thayhaksa Publishing. Song, Seok-Choong. 1973. Some negative remarks on negation in Korean. Language Research 9(2): 252–263. Song, Seok-Choong. 1977. Pwucenguy Yangsanguy Pwucencek Yangsang (A negative aspect of the negative aspect). Kwukehak (Korean Studies) 5: 45–100. Suh, Jung-Soo. 1984. Contaypep Yenkwu (A study on honorification). Seoul: Hanshin Publishing. Traugott, Elizabeth Closs. 1980. Meaning-change in the development of grammatical markers. Language Science 2: 44–61. Traugott, Elizabeth Closs & Richard B. Dasher. 2002. Regularity in semantic change. Cambridge: CUP. Wang, Moon-Yong. 1988. Kuntay Kwukeuy Uyconmyengsa Yenkwu (A study of dependent nouns in Early Modern Korean). Seoul: Hansaym Publishing. Yap, Foong Ha, Yang Gu & Hongyong Liu. 2005. From nominalizer to stance marker: A syntactic account of grammaticalization and pragmaticization phenomena. Paper presented at From Ideational to Interpersonal: Perspectives from Grammaticalization (FITIGRA) Conference, Leuven, February 2005.

264 Seongha Rhee

Yap, Foong Ha, Stephen Matthews & Kaoru Horie. 2004. From pronominalizer to pragmatic marker: Implications for unidirectionality from a crosslinguistic perspective. In Olga Fischer, Muriel Norde & Harry Perridon (eds.), Up and down the cline: The nature of grammaticalization [Typological Studies in Language 59), 137–168. Amsterdam: John Benjamins. Yoon, Kyung-Mi. 1995. Kes-uy Hyengthayloncek Thukseng Yenkwu (A study on the morphological characteristics of ‑kes). Dong-A University Kwukekwukmwunhak Nonmwuncip 14: 219–234.

chapter 10

The grammaticalization of clausal nominalizers in Burmese Andrew Simpson University of Southern California This paper is concerned with the grammaticalization of clausal nominalizers in two different but closely-related forms of Burmese, Colloquial Burmese and Literary Burmese. A contrastive overview of the morphosyntactic properties of the nominalizers thii and mii of Literary Burmese and their Colloquial Burmese counterparts te and me, together with the application of a number of tests for the identification of nominalized constructions, reveal that grammaticalization is more advanced in the colloquial language than in the literary variety: te and me have lost their nominal specifications and been reanalysed as grammatical elements of a different categorial type, instantiating verb-related mood and realis–irrealis distinctions. The comparison of the system of nominalization in the two complementary varieties of Burmese allows for insights into the evolution, spread and reinterpretation of nominalization structures within a language.

1. Introduction The crosslinguistic phenomenon of nominalization is an area of morphosyntax which has attracted growing interest in both formal and functional linguistics in recent years, as data from an increasing number of languages become available and add important insights into the grammaticalization paths and syntactic structure underlying the emergence of nominalizing elements. The present paper sets out to probe this area of morphosyntactic development further with a study focussed on Burmese, a language which is particularly rich in nominalization structures and where a highly informative picture of the results of the grammaticalization of clausal and sentential nominalizers can be found through a comparison of two different though closely-related forms of the language: Colloquial Burmese and Literary Burmese. A careful examination of synchronic patterns in Colloquial and Literary Burmese provides evidence of the source and complex structure of clausal nominalizers in the language. It also indicates that, in certain cases, earlier nominalizers have undergone regrammaticalization as functional morphemes of a different formal type, no longer instantiating nominal categories, but being reanalysed as modal and complementizer-type elements. Quite generally, the patterning found in Burmese nominalization raises questions about the assumed syn-

266 Andrew Simpson

chronic status of nominalizers in other languages, and about how the occurrence of cross-categorial reanalysis of (already) grammaticalized morphemes can effectively be detected. The structure of the paper is as follows. Section 2 first considers how nominalizers may be identified in a language and what morphosyntactic properties and sources nominalizers seem to exhibit in general. Section 3 then turns to Burmese and attempts to establish how the current, productive clausal nominalizers in Literary Burmese have undergone grammaticalization producing complex, fused forms which occur in a range of syntactic environments. This leads on to a comparison with similar patterns in Colloquial Burmese, where the conclusion is reached that nominalization in the colloquial language has progressed a significant stage further and has resulted in the reanalysis of erstwhile nominalizers as grammatical elements of a rather different clausal type. Section 4 then closes the paper with a brief summary of its findings and questions which are raised by the patterning observed.

2. Nominalizers and their common and divergent properties: A brief overview 2.1 What are nominalizers and what sources do they develop from? In the broad, developing literature concerned with nominalization, it is not uncommon to find the term nominalizer being made use of rather loosely to refer to elements whose status as nominalizers has not been particularly clearly established. As we begin this investigation it is therefore useful to attempt to clarify what kinds of morphemes should be genuinely labelled as nominalizers and how one might attempt to identify a particular grammatical element as a nominalizer rather than as some other kind of syntactic category. The term nominalizer is a purely functional label which is appropriately used to refer to all those morphemes or words which have the specific function of creating a nominal morphosyntactic form as the result of their combination with other kinds of non-nominal input. Nominal categories, and hence the presence of functional elements which may be nominalizers, can, in turn, be identified in two basic ways: (i) through the occurrence of noun-like or nominal morphological patterns and/or (ii) via syntactic privileges otherwise commonly associated with nouns and their syntactic projections. Concerning the morphological evidence for the presence of a nominal (rather than a verbal, adjectival or other) category, the potential occurrence of a range of common nominal attachments or inflections on a syntactic phrase or a combination of certain other syntactic categories with the phrase can be taken as indica-

The grammaticalization of clausal nominalizers in Burmese 267

tion that a phrase is nominal, patterning in a morphological and combinatorial way like other simple nouns and their expansions into noun-rooted phrases. The following are typical morphosyntactic indications that a syntactic constituent is nominal: a. Occurrence of case inflections on a constituent. b. Possible pluralization of the constituent. c. Possible enumeration of the constituent (i.e. combination of the constituent with numerals). d. Potential occurrence of demonstratives and adjectives with the constituent, rather than complementizers and adverbs. e. Use of case-marking strategies associated specifically with nouns in the marking of arguments of the head of the constituent (e.g. use of possessive or genitive case to mark the heads’s arguments rather than nominative or accusative case). Syntactically, a complex constituent may be identified as a nominal phrase if it shows the distribution of other simplex phrases which are clearly nominal, for example, the ability to occur in subject position or the ability to be coordinated with other clearly nominal categories. Non-nominal categories, such as verbal or adjectival phrases, are regularly excluded from such positions. However, if a verbal or adjectival phrase in combination with some additional morpheme is found to allow for occurrence in subject position or in coordination with other noun phrases, this may be taken as reasonable evidence for the nominalized status of the complex constituent and for the nominalizing function of the morpheme combined with the verb or adjective and their dependents. Because it is the outward morphosyntactic behaviour of a constituent that identifies it as nominal, and because there are various ways in which nominal morphosyntax can be exhibited (case marking, pluralization, combination with adjectives, etc.), nominalizers may in fact originate from a range of different lexical sources. Consider a formal syntactic representation of the internal structure of a referential nominal phrase, a DP, in a head-initial language with classifiers, as given in (1).

D = Determiner or Demonstrative Num = Numeral position Cl = Classifier position N = Noun

DP

(1) D

NumP Num

ClP

Cl

NP N

...

268 Andrew Simpson

If it is assumed that the occurrence of any lower syntactic head position, such as Noun, Classifier or Numeral, will necessarily result in the projection of other, higher portions of the structure in (1), and that different parts of the structure in (1) may identify a constituent as ‘nominal’ in different ways (the occurrence of numerals in Num or demonstratives in D), it is possible to see that the use of any of the head constituents (D, Num, Cl or N) in (1) might in principle be used to build up a phrase with certain overt nominal properties when combined with a nonnominal complement, thus nominalizing the latter. For example, were an element of syntactic or lexical category Noun to be combined with a verbal constituent (if complex, then a verb phrase), the result would be a structure such as (2) below. In (2), a full array of nominal properties might be expected to characterize the new nominalization, given the presence of the full set of syntactic heads and projections above the Noun position. In the representation in (2), NMLZ is intended to indicate a morpheme which has the function of a nominalizer and the original syntactic category of a noun (i.e. a nominalizer derived from a noun, perhaps via bleaching of the lexical content of the noun):

(2)

DP D

NumP Num

ClP

Cl

NP N

VP

NMLZ Alternatively, however, a verbal (or other non-nominal) constituent might allow for conversion into a category with certain nominal properties if combined with a nominalizer sourced from one of the higher head categories in (1), such as an element of type D (Determiner or Demonstrative). Such a constituent would be expected to allow for the syntactic distribution of other nominal expressions (e.g. unrestricted occurrence in subject position). However, it might not allow for the full array of properties commonly associated with noun phrases and their expansions into DPs due to the lack of lower portions of the nominal structure in (1), notably the Numeral, Classifier and Noun heads and their projections. Such a possibility is schematized in (3), where the nominalized constituent is indicated

The grammaticalization of clausal nominalizers in Burmese 269

as being possibly a verb phrase (VP), a tense phrase (TP) or a fully-clausal complementizer phrase (CP):

DP

(3) D

VP/TP/CP

NMLZ Two other intermediate possibilities are also anticipated to be available and exist, namely (i) the use of a classifier as a nominalizing element combining with a non-nominal complement, as in (4); and (ii) the use of a numeral as a nominalizer, as in (5). In both instances, the structures produced are expected to have some, but not necessarily all, typical properties of nominal projections. For example, if adjectival modification is assumed to occur via the adjunction of an adjectival phrase to a noun phrase, the absence of a noun phrase in structures (4) and (5) may be expected to correspond to a lack of adjectival modification with such nominalizations.

(4)

DP D

NumP Num

ClP

Cl VP/XP NMLZ

(5)

DP D

NumP Num VP/XP

NMLZ Interestingly, all the possible ways in which a nominalizer might theoretically be instantiated, as outlined above, do indeed seem to occur in different languages, and nominalizers grammaticalize from a variety of sources. Nominalizers ultimately derived from nouns appear to be very common in occurrence (e.g. Korean

270 Andrew Simpson

kes, as discussed in Simpson and Wu 2001), but it is also possible to identify classifier, numeral and demonstrative or determiner sources of other nominalizers too. Burmese, for example, makes use of the numeral ‘one’ (ta-) in a range of nominalizations. Thai and Bengali, in turn, show evidence of classifier use in a nominalizing function with certain clausal constituents, whereas Chinese, Japanese and Lakota can be argued to have derived nominalizers from earlier elements of type D (cf. Simpson 2001, 2003a, 2003b; Simpson and Wu 2001 for extensive discussion).

2.2 Is nominalization a lexical or a syntactic process? In addition to variation in the source category of nominalizing elements, a second important parameter of variation in the (synchronic) realization of nominalizers is whether the attachment of such elements occurs as a lexical process or is effected as part of a syntactic derivation. Certain nominalizers seem to be very clearly phrasal attachments, combined with a constituent which is a full syntactic phrase. This is the case of the sentential and clausal nominalizers found in a significant number of languages, where a full clause is converted into a nominal argument of some other predicate by the use of an appropriate nominalizer (e.g. Japanese no, Korean kes; cf. Simpson and Wu 2001; see also the papers by Horie, Rhee, and Yap and Matthews in this volume). If sentential and other phrasal nominalizers convert a full syntactic constituent into a nominal output, such nominalizers need to be considered syntactically independent words, which combine with a complement during the syntactic derivation of a sentence. In this respect, they would be similar to determiners, complementizers and other functional elements which are understood to be discrete grammatical words rather than parts of other words (i.e. bound morphemes). Other nominalizers, however, may appear to be affixes attached to word-level elements such as verbs, adjectives, and so on, rather than to verb phrases, adjective phrases, etc. Such nominalizations may have many more of the unpredictable properties of lexical processes (perhaps being restricted and unproductive and giving rise to allomorphic variation), and so be assumed to be purely morphological attachments, which combine with non-nominal roots or bases during the pre-syntactic creation of words.1 An example of such morphological attachment would be nominalizations produced with English -ant, as in servant or defendant. 1. It is also possible that certain nominalizers attached to word-level elements might be syntactic attachments, combining with a word-level constituent within a syntactic structure, perhaps via cliticization or some form of (syntactic) head-movement. Various approaches to English nominalizing -ing assume such an analysis. Lotha (Tibeto-Burman) a-would be another candidate for a similar treatment (cf. Herring 1991).

The grammaticalization of clausal nominalizers in Burmese 271

2.3 Nominalizers and verbalizers Given the very widespread nature of nominalizers in the world’s languages, even being present in languages not considered to have morphological attachments, such as Vietnamese, it is natural to wonder whether there might be equivalents to nominalizers functioning in the verbal domain, what could be called verbalizers, and, if such elements exist, what they might in turn indicate about the nature of nominalizers. In other words, if nominalizers combine with non-nominal input to create words and constituents which can be utilized as noun (phrase)-like arguments, are there also functional elements which combine with non-verbal input to create constituents which can then be embedded as verbal units? And if not, why do such elements not exist? Although there has been little discussion of such potential counterparts to nominalizers in the literature, certain reflection suggests that there are indeed verbal elements which correspond to nominalizers in their function of converting non-verbal input into a syntactically utilizable verbal form. Furthermore, it seems that such verbalizers occur both as syntactically independent words and as purely morphological attachments, in a way very similar to the occurrence of nominalizers. An example of lexical or morphological verbalization would be the application of affixes such as English -ize to adjectival input to form new verbs, e.g. grammatical — grammaticalize. Examples of syntactic verbalizers can be given in two quite common types. A first, crosslinguistically well attested type is the class of light verbs, such as (prototypically) make or do, which are used to combine with nominal input to produce verbal forms, such as Hindi kaam karna ‘work do’ = ‘to work’ or Japanese benkyoo suru ‘studying do’ = ‘to study.’ A second potential candidate for consideration as a syntactic verbalizer would be the simple occurrence of copulas, used to embed non-verbal input and create a verbal structure which allows the application of tense and aspect; e.g. John was a great help = John helped a lot. Supposing, therefore, that there are indeed functional elements in the verbal domain equivalent to nominalizers, the study of light verbs and copulas (as well as morphological verbalizers) may be used in a comparative way to further open up and inform our understanding of the patterning and morphosyntax of nominalization. A comparison of nominalizers and verbalizers also has the potential to lead to interesting typological questions and to the issue of whether there really is the full cross-categorial equivalence which one might pre-theoretically expect in such a domain, and if not, why full equivalence does not exist.2 2. For example, while it may seem that nominalizers grammaticalize from a range of functional elements commonly associated with nouns, such as demonstratives, classifiers and numerals, it is not obvious that verbalizers are derived from higher functional elements associated with

272 Andrew Simpson

2.4 Nominalization and compounding In connection with the source of the grammaticalization of nominalizers, in certain cases (at least) it may be interesting to consider the possible connection of nominalization with compounding in a language. It is quite plausible that, in various instances, nominalizers grammaticalize from nouns which are frequently used to create nominal compounds forming a connected class of items, e.g. -man in doorman, chairman or tax man. With the occurrence of bleaching of the meaning of such nouns and a concomitant extension of the way in which they can be used to combine with other words in compounding, a simple process of nominalization may well evolve. In this regard, it is intriguing to find a correspondence between the direction of headedness in compounds in certain languages and the linear position of nominalizers relative to nominalized material. For example, languages such as English, Japanese and Burmese, which have right-headed compounds, also have nominalizers occurring to the right of the constituent or word nominalized. By contrast, languages which have left-headed compounds, such as Thai and Vietnamese, have nominalizers which occur to the left of the constituent or word nominalized. If such a patterning is non-coincidental and can be found to occur consistently in a wider sampling of languages, it may establish an interesting diachronic link resulting from processes of grammaticalization between two types of morphological operation regularly treated as being unrelated.

3. Nominalization in Burmese Having considered some of the general issues involved in the study of nominalization and the grammaticalization of nominalizers in a language, we now turn to an investigation of nominalization phenomena in Burmese. The discussion will focus in particular on the sentential and clausal nominalizers present in the language, as these can be shown to reveal much about the way in which reanalysis applies to create complex new grammaticalized morphemes and words, and give rise to shifts between categorial types.3 As briefly mentioned in Section 1 above, verbs, such as tense, aspect and mood morphemes; on the contrary, the main source of verbalizers would seem to be verbs. Why this asymmetry between the source of nominalizers and that of verbalizers exists is not immediately clear. 3. In addition to the clausal and sentential nominalizers discussed in this work, Burmese also exhibits a wide range of other nominalizers, many of which combine with sub-clausal constituents. For simple reasons of space and focus, no attempt to describe these will be made here. The reader is referred to Okell (1969), Okell and Allott (2001) and Hopple (2003) for useful information and a comprehensive listing of all putative nominalizers in Burmese.

The grammaticalization of clausal nominalizers in Burmese 273

Burmese is commonly described as having two complementary forms: Colloquial Burmese and Literary Burmese. The latter is used orally in formal announcements, news broadcasting and is the most common written form of Burmese, whereas the former occurs in most spoken communication and is also sometimes found in more informal writing. The primary and most obvious difference between the two varieties of Burmese concerns the instantiation of their functional-grammatical morphemes, including the elements employed as nominalizers, and it is here that a comparison of patterns in Colloquial and Literary Burmese is often interesting.4 The present examination of nominalization in Burmese will begin with patterns in Literary Burmese in Section 3.1 and then move on to Colloquial Burmese in Section 3.2.

3.1 Sentential nominalizers in Literary Burmese: The elements thii and mii In Literary Burmese, the morpheme thii occurs in clause-final position, both in main clauses (as a sentence-final morpheme), as in example (6) below, and in clauses which are embedded as arguments of other predicates.

(6) Literary Burmese U-Win-Win manee-ga yauq-laa thii. U-Win-Win yesterday-pst arrive-come nmlz ‘U Win Win arrived yesterday.’

When thii is used to embed clauses as the arguments of a predicate, it is naturally accompanied by a case marker.5 Examples (7) and (8) show this with the embedding of clauses as the object of a verb, while (9) and (10) illustrate the embedding of a clause in subject position. It should also be noted that the use of thii in examples (6) to (10) is obligatory and clauses may not occur as the arguments of verbs without this morpheme.

(7) Literary Burmese canaw [U-Win-Win manee-ga yauq-laa thii]-kou caa ya thii. I [U-Win-Win yesterday-pst arrive-come nmlz-acc hear get nmlz ‘I heard that U Win Win arrived yesterday.’

4. A highly significant proportion of non-grammatical vocabulary is shared by Colloquial and Literary Burmese, though the latter also has available certain variant forms (frequently multisyllabic) which are often made use of in writing (cf. Okell 1969; Saw Tun 2005). 5. The case marking of nominal arguments is generally optional in Burmese, though heavily preferred in certain instances, both as parsing aids and sometimes also to signal contrastive emphasis, as in Korean (Schütze 2001). Informants indicate that the use of case particles with thiiembedded clauses is much preferred. For further discussion of factors triggering the use of casemarking particles in Burmese, cf. Kassevitch (2005).

274 Andrew Simpson

(8) Literary Burmese hsain-shin-hmaa [hkalee-twee hseileiq hkou thii]-kou myin-ya thii. shopkeeper-nom [child-pl cigarette steal nmlz-acc see-get nmlz ‘The shopkeeper saw the children stealing the cigarettes.’

(9) Literary Burmese [thuu bamaa-sagaa pyaw nee thii]-hmaa theiq kaun hla thii. [he Burma-language speak asp nmlz-nom very good intens nmlz ‘He speaks Burmese very well.’ (lit. ‘His speaking Burmese is very good.’)

(10) Literary Burmese [leezeiq-twin daq-poun yaiq thii]-hmaa tayaa-m-win pa. [airport-within photograph take nmlz-nom not-be-legal pol ‘It is illegal to take photographs inside the airport.’ (lit. ‘Taking photographs in the airport is illegal.’) Thii-suffixed clauses consequently have both the syntactic distribution (occurring in subject and object position) and marking (nominative and accusative case) of nominal phrases. The element thii therefore exhibits key aspects of the patterning of typical clausal nominalizers. Thii is also found to occur in relative-clause structures, such as the one in (11).6 (11) Literary Burmese canaw we thii saouq I buy nmlz book ‘the/a book I bought’ In such an environment thii does not maintain its mid-level tone, but occurs instead in high creaky tone, one of the four tones which regularly occur on Burmese words. As a suprasegmental morpheme, such high creaky tone frequently signals genitive case, which is otherwise realized as a discrete morpheme ye, pronounced with high creaky tone. (12) Literary Burmese canaw ye ein I gen house ‘my house’

6. In addition to thii, Literary Burmese also makes use of a morpheme thaw as an alternative linking element in relative clauses. This element is particularly frequent with intransitive predicates and adjectives (cf. Okell 1969 and Hopple 2003 for further discussion of thaw).

The grammaticalization of clausal nominalizers in Burmese 275

Sequences such as that in (12) may in fact have two pronunciations. If the possessor (here canaw) normally occurs with a mid-level tone, it can maintain this tone as a genitive-marked possessor. However, it is also possible for (12) to be pronounced with high creaky tone on both ye and the possessor canaw, the tone associated with genitive case spreading leftwards from ye to the nominal element which it marks to its left. A third pattern commonly found is for a (normally) mid-level tone possessor to occur pronounced with high creaky tone even in the full absence of genitive ye (canawcreaked ein ‘my house’). The high creaky tone of genitive ye can, therefore, become disassociated from its regular host and simply occur marking an appropriate nominal element. In relative clauses such as that in (11) above, clause-final creaky tone thii is consequently most naturally analysed as the combination of a clausal nominalizer (mid-level tone) thii and genitive case, so that relative clauses in Burmese are instances of the modification of a noun by a nominalized clause, linked via genitive case.7 If we now consider what the source of the nominalizer thii might be, it can be observed that mid-level tone thii occurs elsewhere in Literary Burmese, regularly functioning as a pre-nominal demonstrative, as in (13). (13) Literary Burmese thii saaouq this book ‘this book’ Assuming this demonstrative element to be the most likely source of the nominalizer thii then suggests that clausal nominalizations in Literary Burmese have an internal structure such as that in (3) (parameterized in a head-final way), representing the combination of a D(eterminer)-type element with a clause to produce a DP nominal projection.8 Interestingly, mid-level thii also occurs in a third, rather different function in Literary Burmese, namely as a common instantiation of nominative case marking a subject. Example (14), thus, shows four different uses of thii: as a demonstrative, nominative case, linking a relative clause to a following nominal (with creaky tone) and in simple (non-relativized) clause-final position.9 7. Such an analysis is well-supported by the occurrence of genitive case with putative nominalizations in many instances of relative-clause marking and linking in Tibeto-Burman languages (cf., for example, Noonan 1997; Thurgood and LaPolla 2003). 8. Cf. also Herring (1991) for evidence from other languages that clausal nominalizers may be sourced from demonstratives in Tibeto-Burman languages. 9. A teasing question is how the nominative case marker thii might be linked to and possibly derived from the demonstrative element thii. One possibility may be to consider that subject noun phrases are more frequently definite and anaphoric in reference than objects are, and so might

276 Andrew Simpson

(14) Literary Burmese thii pyiqsii-myaa-thii Daw-Hla-Mee htaa hke thii-myaa hpiq pa thii. this thing-pl-nom Daw-Hla-Mee put asp nmlz-pl be pol nmlz ‘These are the things that Daw Hla Mee left behind.’ Assuming a demonstrative source of nominalizer thii to be plausible and likely, the grammaticalization of thii in its nominalizer function brings with it a further question relating to word order. As demonstrative thii occurs preceding the nominal complement with which it combines, why might it be positioned following a clausal complement when nominalizing the latter? Although no definitive answer can be provided in the absence of data on the early development of nominalizing thii, two speculations can be offered here. First, as Burmese is a head-final language, and thii as a demonstrative is likely to be positioned in a phrase-initial specifier position (SpecDP), it can be hypothesized that, when such an element grammaticalized as a nominalizer, it came to occupy the head-position of its (DP) phrase in an occurrence of the Spec-head reduction process frequently argued to characterize instances of phrasal grammaticalization (cf. Simpson and Wu 2002a and van Gelderen 2004). Given the linear organization of (elements in) specifier and phrasal head positions in a head-final, specifier-initial language such as Burmese, Spechead reduction and grammaticalization would be expected to relocate a demonstrative from a phrase-initial specifier position to a phrase-final head-position, and so result in nominalizer thii coming to follow its clausal complement. (15)

DP Spec

D′ XPcomplement Dhead

A second possibility might be to hypothesize that thii came to be used as a demonstrative in a resumptive position following a clausal subject, as occurs in certain Indic languages. This is schematized in (16). (16) [John likes Mary]i, thisi is true. ‘It is true that John likes Mary.’/‘That John likes Mary is true.’ Such a linear sequence might then lead to grammaticalization of the demonstrative as a nominalizer following the clause it introduces: (17) [[John likes Mary]-this] is true. attract marking with a demonstrative, encoding a familiar, [+definite] referential value.

The grammaticalization of clausal nominalizers in Burmese 277

In addition to the question of the linear order of thii relative to its complement clause, an interesting complication for the analysis of thii and of both its synchronic status and its grammaticalization comes from the fact that thii is actually not just a simplex nominalizer derived from a demonstrative, but also encodes realis mood, being used to combine with clauses which represent past or present actions, states and habits. Where future actions and hypothetical situations are referred to, a fully parallel set of nominalized constructions are built with the nominalizer mii, which differs from thii only in encoding irrealis mood (i.e. non-realized actions or states).10 This is illustrated in (18) and (19) with clauses in subject position and nominative case, and in (20) with an accusative-marked clause in object position. (18) Literary Burmese [than shi mii]-hmaa-le ahman hpyiq ii. [lice be nmlzIRR-nom-too true be nmlz ‘That there will be lice too is true.’ (Okell and Allott 2001: 158) (19) Literary Burmese [thati pyu ya mii]-hmaa htaransitsataa-amyouasaa-pin hpyiq be [attention do must nmlzIRR-nom transistor-type-emph thii. nmlzRLS ‘What we will have to pay attention to is the transistor type.’ (Okell and Allott 2001: 158) (20) Literary Burmese [naneq ngaa naayii khan shi mii]-kou aloulou-hman-mi [morning five o’clock about be nmlzIRR-acc estimate thii. nmlzRLS ‘He estimated that it must be about five o’clock.’ (Adapted from Okell and Allott 2001: 158) Example (21) shows the occurrence of irrealis mii in sentence-final position, paralleling thii in example (6), while (22) shows its occurrence embedding a relative clause. As with thii when it links a relative clause to a noun, mii carries high creaked tone in (22) corresponding to genitive case, and is elsewhere pronounced with a mid-level tone. 10. Besides thii and mii, Literary Burmese has an element ii (pronounced with a high creaky tone) which occurs as a stylistic variant of thii and mii in sentence-final position or embedded quotations with no specification for (ir)realis (cf. Okell and Allott 2001: 271–272, and footnote 15 below). An example of the sentence-final use of ii occurs in (18).

278 Andrew Simpson

(21) Literary Burmese maneqhpan pyan laa mii. tomorrow return come nmlzIRR ‘He will come back tomorrow.’ (22) Literary Burmese [thuu yuu laa mii] pyiqsii-myaa [he take come nlmzIRR.GEN thing-pl ‘the things he will bring’ A further interesting parallel in morphosyntactic patterning between thii and mii is that the latter also occurs in a pre-nominal interrogative specifier function, which corresponds closely to the demonstrative function of thii. However, whereas demonstrative thii is realis in picking out a referent with definite reference, interrogative mii is used for hypothetical reference in questioning the reference of a noun. (23) Literary Burmese mii-thuu which-person/one ‘who’ The clausal nominalizers thii and mii in Literary Burmese are, therefore, complex elements incorporating two parts and functions: (i) an (ir)realis component: m-/ th- and (ii) a nominalizer component, the common or shared vowel coda: -ii. Of further potential relevance here is the observation that, in addition to prenominal demonstrative and interrogative thii and mii, Literary Burmese also contains a simplex demonstrative element unspecified for (ir)realis distinctions whose form corresponds exactly to the nominalizing component -ii in thii and mii. (24) Literary Burmese ii saaouq this book ‘this book’ Quite possibly, then, the complex demonstrative-nominalizers thii and mii are grammaticalizations of the combination of (ir)realis together with this simplex demonstrative ii.11 The creaky tone nominalizers occurring in relative clauses will, in turn, consist of three distinct components: (ir)realis + demonstrative-nominalizer + genitive case, and th-/m- + ii + creaky tone. 11. The simplex demonstrative ii is furthermore found in early Burmese inscriptions before thii and mii are attested, supporting the feeding relation hypothesized here with thii and mii being formed from ii together with an encoding of (ir)realis meaning. Thanks are due to a reviewer of the chapter for drawing attention to the early occurrence of ii in Burmese inscriptions.

The grammaticalization of clausal nominalizers in Burmese 279

3.2 Sentential nominalizers in Colloquial Burmese Turning now to patterns in Colloquial Burmese, one finds that there are elements which occur in the same clause or sentence-final and relative-clause positions as the nominalizers thii and mii in Literary Burmese. Similar to Literary Burmese, the linking elements in relative-clause structures carry a creaky tone. Also parallel to the situation in Literary Burmese, there are two sets of elements which occur in these positions, one set encoding realis mood, the other encoding irrealis mood: a. Realis clause: te realis relative clause linker: te+creak. b. Irrealis clause: me irrealis relative clause linker: me+creak. Examples (25) to (29) below illustrate the use of these morphemes in clause-final, sentence-final and relative clause environments. (25) Colloquial Burmese thuu maneqphan laa me htin te. he tomorrow come nmlz think nmlz ‘I think he will come tomorrow.’ (26) Colloquial Burmese thuu manee-ka saaouq we te. he yesterday book buy nmlz ‘He bought a book yesterday.’ (27) Colloquial Burmese thuu manee-ka we te+creak saaouq he yesterday buy nmlz book ‘the book he bought yesterday’ (28) Colloquial Burmese thuu maneqphan thamain-saaouq we me. he tomorrow history-book buy nmlz ‘He will buy a history book tomorrow.’ (29) Colloquial Burmese thuu maneqphan we me+creak thamain-saaouq he tomorrow buy nmlz history-book ‘the history book he will buy tomorrow’ Such neat parallels between Literary and Colloquial Burmese diverge, however, in instances where clauses occur as the clearly nominalized arguments of verbs of perception and cognition in subject and object position. In such environments, where the overt occurrence of case markers confirms the nominalized status of the embedded clauses, the elements which embed clauses in Colloquial Burmese

280 Andrew Simpson

are not, in fact, homophonous with the clause or sentence-final and relative clause morphemes te and me, but instead have the forms taa for realis clauses and hmaa for irrealis clauses. Examples (30) to (34) show the occurrence of these elements in a range of subject, object and object of postposition and (ir)realis combinations. (30) Colloquial Burmese [thuu thoun-nayii-hmaa hote-ka htweq]-taa-kou myin-ya te. [he three.o’clock-at hotel-from exit-nmlz-acc see-get nmlz ‘I saw him leaving the hotel at three o’clock.’ (31) Colloquial Burmese [thin ne]-taa-ga hkeq-th-la? [learn asp-nmlz-nom difficult-q ‘And how about learning it, is it difficult?’ (32) Colloquial Burmese [thuu baa-hma m pyaw]-taa-ne apyin htweq thwaa te. [he whatever not say-nmlz-with outside exit go nmlz ‘I left as/when he did not say anything.’ (33) Colloquial Burmese [ale pyan]-hmaa-kou pyaw th-laa? [visit return-nmlz-acc say q ‘Did he say he was going to come for a visit?’ (34) Colloquial Burmese [thuu dii-lou louq]-hmaa-ga theiq m kaun pu, htin te. [he this.way do-nmlz-nom very not good not think nmlz ‘I think it is not good at all that he is going to do that.’ The most plausible analysis of the elements taa and hmaa is that they result from the collapse of te and me together with the light noun haa, meaning ‘one, thing’ (similar to English this one, that one, the blue one): te + haa > taa; me + haa > hmaa. Elsewhere in Burmese, the element haa occurs with this meaning ‘one, thing’ and can also optionally fuse with the demonstrative dii resulting in the form daa: dii haa ‘this one, this thing’ > daa ‘this, that.’ The clause-final elements taa and hmaa are also found to occur in headless relative clauses, as in (35) to (37). (35) Colloquial Burmese [canaw hmaa]-taa-ga asein-caw pa. [I order-nmlz-nom vegetable-fried pol ‘What I ordered was fried vegetables.’ (lit. ‘the one/thing I ordered’)

The grammaticalization of clausal nominalizers in Burmese 281

(36) Colloquial Burmese [thuu Yangoun-hmaa we]-hmaa-ga seqbein pa. [he Yangoun-in buy-nmlz-nom bicycle pol ‘What/the thing he is going to buy in Rangoon is a bicycle.’ (37) Colloquial Burmese [canaw yee-htaa]-taa-kou thuu theiq m caiq pu. [I write-asp-nmlz-acc he very not like not ‘What I wrote he really does not like.’ In such environments taa and hmaa again arguably result from the collapse of te and me and the dummy or light noun haa, resulting in a complex fused morpheme with the meaning ‘the one, the thing that …’ Finally, taa and hmaa are additionally found in alternation with te and me in sentence-final position, as in examples (38) and (39). In such instances, the use of taa and hmaa, rather than te and me, results in a clear difference in meaning and a cleft-like interpretation with focus on one part of the sentence as new information which is set off against a presupposed background (similar to the Chinese shi-de construction and Japanese no-desu forms; cf. Simpson and Wu 2002b; Simpson 2003a). (38) Colloquial Burmese canaw zee-hmaa we taa. I market-in buy nmlz ‘I bought it in the market.’/‘It is in the market that I bought it.’ (39) Colloquial Burmese thuu maneqphan yauq hmaa. he tomorrow arrive nmlz ‘He will arrive tomorrow.’/‘It is tomorrow that he will arrive.’ From such a distribution and patterning, one can hypothesize a likely threestep route of grammaticalization and development of taa and hmaa from te and me: (i) from an earlier stage in which te and me occurred in all clause-final (including relative clause-final) positions, similar to the distribution of Literary Burmese thii and mii, it can be supposed that the combination of te and me with haa ‘one’ in headless relative clauses resulted in two collapsed or fused relative clause nominalizers taa and hmaa; (ii) where such taa- and hmaa-final relative clauses might have occurred as the natural object of a copula in a main clause (e.g. This is the one/thing I bought yesterday), significantly this would then have resulted in taa and hmaa occurring in sentence-final position, as copulas are regularly null in Colloquial Burmese; (iii) finally, from such occurrence as (relative) clausal nominalizers in sentence-final position, it can be hypothesized that taa and hmaa may

282 Andrew Simpson

have allowed for an expansion of use as clause-final nominalizers in clauses which were not only sentence-final, but alternatively embedded as the clausal arguments of verbs of perception and cognition (and also as subjects of one-place predicates such as ‘(to be) easy/difficult/interesting,’ etc.). If the above can be assumed to be a reasonable path of development for taa and hmaa as general clausal nominalizers not just restricted to relative clauses, an important question now arises about the synchronic status of te and me. If te and me are the original clause-final morphemes in Colloquial Burmese (and partial inputs to the later creation of taa and hmaa) and correspond to Literary Burmese thii and mii, as seems most likely, and if the elements thii and mii in Literary Burmese can be shown to be nominalizers, a fairly natural conclusion is that te and me may have shared this function as nominalizers too, at least in their earlier stages of development. Such an assumption is supported by the observation that a genitive case creaky tone is present with te and me in their occurrence as clause-final elements in relative clauses. If (genitive) case is naturally taken to be marked only on nominal constituents, then te- and me-final relative clauses must be assumed to have been nominal(ized) clauses at least when the genitive creak was first added to te and me. Consequently, like Literary Burmese thii and mii, te and me can be concluded to have had a nominalizing function when added to clauses, in addition to expressing (ir)realis mood, at least at some point in their development. The question to be considered now is whether synchronically these elements are (still) nominalizers in the way that their frequent Literary Burmese equivalents thii and mii can be shown to be. The answer to this question would seem to be, clearly and interestingly, ‘no.’ When simple tests for the nominal status of te- and me-marked clauses are employed, te and me do not in fact show signs of nominalizing the clauses with which they are combined. First of all, case markers cannot occur on clauses ending in te or me, unlike clauses ending in the clear nominalizers taa and hmaa. Compare (40) and (41) below with earlier (30), (31), (33) and (34). (40) Colloquial Burmese [U-Win-Win manee-ga yauq te](*-kou) pyaw te. [U-Win-Win yesterday arrive nmlz-acc say nmlz ‘He said that U Win Win arrived yesterday.’ (41) Colloquial Burmese [canaw maneqhpan thwaa me](*-kou) htin te. [I tomorrow go nmlz-acc think nmlz ‘I think I will go tomorrow.’ 12 12. Matisoff (1972) offers an example of the attachment of accusative case -kou to a clause

The grammaticalization of clausal nominalizers in Burmese 283

Secondly, postpositions such as ne ‘with’ can only occur with clearly nominalized clauses ending in either taa or hmaa and not te and me: (42) Colloquial Burmese sagaa sa pyaw laa taa/*te ne ta-pain-neq […] word begin say come nmlz/nmlz with at-the-same-time ‘at the same time that he began speaking’ Thirdly, other category-sensitive patterns involving elements introducing rationale clauses similarly indicate that, while taa and hmaa do create nominal categories, te and me do not. Specifically, the Burmese words mouq and caun (meaning ‘because’) are elements which only allow for combination with nominal phrases and are found to naturally occur with taa- and hmaa-final (nominalized) clauses. However, they may not occur with te- or me-final clauses, indicating clearly that the latter elements (synchronically) do not nominalize the clauses with which they combine.13 (43) Colloquial Burmese ngwee lou taa/*te caun money need nmlz/nmlz because ‘because he needed money’ (44) Colloquial Burmese maneqhpan ethe-twee laa hmaa/*me mouq tomorrow guest-pl come nmlz/nmlz because ‘because there are guests coming tomorrow’

ending in te, namely (i) below. However, this sentence is actually not well formed and is rejected by native speakers of Burmese. First of all, the occurrence of negation should automatically displace the occurrence of te as in all other negative sentences. Secondly, -kou cannot co-occur with te in other, non-negative environments. Hence, even if the negation is removed, the sentence remains ungrammatical with -kou. (i) *[[hkinpya shi te m-shi te]-kou be-hne thi m-le? *[[you be nmlz not-be nmlz-acc how know q ‘How will I know whether you are there or not?’ 13. This restriction of mouq and caun on the categorial status of their complements permitting only nouns or noun phrases would not seem to be reducible to the meaning of mouq and caun as ‘because (of).’ Other lexical items in Burmese with a similar meaning, such as lou ‘because,’ impose fully opposite selectional restrictions, combining only with verbal or clausal categories and not allowing noun or nominal complements. It is significant to note that lou does allow for combination with a clause ending in te or me, thus supporting the assumption shortly to be put forward here that such clauses are not nominalizations.

284 Andrew Simpson

The conclusion to draw from the above is that the elements te and me in Modern Burmese do not create nominal categories and, therefore, synchronically are not nominalizers. However, the available evidence and the patterns surveyed from elsewhere in the language and connected with te and me clearly suggest that te and me were, at one time, part of a nominalizer paradigm similar to Literary Burmese thii and mii. Therefore, it would seem that Colloquial Burmese te and me can be seen as cases of nominalizers which have undergone a further development in their path of grammaticalization, significantly losing any nominal-related categorial specification which would support case marking and other properties of nominal(ized) constituents, and undergoing reanalysis as grammatical markers more strictly associated with verbal syntax and the representation of (ir)realis mood. If we assume that nominal syntactic patterns (such as case marking) result from the presence of a DP (cf. Section 2.1), the loss of such patterns can be interpreted as indicating that an element which previously caused a DP to be projected has undergone reanalysis over time. Where an erstwhile nominalizer thus loses its ability to project a DP, it can be suggested that it has undergone a shift in categorial identity and is no longer one of the syntactic types which normally cause a DP to be projected, that is, Determiner or Demonstrative, Numeral, Classifier or Noun (cf. Section 2.1). In such instances, it may be concluded that the nominalizer has become reanalysed as an instance of a syntactic category which is associated with verbs rather than with nouns, for example, as an occurrence of T(ense), Asp(ect), Mood, or C(omplementizer), and that this shift in identity to a non-nominal category causes the loss of nominal morphosyntax. In the case of main clause te and me, it is reasonable to assume that their categorial identity is now strictly (ir)realis Mood. Hence, their hypothesized reanalysis would be from an occurrence and function as a D-type nominalizer to a plain Mood marker combined with verbs, that is, an instance of cross-categorial regrammaticalization from one major functional domain to another (lateral regrammaticalization; Simpson 2003a).14 A further conclusion following from the above is that, if te and me are no longer nominalizers in Burmese, then the occurrences of te and me with a creaky tone in relative clauses can no longer be synchronically the combination of a nominalizer with genitive case.15 Here again it would seem that categorial reanalysis must 14. Such a putative reanalysis might seem to represent a simplification and reduction of the featural specification of te and me: from being [+ nominal, + mood], by hypothesis, te and me become simply [+ mood]. 15. Indeed, creaked te and me do not allow for separation of the genitive case from te and me via the use of independent genitive case marker (creaked) ye: *[… te/me ye]. This suggests that creaked te and me are fully fused, new forms and are not decomposable into nominalizer and genitive case sub-parts. Also note that, if te and me do not induce nominal syntactic constituents (i.e. nominalize other phrases), then the creaky tone suprasegmental which is elsewhere an in-

The grammaticalization of clausal nominalizers in Burmese 285

have applied converting a nominalizing morpheme plus genitive case into a new, composite form which is (i) non-nominal/not a category which projects a nominal constituent, and (ii) specified for occurrence with verbs and their projections in the environment of relative clauses. If the occurrence of a creaky tone on te and me in relative clauses suggests that te and me in such clauses are not identical to te and me elsewhere in clause-final position (hypothesized to be instantiations of just Mood), then the dedicated embedding function of creaked te and me in relative clauses can be suggested to represent a reanalysis of te and me as new (relative clause) Complementizer elements. Such elements would be formally of type C(omplementizer), and, therefore, distinct from pure Mood te and me, though carrying a specification for (ir)realis mood in a way which recalls the complex occurrence of subject agreement on Complementizers in languages such as Flemish and Bavarian German (Fuss 2005).16 The syntactic patterning observed with Colloquial Burmese te and me and the way in which this is distinguished from the nominalizer forms thii and mii in Literary Burmese therefore leads to the hypothesis that a pair of erstwhile nominalizers have undergone a further, important reanalysis as elements lacking nominal specifications, and have come to instantiate new verb-related categories. Generally, then, it can be argued that the grammaticalization of elements as nominalizers does not necessarily represent a final stage of functional development and that nominalizers may, in fact, develop further into verb-related grammatical categories with formally different syntactic properties. To the extent that the distinction between clause-final te and me and relative-clause embedding creaked te and me seems to require the assumption of a categorial identity for the latter which is formally different from the former, the paradigms examined here also provide support for the existence of a special C(omplementizer) category. Such a category has often been assumed for elements such as English that, taken to be reanalysed as a C(omplementizer) from an earlier source as a D(emonstrative), dication of genitive case can no longer be genitive case when present with te and me (in relative clauses), as case marking only marks nominal constituents. Finally, the conclusion that genitive case in relative clauses has been reanalysed as part of a different type of morpheme is supported by other instances in Burmese where genitive case marking seems to have clearly undergone reanalysis. In both Colloquial and Literary Burmese there are sentence-final morphemes which have developed from genitive case markers (creaked ye in Colloquial Burmese and creaked ii in Literary Burmese) and which now function as markers of (i) surprise in Colloquial Burmese and (ii) the end of the sentence in an (ir)realis neutral way in Literary Burmese. Given the complementary distribution which these elements have with other markers of the category of Mood, it is most natural to see them as reanalysed into this category and no longer performing any (genitive) case function. 16. For a view on how a single morpheme might relate to two syntactic featural specifications, such as Complementizer and Mood or Tense and Mood, cf. Simpson (2003a).

286 Andrew Simpson

but clear evidence for such a categorial switch or reanalysis in English is not easy to identify. The Burmese patterns under study helpfully seem to suggest that the occurrence of D(emonstrative)-to-C(omplementizer) regrammaticalization and the occurrence of a formally distinct category Complementizer are indeed quite plausible.

4. Summary of conclusions and consequences for further work This study of Burmese has attempted to explore and highlight a number of properties of nominalization and its grammaticalization which can now be summarized in brief. In Section 2.1 issues concerning the sources of nominalizers were discussed and it was suggested that nominalizers may in fact arise from a variety of sources, including elements of types D, Cl, Num and N. In the course of the investigation of Literary Burmese in Section 3, evidence was presented indicating that clausal and sentential nominalizers in Burmese may be sourced from demonstratives of category D which have also grammaticalized a modal feature encoding (ir)realis distinctions. Section 2 also emphasized the need for specific morphosyntactic evidence when attempting to classify elements as nominalizers and noted that the term nominalizer may sometimes be used without clear confirmation of the nominalizer status of an element. In Section 3 it was then argued that, while certain morphemes in Burmese can be concluded to be nominalizers, other elements with a superficially similar distribution have in fact undergone reanalysis as elements of a different categorial type and no longer function as nominalizers in the language. Considering the syntactic realization of relative clauses in Burmese, it was shown that at least two functional morphemes are (historically) involved in the linking of a relative clause to a noun (phrase): (i) a nominalizer and (ii) genitive case. This raises an important question about the identity of linking morphemes in other similar languages (such as, for example, Chinese) where just a single element links a relative clause to a head-noun; are such elements nominalizers or occurrences of genitive case, or possibly even a grammaticalized composite of both? Finally, the examination of Burmese also showed how nominalization structures are potentially unstable and may arguably undergo reanalysis as purely verb-related forms, with nominalizers being absorbed into the functional system projected in a clause. Considerably rich in the information provided by its two systems of functional elements in the colloquial and literary forms of the language, Burmese consequently offers an interesting perspective on variation in processes of grammaticalization and on the reanalysis of grammaticalized morphemes in new directions.

The grammaticalization of clausal nominalizers in Burmese 287

Abbreviations acc accusative asp aspect emph emphatic gen genitive intens intensifier irr irrealis nmlz nominalizer

nom pl pol pst q rls

nominative plural politeness marker/particle past question marker realis

References Fuss, Eric. 2005. The rise of agreement: A formal approach to the syntax and grammaticalization of verbal agreement. Amsterdam: John Benjamins. Herring, Susan C. 1991. Nominalization, relativization, and attribution in Lotha, Angami and Burmese. Linguistics of the Tibeto-Burman Area 14: 55–72. Hopple, Paulette M. 2003. The structure of nominalization in Burmese. PhD dissertation, University of Texas, Arlington. Kassevitch, Vadim. 2005. Syntactic and morphological markers in Burmese: Are they really optional? In Justin Watkins (ed.), Studies in Burmese linguistics, 87–96. Canberra: Pacific Linguistics. Matisoff, James A. 1972. Lahu nominalization, relativization, and genitivization. In John Kimball (ed.), Syntax and semantics. Vol. I. New York: Seminar Press: 237–257. Noonan, Michael. 1997. Versatile nominalizations. In Joan Bybee, John Haiman & Sandra A. Thompson (eds.), Essays on language function and language type. In honor of T. Givón, 373–394. Amsterdam: John Benjamins. Okell, John. 1969. A reference grammar of Colloquial Burmese. London: OUP. Okell, John & Anna Allott. 2001. Burmese/Myanmar dictionary of grammatical forms. London: RoutledgeCurzon. Saw Tun. 2005. Modern Burmese writing: An examination of the status of Colloquial Burmese. In Justin Watkins (ed.), Studies in Burmese linguistics, 201–220. Canberra: Pacific Linguistics. Schütze, Carson. 2001. On Korean ‘case stacking:’ The varied functions of the particles ka and lul. The Linguistic Review 18(3): 193–232. Simpson, Andrew. 2001. Definiteness agreement and the Chinese DP. Language and Linguistics 2(1): 125–156. Simpson, Andrew. 2003a. Empty determiners and nominalization in Chinese, Korean and Japanese. In Audrey Li & Andrew Simpson (eds.), Functional structure(s), form and interpretation: Perspectives from Asian languages, 131–161. London: RoutledgeCurzon. Simpson, Andrew. 2003b. On the status of ‘modifying’ DE and the structure of the Chinese DP. In Sze-Wing Tang & Luther Liu Chen-Sheng (eds.), On the formal way to Chinese language, 74–101. Stanford CA: CSLI. Simpson, Andrew & Zoe Wu. 2001. The grammaticalization of formal nouns and nominalizers in Chinese, Japanese and Korean. In Thomas E. McAuley (ed.), Language change in East Asia, 250–283. London: RoutledgeCurzon.

288 Andrew Simpson

Simpson, Andrew & Zoe Wu. 2002a. Agreement, shells and focus. Language 78(2): 287–313. Simpson, Andrew & Zoe Wu. 2002b. From D-to-T: Determiner incorporation and the creation of tense. Journal of East Asian Linguistics 11(2): 169–209. Thurgood, Graham & Randy LaPolla (eds.). 2003. The Sino-Tibetan languages. London: Rout ledgeCurzon. Van Gelderen, Elly. 2004. Grammaticalization as economy. Amsterdam: John Benjamins.

chapter 11

The grammaticalization cline of cardinal numerals and numeral systems* Ferdinand von Mengden University of Hamburg This paper investigates the grammaticalization of cardinal numerals and numeral systems from a crosslinguistic perspective. The grammaticalization cline postulated here starts with a referential expression denoting a body part to a cardinal numeral and leads on to a functional affix which occurs in numeral systems and marks arithmetical operations in complex numeral expressions (e.g. English -teen in fifteen or -ty in fifty). The paper also shows that the numerals of the lowest cardinalities may follow a path different from that of the other simple numerals: they may evolve from pronominal elements, such as demonstratives and determiners, which often derive themselves from the cardinal numeral ‘1.’ The fact that low numerals and pronominal elements can be both each other’s sources and each other’s targets seems to challenge the unidirectionality of grammaticalization.

1. Introduction Cardinal numerals have received only marginal attention in the study of individual word classes, whether focussing on their morphology or on their syntax. This neglect applies particularly to studies on grammaticalization. In their World lexicon of grammaticalization (2002), Heine and Kuteva mention a couple of processes for which expressions for particular numerical values are claimed to be either typical sources or typical targets of the change. However, they do not include the crosslinguistically observable pathway of (cardinal) numerals as a source/target category. Two of the processes described in the World lexicon will be relevant for the present study: (i) the development from a word for ‘hand’ to the numeral ‘5’ (Heine and Kuteva 2002: 166) and (ii) a list of nine developments in which the numeral ‘1’ evolves into elements expressing concepts which can roughly be subsumed either as ‘indefiniteness’ or ‘singulativity, unity’ (Heine and Kuteva 2002: 219–226). In what follows, I will argue that these processes are only two aspects of a broader crosslinguistically observable developmental path in the context of * My thanks to Katerina Stathi, Florian Haas, Thomas Hanke and an anonymous reviewer for their critical comments on an earlier version of this paper.

290 Ferdinand von Mengden

cardinal numerals. In particular, I will postulate a cline comprising the following two major steps: a. Cardinal numerals are more likely to develop out of body part nouns. The potential target is any non-complex numeral (not only or necessarily ‘5’) out of a range of source expressions denoting body parts (not only or necessarily ‘hand’). b. Within numeral systems, cardinal numerals may develop into affixes which can be interpreted as grammatical markers inasmuch as they mark the relations among the constituents of complex numerals. These two claims can both be connected with an idea of grammaticalization which is generally accepted among linguists: the developments described in the following pages represent a crosslinguistic pathway from a body part noun to a numeral and further to a functional affix encoding internal arithmetic relations within complex numerals.1 Both the semantic and the formal aspects of this development are consistent with what is commonly understood as grammaticalization: a development running from a referential expression (body part noun) to an extensional modifier (a quantifier; cf. Section 2) and further to a syntactically bound relational marker. This process will be discussed in more detail in Section 4. First, however, I propose to delimit the category cardinal numeral (Section 2) and thus define the class of lexemes which constitutes the target of an early stage of this process and which will eventually also become the source category for the emergence of functional affixes within numeral systems. Inseparable from any definition of the cardinal numeral, and of especial significance in the later stages of the proposed development, is the question of how to define a numeral system (cf. Section 3).

2. Quantification 2.1 Numerically specific vs. numerically unspecific Cardinal numerals can be categorized as a subclass of quantifiers. Quantifiers can be numerically specific or numerically unspecific. As Figure 1 below shows, numerically unspecific quantifiers comprise expressions such as all, many, few 1. Hanke (2005: 111) describes numeralization and grammaticalization as two similar but distinct processes. Although Hanke’s remarks on numeralization seem to refer generally to the same panchronic phenomena which I am describing here, I would argue that the key difference from other grammaticalization processes is merely the target/source category cardinal numeral. A scenario similar to the one proposed here has been suggested by Wiese (2003: particularly 144–145), though not in the context of grammaticalization.

The grammaticalization cline of cardinal numerals 291

Quantification Numerically unspecific

Numerically specific (assignment of cardinalities to sets)

Universal

Other

Existential

Systemic (Cardinal numerals)

Non-systemic

all every both ...

few several many most ...

some a(n)

one two three fourteen twenty-three one hundred and seventy-six ...

dozen score three twenties twice a hundred

Figure 1. Types of quantifiers

or some. Numerically specific quantifiers, on the other hand, specify how many members of the respective class there are. They assign cardinalities to sets by indicating the exact number of elements in a class irrespective of the extensional meaning of the quantified noun.2

2.2 Systemic vs. non-systemic Cardinal numerals form an important set of numerically specific quantifiers. However, languages may contain additional expressions which have the same semantic function as cardinal numerals (numerically specific quantification) and very similar, if not identical, morphosyntactic properties. Such are, for instance, simple expressions like dozen ‘12’ or score ‘20,’ but also complex expressions (which may have cardinal numerals as their constituents) like three tens, twice a hundred and also, I would argue, expressions of the type fourteen hundred. In spite of the obvious similarities between these quantifiers and cardinal numerals, I would recommend excluding the former from the category ‘cardinal numeral.’ In order to justify this delimitation, we need to define the difference be2. A detailed discussion of the range of quantifier types cannot be provided here. See, for instance, the discussions in Langacker (1991: 81–89) or Gil (2001). As my focus is on cardinal numerals, the distinction between numerically specific and numerically unspecific plays a more central role here than it does in the two descriptions just mentioned.

292 Ferdinand von Mengden

tween cardinal numerals and other expressions of cardinality. In what way is, say, dozen different from twelve in English? The crucial difference between twelve and dozen is that the former is part of the conventionalized counting sequence, while the latter is not. It may seem odd at a first glance to assume that the defining feature of a word class is its membership of a conventionalized sequence of expressions. However, the conventionalized counting sequence plays a crucial role in the approach to ‘number’ of two rather unrelated scientific fields. In logic, one of few ways to define ‘number’ is based on the idea of an ordered progression of welldistinguished elements (cf. Benacerraf 1965), which is exactly what the counting sequence represents. Likewise, studies in cognitive sciences, for instance in developmental psychology and first language acquisition, show that the evolution of a counting sequence is a necessary prerequisite for the acquisition of numeracy during infancy: a child first needs to memorize the conventionalized counting sequence, at least up to a certain point, in order to be able to grasp the notion of ‘number’ (cf. Wiese 2003: particularly 151–188). As we will see in Section 3.1 (cf. footnote 5), there are also language-internal criteria which distinguish systemic from non-systemic expressions of cardinality, such as their role in a recursive principle in word formation processes involving complex numerals. Such non-systemic expressions may therefore assign cardinalities to sets, but they are not part of the numeral system. The grammaticalization path described here applies only to cardinal numerals as the numerically specific quantifiers which constitute the numeral system of a language. Before describing the cline proposed above in more detail, some further explanatory remarks on the properties of numeral systems of natural languages are necessary. The next section looks only at those aspects of numeral systems which are relevant in the context of the grammaticalization process and should not be taken as comprehensive.

3. Numeral systems 3.1 Simple and complex numerals A numeral system contains a set of mono-morphemic, arbitrarily shaped numeral forms (simple numerals). It also comprises a set of morphosyntactic rules which combine these simple numerals by means of underlying arithmetical operations to form more complex numerals. There is thus a distinction between simple and complex cardinal numerals. Simple numerals, in turn, can be subdivided into atoms and bases. In English, for example, the atoms are the expressions one, two, three, etc., up to nine, whereas the bases are the expressions ten, hundred, thousand, etc. These examples should make the concept of atom and base intuitively

The grammaticalization cline of cardinal numerals 293

quite clear and this, in turn, may help us now to discuss briefly an unambiguous and crosslinguistically valid definition of atoms and bases. One of the main properties of atoms and bases is their mutual interaction in the numeral system. They play complementary roles in the formation of complex numeral expressions and in the arrangement (ordering) of the elements in the counting sequence. In the overall system, atoms constitute a sequence of continuously recurring elements within the counting sequence or, as Seiler (1990: 190, with my own addition in square brackets) puts it, atoms are “that particular set of numerals that has the highest potential of being recursively used in cycles or with [multiples of] bases.”3 In any combination of atoms and bases, atoms are the variables and bases are the constants. Defining bases in this way, the definition of base is still dependent on that of atom. Greenberg (1978: 269–270) simply describes bases as “serialised multiplicands,” that is, as factors in multiplications which are applied to a whole series (sequence) of multipliers.4 For example, the atoms of the English numeral system are the expressions one, two, three, etc., up to nine, because the same elements recur in the same order at any higher point of the counting sequence. They occur, for example, as addends in thir-teen ‘3 + 10,’ four-teen ‘4 + 10,’ fif-teen ‘5 + 10’ or in twenty-one ‘20 + 1,’ twenty-two ‘20 + 2,’ twenty-three ‘20 + 3’ (eleven and twelve are exceptions which I shall pass over for now). They can also occur as an ordered sequence of multipliers to a base, as in twen-ty ‘2 × 10,’ thir-ty ‘3 × 10,’ for-ty ‘4 × 10’ or as in one hundred, two hundred, three hundred; and they may be added to such combinations again as an ordered sequence of addends, as in two hundred and one, two hundred and two, etc. These examples demonstrate that the numeral system employs a recursive principle by which a base is combined either with an ordered sequence of atoms or with a complex numeral, which itself is a combination of a base and an ordered sequence of atoms. Every member of the class of cardinal numerals is therefore necessarily a component part of a more complex expression, which is itself a member of this class.5 3. For a valuable summary of the properties of atoms and bases, see Seiler (1990: 190–196). Note that I follow Seiler’s terminology in this respect. Greenberg (1978), by contrast, uses atom for what I refer to as simple numeral, that is, as a cover term comprising both bases and our atoms. 4. It must be noted that, while the order of the arguments in addition and in multiplication is irrelevant for the result of both types of arithmetical operation, in linguistic numeral systems their order has morphosyntactic significance. I follow Greenberg’s (1978) terminology in referring to the constants of addition and multiplication as augend and multiplicand and to the variables of each operation as addend and multiplier, respectively. For the terms serialised augend and serialised multiplicand, cf. Greenberg (1978: 266 and 269, respectively) and Section 3.2 below. 5. This recursive principle is also a crucial criterion for excluding non-systemic numerals from the class of cardinal numerals as proposed in Section 2.2 above. Although we may say two dozen,

294 Ferdinand von Mengden

3.2 Variant forms So far I have described a prototypical numeral system. However, hardly any numeral system is devoid of idiosyncrasies. In English, for instance, the expressions eleven ‘11’ and twelve ‘12’ deviate from the other elements of that subsection of the counting sequence in that they are not formed as additions to the base ‘10.’ Thus, while eleven and twelve are mono-morphemic and arbitrary in their shape (at least synchronically), the subsequent numbers are expressed as thir-teen ‘3 + 10,’ fourteen ‘4 + 10,’ fif-teen ‘5 + 10’ and so forth, and, therefore, have a non-arbitrary shape formed according to the recursive principle of the numeral system. Such idiosyncrasies do not defy the regularity of the overall system. In fact, idiosyncrasies are integrated into the general pattern and take part in the recursive formation of more complex numerals. An expression requiring the numerical value ‘12’ as addend, for instance, never employs the (regular) constituent *two-teen, but it is only the idiosyncratic twelve that occurs in more complex combinations (e.g. one hundred and twelve, twelve thousand eight hundred and so on). Crosslinguistically, there are clearly identifiable types of idiosyncrasies in numeral systems, though I cannot provide a detailed typology of such irregularities here. One type of irregularity, however, is important in our context and will, in fact, play a central role in the grammaticalization process of cardinal numerals: if a given element is a constituent of a more complex numeral, it can vary in form with respect to the corresponding simple expression. Often, these variants are mere allomorphs, as are the following examples from the numeral system of English:

(1) a. /twen-/ for /tuː/ in twen-ty b. /θɜr-/ for /θriː/ in thir-teen, thir-ty c. /fɪf-/ for /faɪv/ in fif-teen, fif-ty

In (1a), the atomic numeral two /tuː/ ‘2’ occurs in its variant form twen- /twen-/ if used as a multiplier of the base ‘10,’ i.e. in twenty ‘2 × 10.’ The atomic numeral three /θrɪː/ ‘3’ occurs as /θɜr-/ in any combination with ‘10,’ whether as an addend or as a multiplier; cf. (1b). In the same way, the atom five /faɪv/ ‘5’ becomes /fɪf-/ (cf. (1c)) in any combination with an immediate augend or multiplicand ‘10.’ More significant in our context are the variants shown in (2).

(2) a. /-tiːn/ for /ten/ in thir-teen, four-teen, etc. b. /-tɪ/ for /ten/ in twen-ty, thir-ty, etc.

three dozen, etc., the recursive properties of non-systemic expressions is extremely limited; cf. ?fourteen dozen, *a hundred and a dozen, *a dozen hundred.

The grammaticalization cline of cardinal numerals 295

I propose that these elements, -teen and -ty, are functional suffixes and that they are therefore more grammatical in the sense that they are the product of an advanced step in a grammaticalization process. In the numeral system of English, -ty and -teen do not only occur in complementary distribution with the corresponding default form ten, but both have also independent functions within the system. Thus, -teen stands for ‘10’ if, and only if, ‘10’ is a constituent of a complex form in which this numerical value is a serialized augend, i.e. where it is the augend to the ordered sequence of atoms, as in thir-teen ‘3 + 10,’ four-teen ‘4 + 10,’ fif-teen ‘5 + 10’ and so on. Similarly, -ty stands for ‘10’ if, and only if, ‘10’ is a constituent of a complex form in which it is used as serialized multiplicand, i.e. where it is the multiplicand to the ordered sequence of atoms, as in twen-ty ‘2 × 10,’ thir-ty ‘3 × 10,’ for-ty ‘4 × 10,’ etc. If we assume that -teen and -ty are diachronically derived from a form representing ‘10’ we can say that -teen and -ty are functional variants of the more lexical simple numeral ten.6 Given, furthermore, that -teen and -ty are bound forms, we can argue that they are functional morphemes and, consequently, the way in which they emerge can be taken as a development comparable to that of a grammaticalization process.7 So far our discussion has been based exclusively on examples from English. Although other numeral systems have their own particularities, the general recursive principle of numeral systems is universally the same. This principle, which requires that combinations of atoms and bases are built on arithmetical operations, is crucial for the emergence of a numeral system and, hence, for the development from a fully referential noun to a cardinal numeral as a particular type of quantifier, and later on to a functional element within numeral systems. In the next section I will propose the individual steps of this development. Although this paper is not based on a sufficiently comprehensive and representative sample of crosslin6. For -ty, see, for example, the very detailed study by Szemerényi (1960: 115–140). 7. It may be argued that, in formations like fif-teen, thir-ty, etc., atom and base together form a new lexeme so that we are dealing with a compound-like formation rather than with a type of affixation. While superficially both interpretations seem possible, there is still one difference between the two view-points which I consider crucial in our context. If, say, fif-teen were a concatenation of two roots rather than that of an affix to a root, there would be no way to explain how the type of arithmetic operation is encoded in these expressions. Only if we ascribe the respective functions ‘+ 10’ to -teen and ‘× 10’ to -ty can we account for the exact nature of the relation between the two constituents. In addition to the numerical meaning ’10,’ the suffixes -teen and -ty encode the arithmetical operations ‘+’ and ‘×’ respectively, and thus have a functional character. It may now be objected that this functional character could also be maintained for atomic variants, i.e. for those variants listed in (1) above. In English at least, this is impossible, because thir- and fif- occur in both the additions thir-teen/fif-teen and in the multiplications thirty/fif-ty. Therefore, the type of relation between the two elements — whether addition or multiplication — must be encoded in the variants of the base (2) and not in those of the atom (1).

296 Ferdinand von Mengden

guistic data, I believe that the remarkable uniformity of the overall principles of numeral systems in natural languages allows for conclusions of a pathway of cardinal numerals which are valid crosslinguistically.

4. The emergence of numeral systems 4.1 Methodology To examine grammaticalization processes in the context of numerals it is essential that we trace back the lexical sources of cardinal numerals and identify evidence of a crosslinguistically predominant pathway leading to the emergence of cardinal numerals. Unfortunately, the comparative method is only of limited use in this respect. Often, numeral systems are already established and have considerable scope as early as the stage of a reconstructed proto-language. In the earliest traceable period of Proto-Indo-European, for instance, the reconstructed proto-forms are already genuine cardinal numerals and, as such, elements of a system comprising at least two decimal bases, namely ‘10’ and ‘100,’ and whose scope therefore extends at least as far as ‘999.’ In such cases, we have scant evidence for the source expressions of the respective numerals and so are forced to employ alternative methods if we want to find out more about the pre-numeral (or pre-systemic) stage of a lexeme. The comparative uniformity of cardinal numerals, from a crosslinguistic perspective, proves a great advantage for our present purpose, and so too the fact that natural languages have numeral systems of varying complexity and scope. These two factors make it possible to resort to implicational universals on cardinal numerals and numeral systems from which inferences can be drawn regarding their (pre-)history. We can assume, for example, that if the existence of feature b in a language requires synchronically the existence of feature a in the same language, and if the reverse is not true, diachronically the emergence of feature a must necessarily precede that of feature b. This method may not be generally applicable in reconstructing linguistic forms. Its logical feasibility in the context of numerals is due to two inherent and necessary properties of numeral systems. One is the fact that numbers (and hence numerals) are an ordered progression of well-distinguished elements (cf. Section 2.2 above). Their arrangement as an ordered progression creates a particular kind of hierarchy among systemic number expressions which is not found in other parts of speech. Secondly, universally numeral systems do not have gaps (Greenberg 1978: 254–255).8 The lack of gaps in the counting sequence makes the 8. Where there seem to be gaps in our data, these are usually gaps in the documentation of

The grammaticalization cline of cardinal numerals 297

following inferences possible: (i) if a is a necessary condition for b, a must necessarily be situated at a lower valued place in the counting sequence than b; (ii) because a lower valued numeral necessarily precedes a higher valued one diachronically (otherwise there would be gaps), the emergence of a necessarily precedes that of b diachronically. At least in the context of cardinal numerals we can therefore read an implicational universal b ⇒ a (‘b, only if a’) as a diachronic statement of the form ‘a precedes b.’

4.2 Step 1: Subitizing The simplest conceivable set of cardinal numerals in a language would be one that consists only of arbitrary and mono-morphemic lexical elements. Once such a counting sequence reaches a certain size, it would be difficult to memorize the correspondence between an expression and the respective numerical value because of the arbitrariness of the correspondence. In fact, there are few examples of languages with numeral systems with simple expressions only. Crosslinguistically, there seems to be a gap between languages with simple number expressions reaching only ‘2’ or ‘3,’ and exceptionally ‘4,’ and numeral systems with a considerably larger scope. This gap may be explained by the fact that humans distinguish very small cardinalities in a different way from higher cardinalities. While we need to count items in order to determine the size of a set of four or more members, cardinalities up to ‘3’ or ‘4’ are perceived instantaneously. This cognitive process is called subitizing. Wiese (2003: 95) describes this ability as follows:9 numeral systems, not in the systems themselves (cf. Greenberg 1978: 254–255; Hurford 1987: 86–88; Hanke 2005: 24). Hurford, however, seems to consider the possibility of numeral systems with gaps. In my view, this is impossible, because the cognitive capacity to assign cardinalities above ‘4’ to sets is dependent on the existence of a continuous sequence. The cases where the last elements before a base are formed as subtractions from that base (i.e. a counting sequence expressed as {[…], ‘5,’ ‘6,’ ‘7,’ ‘10–12,’ ‘10–11,’ ‘10’}), which Hurford (1987: 88–90) takes as “evidence for the existence of a discontinuity in numeral expressions at a historically previous stage in the language,” may just as well be taken as evidence for the impossibility of gaps: whenever a base emerges, the resulting gap needs to be filled immediately, even though simple expressions have so far not been introduced for the intermediate numerical values. Comrie (1997: 44–46) also makes a case for the possibility of gaps in numeral systems. The general problem with Comrie’s claim is that he implicitly takes any cardinality expression as being an element of the numeral system and, accordingly, offers non-systemic expressions as evidence. However, if a distinction is not drawn between systemic and non-systemic cardinality expressions, it would ultimately be redundant to speak of numeral systems at all. 9. For a more detailed description of subitizing, cf., for example, Hurford (1987: 93–94) and Wiese (2003:Chapter 4, particularly 95–98).

298 Ferdinand von Mengden

‘Subitising’ […] refers to a nearly instantaneous process that allows us to discriminate small sets of different sizes. Subitising occurs automatically, accurately, and without conscious attention; it enables us to recognise the cardinality of sets with up to three or four elements, for instance, it enables us to tell two stars from three stars, without invoking numerical strategies like counting.

Where a language has number expressions only up to ‘3’ or ‘4,’ these expressions are not actually cardinal numerals in the sense in which we have defined them in Section 2.2 above. They constitute neither an ordered progression nor a recursive system. There is no distinction at this stage between systemic and nonsystemic expressions of cardinality. However, diachronically, the development of expressions for subitizable cardinalities is at least a first step towards the development of a numeral system. We will return to this group of expressions for subitizable cardinalities in Section 5, after the remaining stages of the grammaticalization cline have been described.

4.3 Step 2: Beyond subitizing: The emergence of numeracy in language 4.3.1 Step 2a: Additional lexemes Phylogenetically, the first instance of numeracy, that is, the capability to distinguish cardinalities of sets, is not a spoken number expression but a gestural one. As already mentioned, numeracy is based on the availability of an ordered progression of well-distinguished elements. Such instances of numbers do not need to be linguistic and the most obvious device for humans to conventionalize such a progression is the human body, where salient points are in ordered location with respect to each other. Referring indexically to these ordered body parts may have the function of assigning cardinalities. In other words, body parts can be used for counting just as, for instance, finger counting is common in most societies of the world. Not only is body-part counting a frequently observed non-linguistic (or prelinguistic) method of counting, it is also the method of counting which gradually develops into a linguistic counting system. There is sufficient evidence — provided in studies in various disciplines such as linguistics, cultural history, ethnology or anthropology — to demonstrate that the use of fingers or the subsequent pointing at a conventionalized series of body parts goes parallel with the expression of the word for the respective body part (cf., for example, Menninger 1957: 43–49; Greenberg 1978: 257; Ifrah 1981: 24–29; Majewicz 1981; Heine 1997: 19–21). We can assume, therefore, that once a particular progression of salient body parts has been conventionalized for counting, the expressions for the respective body parts become a counting sequence. Once this linguistic sequence of expressions is established, the accompanying gestures are no longer needed and the expressions

The grammaticalization cline of cardinal numerals 299

themselves become the main instances of numbers in a language community. We can safely assume that body-part expressions are the main sources for cardinal numerals. In fact, what we have described is a shift from a fully referential noun (‘body part’) to a more grammatical expression (quantifier). Whether a quantifier counts as a grammatical expression in the same sense as, for instance, an inflectional affix may of course be disputed. In any case, a quantifier is a nonreferential expression. As an extensional modifier of a referential expression (of the head noun), it prototypically serves, semantically and syntactically, as a functor of an argument. In this respect, a quantifier is at any rate more functional than a referential noun and it is in this sense that the developmental step which we have described here may be taken as a step on a particular cline of grammaticalization. Furthermore, we have seen that the lexical body-part meaning of the expression may still be in use, even after a more functional usage of the expression — its use as a numerically specific quantifier — has emerged. We have therefore a clear case of layering, a phenomenon typical of grammaticalization processes: the contemporary use of both the original body-part expression and the (more grammatical) numeral. In many cases the original (lexical) meaning of the expression is eventually rendered opaque.

4.3.2 Step 2b: Sporadic complex forms Once linguistic number expressions have become autonomous and distinct from their lexical sources, a new problem arises. While the order of a sequence of salient body parts was a visible and culturally logical sequence of points of reference, the order of number expressions is now arbitrary and, therefore, needs to be memorized. The larger the sequence, the more difficult it will be to memorize all arbitrary forms. To extend the scope of the counting sequence, complex number expressions are formed (cf. Greenberg 1978: 256), either overtly or implicitly, by combining a pre-existing number word with another morpheme. The morpheme attached may or may not be another numeral, but the formation of the new expression is in any case motivated by an underlying arithmetical operation. Consider the examples in (3).

(3) a. b.

Arrarnta (Australian) tara-ma-nyinta 1-and-2 ‘3’ (Stampe 1976: 604) Welsh (Celtic, Indo-European) deu-naw 2 [×] 9 ‘18’ (King 1993: 113–114)

300 Ferdinand von Mengden

c. d.

Lithuanian (Baltic, Indo-European) dvý-lika 2-remaining [beyond ten] ‘12’ (Comrie 1992: 763, Section 16.1.11.1) Proto-Indo European *oḱtō(w) 4.dual ‘8’ (Gamkrelidze and Ivanov 1995:I, 747, Section 10.3.5)

Independently of how explicitly the operator is expressed, each of the complex forms in (3) is based on underlying arithmetical operations. In some cases the arithmetical operations are expressed by combining pre-existing number forms with each other — with the arithmetical operator expressed overtly (3a) or not (3b). In other cases, the relation is expressed by combining a numeral with a non-numeral morpheme — whether lexical, as in (3c), or grammatical, as in (3d). At this point of development, combinations of this type are employed sporadically, i.e. in a non-systematic way, but the arithmetical principle underlying these formations is already potentially recursive. These sporadic complex formations are likely to become opaque. Neither English twelve — which goes back to what is etymologically the same formation as the Lithuanian numeral in (3c) — nor eight — derived from the Proto-Indo-European root in (3d) — nowadays show signs that they were once complex expressions. Synchronically, they are mono-morphemic and arbitrarily shaped forms. From the point of view of the system, they are indistinguishable from the type of simple expressions which originate from bodypart terms. For this reason, a more efficient method to expand the set of number expressions is necessary, one that follows a recursive principle and renders the formation of high valued numerals transparent, so that a small inventory of simple expressions suffices for the formation of numerals, even if they specify high cardinalities. This method is the serialization of arithmetical operations.

4.4 Step 3: Serialization At this stage complex forms are no longer formed sporadically, but a whole sequence of (simple and arbitrary) numerals can now be combined with a single (simple and arbitrary) numeral. The result may be, for instance, a sequence like that in (4a) where the sequence of numerals from ‘1’ to ‘9’ is added to the base numeral decem ‘10,’ resulting in the sequence in (4b).

The grammaticalization cline of cardinal numerals 301

(4) Medieval Latin (Italic, Indo-European) a. unus ‘1’ duo ‘2’ tres ‘3’ quattuor ‘4’ quinque ‘5’ sex ‘6’ septem ‘7’ octo ‘8’ nouem ‘9’ decem ‘10’ b. un-decim ‘1 + 10’ → ‘11’ duo-decim ‘2 + 10’ → ‘12’ tre-decim ‘3 + 10’ → ‘13’ quattuor-decim ‘4 + 10’ → ‘14’ quinque-decim ‘5 + 10’ → ‘15’ se-decim ‘6 + 10’ → ‘16’ decem et septem ‘10 + 7’ → ‘17’ decem et octo ‘10 + 8’ → ‘18’ decem et nouem ‘10 + 9’ → ‘19’

For most languages, it is difficult to reconstruct how such serializations emerge. Again, we can employ crosslinguistic evidence to conjecture as to the historical background of this process. Greenberg (2000: 773b) remarks that, crosslinguistically, “by far the most common [numeral system] is the decimal [system].” This observation is immediately followed by the most plausible explanation for the predominance of decimal numeral systems in the world’s languages: “Its origin from finger counting is intuitively obvious, and in a fair number of instances etymologically clear.” 10 Once more, this shows that linguistic counting originates phylogenetically in body-part counting. Greenberg’s statement also strongly suggests that an expression for a salient point of the human body is a likely lexical source for the emergence of a base numeral as well. Once the conventionalized sequence of body parts is used up, a more prominent point in the sequence, an obvious turning point or a more general notion — like ‘hand(s)’ when all the fingers are 10. Greenberg (2000: 773b) states that, apart from the decimal system, we can find quinary, duodecimal, vigesimal and sexagesimal systems. Of these, quinary and vigesimal systems, next to the vast majority of decimal systems in natural languages, suggest digital/body-part counting as the universally predominant, if not exclusive, type of counting system.

302 Ferdinand von Mengden

used up — can be employed as a base for a serialized arithmetical operation. Typical sources for bases are therefore expressions for ‘hand’ or even more generic expressions like ‘body’ or ‘man’ (cf. Edelman 1999: 229–231 for examples). What is crucial at this point is that the resulting combinations maintain their arithmetical transparency through serialization. A whole sequence of atomic expressions is (morphologically or syntactically) attached to a base expression, and at the same time forms an (arithmetic) function of this constant element. While these combinations are originally mere juxtapositions, we can see a couple of slight formal modifications in the atomic expressions in our example (4b).11 These variants can be interpreted as phonologically restricted allomorphs of the genuine atoms in (4a). Distinctions may also occur in the serialized augend (cf. Section 3.2 above, examples (1) and (2), respectively). In example (4b), this is only partly the case, namely in the expressions from ‘11’ to ‘16,’ where decem becomes decim. This slight variation does not yet play a role in the overall system, but represents the starting point of a phonological change (usually attrition) which accompanies the development from a base as a simple element, as in (4a), to a compound element (4b) and to a subsequent functional element in particular arithmetical operations (cf. Section 4.5 below). With the introduction of serialization into the formation of new numeral expressions, the distinction between atoms and bases emerges. It should be noted that, according to the definition provided in Section 3.1, only a serialized multiplicand, but not necessarily a serialized augend, can be a true base of a numeral system. Whereas it is usually the case that a serialized multiplicand is also used as a serialized augend, the reverse is not necessarily true. Although the introduction of multiplication is a necessary step in the extension of numeral systems, it generally presupposes the introduction of addition (cf. Greenberg 1978: 258). Once serialization is introduced into the numeral system, both originally mono-morphemic expressions (Sections 4.2 and 4.3.1) and what were formerly sporadic complex expressions (Section 4.3.2) become part of a cyclically recurring sequence of atomic elements in continuous combination with bases. Originally complex forms which are used within the atomic sequences have a tendency to become etymologically opaque (Section 4.3.2). In contrast, the complex expressions employed as constants in these serialized operations have a tendency to remain transparent, because in principle the complex structure needs to be analysable. 11. Theoretically, an overt expression of the type of the arithmetical operation ‘+’ would be possible, and it is in fact encoded in many languages, for instance by ‘and,’ ‘with’ or similar connectives (in our example (4b), only part of the sequence contains the element et ‘and’ → ‘+’). For the present purpose it is not relevant whether the arithmetical operator is overtly expressed and, if it is expressed, by means of which lexeme. For a typology of the relevant strategies to express arithmetical operators, cf. Greenberg (1978: 263–272) and Hanke (2005: 95–110), among others.

The grammaticalization cline of cardinal numerals 303

4.5 Step 4: Functional elements within numeral systems In the preceding section we mentioned that, once serializations have become conventionalized, changes in the shape of a serialized augend or multiplicand may be accompanied by the paradigmatic isolation of the augend/multiplicand from the default lexeme.12 The augend/multiplicand expression, therefore, represents not only the respective cardinality, such as ‘10,’ but also the function ‘+’ or ‘×.’ This means that both the modification of the phonological shape and the paradigmatic isolation or loss of syntactic autonomy can be seen as a consequence of the acquisition of a functional meaning. Such a process results in forms like English /-'tiːn/ ‘+ 10’ and /'-tɪ/ ‘× 10.’ As to -teen, the process of its paradigmatic isolation can be observed within the literary history of English. In Old English, the suffix for the teens, -tyne /-tiːnə/, is still analysable as the simple element tyn /tiːn/ ‘10’ plus a fossilized inflectional ending -e. Regular sound changes caused a further isolation of the two items: the simple numeral was subject to shortening of monosyllables in Early Middle English, and the suffix, in turn, lost its final syllable. Hence the present-day distinction /ten/ vs. /-'tiːn/. However, the processes suggested here are often too old for any evidence of the development to survive. Accordingly, the corresponding process of English -ty is even more difficult to trace back. In any case, a clear distinction between the two forms, ‘10’ and ‘× 10,’ must have existed in a common Indo-European numeral system, since the respective items can be reconstructed as related but distinct elements in the earliest stages of Proto-Indo-European (cf. Szemerényi 1960: 115–140; 1990: 238–239, Section VIII.5.3; Gamkrelidze and Ivanov 1995:I, 744–746, Section 10.3.4). The resulting element, such as English /'-tɪ/, still represents the numerical value ’10,’ but it does so only in a limited way, that is, in its use as a multiplicand. The newly emerged morpheme is a functional morpheme which has grown out of a simple numeral ‘10.’ Elements like /-'tiːn/ and /'-tɪ/ in English and corresponding forms in other languages therefore resemble grammatical morphemes in many respects, although the paradigmatic system to which they belong is, of course, quite different from, for example, that of inflectional endings. The grammar where they received a function which distinguishes them from less functional elements is the recursive numeral system with its underlying arithmetical operations on the basis of which complex numerals are generated. Crosslinguistically, there are vari12. I presume that this is more likely to happen if the two elements, addend/multiplier and augend/multiplicand, are juxtaposed without overt expression of the operators ‘+’ or ‘×.’ In (4b) above, we can see that, where the operator ‘+’ is overtly expressed by et, the augend decem ‘10’ remains unaltered. By contrast, where the operation is expressed only by a juxtaposition of the two summands, a change of shape occurs. This is, however, a tendency rather than a rule.

304 Ferdinand von Mengden

ous strategies for such combinations, not all of them suggesting the same kind of grammaticalization process in the same way. Nonetheless, the fact that these combinations are universally based on arithmetical operations and on serializations is a potential trigger for the paradigmatic isolation of individual constituents of complex expressions once the syntagmatic (arithmetical) relationship among the constituents needs to be encoded. Consequently, there is a tendency towards the grammaticalization of serialized augends and serialized multiplicands if they are component parts of complex numerals.

4.6 Step 5: Higher bases Once cardinalities above ‘99’ need to be expressed, it becomes necessary to introduce a second base, and much later on a third and a fourth base. This final step renders a numeral system potentially infinite. The introduction of ‘100’ as the second base of a system allows for an extremely far-reaching extension of the system. In our context it may be of interest to note that these higher bases also have their typical source expressions. Quite often they are intensifications of lower bases, as in Proto-Indo-European *kṃtóm ‘100,’ which has developed from *(d)ḱṃtóm, a derivative of ‘10’ (cf. Szemerényi 1990: 239–240; Gamkrelidze and Ivanov 1995:I, 744–746). Similarly, Proto-Germanic *þus-hund ‘1000,’ an expression which, in contrast to ‘100,’ is not shared by most of the Indo-European daughter languages, is analysed as ‘strong hundred’ (cf. Szemerényi 1990: 240–241). Another analogy to this formation type would be the expression for the fourth base in many languages, which ultimately goes back to Old Italian milli-one, ‘great thousand.’ 13

5. Conclusion and outlook Several of the developmental steps described here form together a cline which leads from a noun to a cardinal numeral, and from there to a functional affix which occurs in numeral systems and marks arithmetical operations in complex numeral expressions. The cline contains all the features which are broadly agreed to be part of a grammaticalization process (see Figure 2). During this development, the relevant expression loses syntactic autonomy in the sense that it derives from a fully referential expression and becomes an exten13. The fact that million is — via French and later also via English — borrowed into a large number of languages in all parts of the world shows that any introduction of a higher base — in fact any extension of a numeral system — may just as well be due to language contact. Interesting though the mechanisms of contact-induced changes in numeral systems may be, this paper must confine itself to language internal changes during the development of numeral systems.

The grammaticalization cline of cardinal numerals 305

?

Indefinite pronouns

Body-part nouns

Lowest-valued cardinalities (‘1’–’3’)

Simple numerals

Figure 2. The grammaticalization cline of cardinal numerals

{Atoms Bases Arithmetical markers

sionally modifying constituent of a noun phrase. English -teen and -ty and corresponding affixes in other languages are clearly bound morphemes which are syntactically less autonomous than simple numerals. The resulting expressions may undergo phonological changes. This applies both to numerals derived from body-part expressions (as the etymological relation between the numeral and the source noun is, in many cases, no longer analysable) and to affixes derived from simple numerals. I hesitate to use the term phonological attrition in this context, although most cases certainly show a reduced phonological shape. It may be an exceptional case, but the suffix -teen in English is phonetically expanded rather than reduced in comparison with the source expression ten (cf. the phonological development sketched in Section 4.5 above). As a final point it should be noted that the numerals for the lowest cardinalities, described above in Section 4.2, may follow a path different from that of the other simple numerals, which, as mentioned, tend to develop out of body-part nouns. It has been suggested for Proto-Indo-European that the respective lowest numerals are derived from pronominal elements, either demonstratives or determiners (cf. Luján-Martínez 1999: 206–207 and the literature cited there). However, the reverse development has frequently been observed, from the cardinal numeral ‘1’ to an indefinite marker or pronoun (Heine and Kuteva 2002: 220–221). To suggest that lower numerals are derived from pronominal elements would, if true, seriously challenge the idea of an underlying crosslinguistically directional development path from numeral to pronoun. If this evolution from pronominal element to cardinal numeral involved Indo-European alone, we could explain the problem away by saying that the reconstructed etymologies of these Proto-Indo-European numerals should be reconsidered. Similarly, we could refer to more reluctant Indo-Europeanists who do not mention pronominal or indefinite sources for the lowest numerals (cf., for example, Gamkrelidze and Ivanov 1995:I, 744–745, Section 10.3.4). However, there is evidence of other numeral systems in which the expressions ‘1’ and ‘2’ are unambiguously synchronically identical with indefinite determiners. Hale (1975: 296–297) describes a pairing system in Warlpiri (South-West-Pama-Nyunga, Australian) which is based entirely on the only two

306 Ferdinand von Mengden

existing simple expressions of cardinality, tjinta ‘1’ and tjirama ‘2.’ The numeral system is in principle binary, in that any higher cardinality is expressed as an addition of the elements ‘1’ and ‘2.’ 14 At the same time, the two expressions are part of a paradigm of four indefinite determiners, which correspond to the four grammatical number categories of Warlpiri: tjinta ‘one’/[singular], tjirama ‘two’/ [dual], wirkaḍu, maṇkurpa ‘several’/[paucal] and panu ‘many’/[plural]. While this situation is typical of Australian languages (Stampe 1976: 604; Dixon 2002: 67), the same family would also appear to contain more advanced numeral systems, like that of Gunwinggu, in which expressions for ‘hand/5,’ ‘hand + hand/10,’ ‘hand + hand + foot/15,’ etc. have created quinary serializations (Hale 1975: 296). This shows that former indefinites can become integrated into a larger, more complex numeral system. If it is true that indefinite pronouns and cardinal numerals for ‘1,’ ‘2,’ and perhaps for ‘3,’ can be each other’s sources as well as targets, then we are dealing with a section of the numeral system in which the directionality of the process is undetermined, which complicates the definition of grammaticalization as a unidirectional development. The behaviour of these expressions challenges the idea of grammaticalization as a clinal process, not just from a lexical to a grammatical expression but also from a grammatical to a more grammatical one, since there may also be shifts from one to another grammatical category of roughly the same level of grammaticalness. It would deem it reasonable, therefore, to say that a process from an indefinite pronoun to a low valued cardinal numeral as well as the inverse process are cases of exaptation rather than of grammaticalization.15 However, this interpretation should not be understood to mean that the remaining parts of the cline may not be described as involving grammaticalization. While the grammaticalization path proposed here seems to be evident in general, problems like the source categories for the lowest valued elements show that more research needs to be done in this field. The task will require several linguistic sub-disciplines. In a case like this, comparative historical linguistics will be as crucial as the methods and results from linguistic typology. The development sketched here goes to show how the reconstruction of the linguistic pre-histo14. I do not agree with Hale’s interpretation that such a method does not represent a “linguistic convention which is employed in situations in which the activity of counting […] is a practical necessity” (1975: 295) and that in Warlpiri there is a “lack of conventionalised numerals” (1975: 296). Pairing systems like the one described by Hale for Warlpiri are found in languages in different parts of the world (cf. Schmidt 1926: 358; Heine 1997: 19; Hanke 2005: 83–85). 15. For the notion of exaptation, cf. Lass (1990). More generally, I would not be surprised if other processes which allegedly go from less to more grammatical are in fact cases of exaptation rather than grammaticalization and, similarly, that processes allegedly going in the other direction are also cases of exaptation rather than some process opposite to grammaticalization.

The grammaticalization cline of cardinal numerals 307

ry and the identification of crosslinguistic patterns of language change (such as grammaticalization) are mutually interrelated goals in linguistic research.

References Benacerraf, Paul. 1965. What numbers could not be. The Philosophical Review 74(1): 47–73. Comrie, Bernard. 1992. Balto-Slavonic. In Jadranka Gvozdanović (ed.), Indo-European Numerals [Trends in Linguistics. Studies and Monographs 57], 717–833. Berlin: Mouton de Gruyter. Comrie, Bernard. 1997. Some problems in the theory and typology of numeral systems. In Bohumil Palek (ed.), Proceedings of the conference Item Order in Natural Languages, Prague, August 16–18, 1994 [Acta Universitatis Carolinae – Philologica 3–4], 41–56. Prague: Charles University Press. Dixon, Robert M. W. 2002. Australian languages: Their nature and development. [Cambridge Language Surveys]. Cambridge: CUP. Edelman, Džoj I. 1999. On the history of non-decimal systems and their elements in numerals of (Indo-)Aryan languages. In Jadranka Gvozdanović (ed.), Numeral types and changes worlwide [Trends in Linguistics. Studies and Monographs 118], 221–241. Berlin: Mouton de Gruyter. Gamkrelidze, Thomas Valerianovic & Vjaceslav Vsevo'lodovic Ivanov. 1995. Indo-European and the Indo-Europeans: A reconstruction and historical analysis of a proto-language and a proto-culture. 2 vols. [Trends in Linguistics. Studies and Monographs 80]. Berlin: Mouton de Gruyter. Gil, David. 2001. Quantifiers. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals: An international handbook [Handbücher zur Sprach- und Kommunikationsforschung 20(2)], 1275–1294. Berlin: Mouton de Gruyter. Greenberg, Joseph H. 1978. Generalizations about numeral systems. In Joseph H. Greenberg (ed.), Universals of human language. Vol. III: Word structure, 249–295. Stanford CA: Stanford University Press. Greenberg, Joseph H. 2000. Numeral. In Geert Booij, Christian Lehmann & Joachim Mugdan in collaboration with Wolfgang Kesselheim & Stavros Skopeteas (eds.), Morphology: An international handbook on inflection and word-formation [Handbooks of Linguistics and Communication Sciences 17(1)], 770–783. Berlin: Mouton de Gruyter. Hale, Kenneth. 1975. Gaps in grammar and culture. In M. Dale Kinkade, Kenneth L. Hale & Oswald Werner (eds.), Linguistics and anthropology. In honor of C. F. Voegelin, 295–315. Lisse: de Ridder. Hanke, Thomas. 2005. Bildungsweisen von Numeralia: Eine typologische Untersuchung [Berliner Beiträge zur Linguistik 3]. Berlin: Weißensee. Heine, Bernd. 1997. Cognitive foundations of grammar. Oxford: OUP. Heine, Bernd & Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge: CUP. Hurford, James R. 1987. Language and number: The emergence of a cognitive system. Oxford: Blackwell. Ifrah, Georges. 1981. Histoire universelle des chiffres. Paris: Seghers.

308 Ferdinand von Mengden

King, Gareth. 1993. Modern Welsh: A comprehensive grammar. London: Routledge. Langacker, Ronald W. 1991. Foundations of cognitive grammar. Vol. 2: Descriptive application. Stanford CA: Stanford University Press. Lass, Roger. 1990. How to do things with junk: Exaptation in language evolution. Journal of Linguistics 26: 79–102. Luján Martínez, Eugenio Ramón. 1999. The Indo-European system of numerals from ‘1’ to ‘10’. In Jadranka Gvozdanović (ed.), Numeral types and changes worldwide [Trends in Linguistics 118], 199–219. Berlin: Mouton de Gruyter. Majewicz, Alfred Franciszek. 1981. Le rôle du doigt et de la main et leurs désignations en certaines langues dans la formation des systèmes particuliers de numération et des noms de nombre. In Fanny de Sivers (ed.), Le main et les doigts dans l'expression linguistique II: Actes de la Table Ronde Internationale du CNRS, Sèvres (France) 9–12 septembre 1980 [LACITOdocuments: Eurasie 6], 193–212. Paris: SELAF. Menninger, Karl. 1957. Zahlwort und Ziffer: Eine Kulturgeschichte der Zahl. Band I: Zählreihe und Zahlsprache. Göttingen: Vandenhoeck und Ruprecht. Schmidt, P. Wilhelm 1926. Die Sprachfamilien und Sprachenkreise der Erde. [Kulturgeschichtliche Bibliothek. 1. Reihe: Ethnologische Bibliothek 5]. Heidelberg: Winter. Seiler, Hansjakob. 1990. A dimensional view on numeral systems. In William Croft, Keith Denning & Suzanne Kemmer (eds.), Studies in typology and diachrony: Papers presented to Joseph H. Greenberg on his 75th birthday [Typological Studies in Language 20], 187–208. Amsterdam: John Benjamins. Stampe, David. 1976. Cardinal number systems. In Salikoko S. Mufwene, Carol A. Walker & Sanford B. Steever (eds.), Papers from the Twelfth Regional Meeting of the Chicago Linguistic Society, April 23–25, 1976, 594–609. Chicago IL: Chicago Linguistic Society. Szemerényi, Oswald. 1960. Studies in the Indo-European system of numerals [Indogermanische Bibliothek. Dritte Reihe: Untersuchungen]. Heidelberg: Winter. Szemerényi, Oswald. 1990. Einführung in die vergleichende Sprachwissenschaft [Die Sprachwissenschaft: Einführungen]. Darmstadt: Wissenschaftliche Buchgesellschaft. Wiese, Heike. 2003. Numbers, language and the human mind. Cambridge: CUP.

Nominalizers in East Asian and Tibeto-Burman languages

chapter 12

The development of nominalizers in East Asian and Tibeto-Burman languages* Foong Ha Yap and Stephen Matthews Chinese University of Hong Kong and University of Hong Kong This paper presents evidence of recurring patterns as well as of language-specific variations in the development of nominalizers across several East Asian and Tibeto-Burman languages. Special attention is pato the different grammaticalization pathways of nominalizer morphemes and to the links between different types of nominalization phenomena occurring in the languages of this area. The article also offers an in-depth discussion of a number of controversial issues in the domain of nominalization, among others the following: the interaction of nominalization, tense–aspect and evidentiality–mood interpretation, the relationship between nominalization and copula usage, and the analysis of so-called forgotten or residual nominalizers.

1. Introduction Nominalization is a highly pervasive phenomenon. This comes as no surprise, given that speakers need to refer not only to concrete and tangible entities, such as themselves and their fellow humans, as well as objects they encounter as part of their human experience, but also to more abstract entities, such as events and propositions. Frequently, reference to these more abstract entities invokes a process of nominalization. Consider the examples from English in (1) below. In (1a), the noun phrase the sea is expressed as a patient or theme in object position. There is little question that the sea is fully referential; morphosyntactically, the presence of the determiner the clinches its identity as a nominal. Constructions such as his

* We gratefully acknowledge research funding from the Chinese University of Hong Kong (Direct Grant #4450128). We also thank the editors of this volume, two anonymous reviewers and fellow participants from the workshop on The Grammaticalization of Nominalizers: East Asian Perspectives for valuable feedback: Yang Gu, Kaoru Horie, Yumiko Nishi, Michael Noonan, Seongha Rhee, Mikyong Shin, Andrew Simpson and Janick Wrona. We are further indebted to Rumiko Shinzato and Bjarke Frellesvig for extensive feedback on Okinawan and Old Japanese respectively. The usual disclaimer applies.

310 Foong Ha Yap and Stephen Matthews

love for the sea in (1b) and his loving the sea in (1c) are semantically more abstract and syntactically more complex; nonetheless, each is clearly also a nominal construction. Among the morphosyntactic cues for nominal interpretation are the possessive determiner his, the preposition for and the gerundive -ing. While gerundive constructions such as the one in (1c) clearly constitute cases of complementation or clausal nominalization, nominals such as his love for the sea in (1b) are more naturally construed as extended noun phrases.

(1) a. He loves [the sea]. b. [His love for the sea] began early in childhood. c. [His loving the sea] is clearly responsible for his decision to become an oceanographer.

Crosslinguistically, establishing whether or not a particular construction constitutes a case of nominalization is not always without controversy. In some cases, overt morphosyntactic cues of nominalization are not available, giving rise to terms such as zero nominalization (cf. Horie 1993). In other cases, determining whether a morpheme is a nominalizer is open to debate and, in the absence of a proper diachronic perspective, some old (perhaps residual) nominalizers may sometimes go unnoticed. In this paper, we examine the development of nominalizers in some Asian languages, with an emphasis on crosslinguistic (as well as cross-dialectal) comparisons which will enable us to identify more clearly the range of pathways available for the grammaticalization of nominalizer morphemes. This article, which began as an overview paper for the New Reflections on Grammaticalization 3 workshop on The Grammaticalization of Nominalizers: East Asian Perspectives (Santiago de Compostela, 2005), is organized as follows: Section 2 reviews literature on the grammaticalization of nominalizers in Tibeto-Burman and East Asian languages, emphasizing in particular some highly robust developmental patterns in these languages. Section 3 goes on to identify unresolved and intriguing issues requiring further investigation. Frequent references are made to papers presented at the workshop (some of which appear in this volume); each contribution represents an attempt to help clarify the intimate links between different types of nominalization constructions.

2. Nominalization in Asian languages The aim of this section is to review some major works which address nominalization phenomena in Tibeto-Burman and East Asian languages such as Chantyal, Tibetan, Japanese, Korean and Chinese dialects, including Chaozhou and Cantonese. Where possible, emphasis is given to a diachronic account, with the objective

Nominalizers in East Asian and Tibeto-Burman languages

of contributing to our understanding of the etymological sources and grammaticalization trajectories of Asian nominalizers. Crosslinguistic parallels will also be examined for clues to the essential nature of nominalization, both as a cognitive process and as a linguistic construct.

2.1 Tibeto-Burman perspectives In his analysis of the highly polysemous Lahu morpheme ve, Matisoff (1972) identified a close functional and structural relationship between genitive, relative clause and nominalization constructions. Evidence of this relationship is clearly attested in Classical Tibetan and Contemporary Lhasa Tibetan, as reported in DeLancey (1986, 1999) and illustrated in (2) and (3) below. According to DeLancey, relative clauses are special cases of nominalization, in which a nominalized complement clause stands in apposition to, and modifies, a head noun. Crucially, in Tibetan, the overt presence of both nominalizer -pa and genitive ‘i (derived from kyi) provides strong evidence that a nominalized clause is being linked to its head noun.

(2) Classical Tibetan shing gcod-pa-‘i mi wood cut-nmlz-gen man ‘the man who cuts the tree’ (DeLancey 1986: 1)

(3) Lhasa Tibetan mo-s bzos-pa-‘i mog=mog she-erg cook-pst.nmlz-gen momo ‘the momos which she made’ (DeLancey 2003b: 276)

Noonan (1997) also argued for an appositive origin for relative clauses such as (4) in Chantyal, along similar lines to those of Matisoff (1972) for Lahu and DeLancey (1986, 1999) for Tibetan. Noonan further observed that agent/patient nominals such as the one in (5) are likewise appositive in nature, with the nominalized clause essentially standing in attributive relation to a zero anaphora (hence their frequent characterization in the literature as headless or pronominal constructions). As observed in Noonan (1997), it is worth noting that, although attributive constructions in Chantyal do not make use of a nominalizer-plus-genitive combination, Gurung, another Tamangic language, does show evidence of a nominalizer accompanied by a genitive linker, as illustrated in (6).

(4) Chantyal [gay-ye sya ca-wa] [menchi] [cow-gen meat eat-nmlz [person ‘the person who is eating beef ’ (Noonan 1997: 376)

311

312 Foong Ha Yap and Stephen Matthews

(5) Chantyal na-se [capa ca-wa] [zero anaphora]-ra kwi pin-ji. I-erg [meal eat-nmlz [zero anaphora]-dat water give-pfv ‘I gave water to the one who was eating.’ (Noonan 1997: 379)

(6) Gurung cá [pxra-bá-e] mxi jaga those [walk-nmlz-gen person pl ‘those walking people’ [=sentries] (lit. ‘those people who are walking’) (Glover 1974: 97, cited in Noonan 1997: 383)

In his analysis of Chantyal, Noonan (1997) identified an extensive network of constructions which are clearly linked to a process of nominalization. Among these are noun complements and purposive constructions, such as those illustrated in (7) and (8) respectively (from Noonan 1997: 376). Less well known but highly interesting are numerous complex predicates of the type in (9), which Noonan refers to as periphrastic verb complexes. Literal translations of these [verb (+aspect) + nominalizer + ‘be’] constructions suggest a cleft-like (or focus) origin, even though the focus reading, at least in contemporary Chantyal, often appears to be rather subdued. We will return to a discussion of these periphrastic constructions in Sections 2.2.2 and 3.2 below.

(7) Chantyal na-ra [reysi thu-wa] men kha-i. I-dat [raksi drink-nlmz desire come-pfv ‘I want to drink raksi.’ (lit. ‘A desire to drink raksi came to me.’)

(8) Chantyal khi [ca-wa-ri] kha-i. he [eat-nmlz-loc come-pfv ‘He came to eat.’

(9) Chantyal [kadmandu-ri hya-si-wa] hin. [Kathmandu-loc go-ant-nmlz be-npst ‘I’ve gone to Kathmandu.’ (lit. ‘(It) is that I’ve gone to Kathmandu.’) (Noonan 1997: 370)

Noonan further identified what appear to be marked main verb constructions, some of which convey a sense of counter-expectation, or even exasperation, as illustrated in (10) below. Clearly, speaker subjectivity forms an important part in the interpretation of these nominalized main verb constructions. It is worth noting that Noonan compares this type of stance-ful constructions (i.e. sentences

Nominalizers in East Asian and Tibeto-Burman languages 313

which signal speaker perspective or attitude) to Lahu sentence-final ve constructions, such as the one in (11), which are known to convey a strong assertive, matter-of-fact interpretation. (10) Chantyal bennu-ye nal tato ta-si-wa. gun-gen barrel hot become-ant-nmlz ‘The barrel of the gun had become hot!’ (Noonan 1997: 381) (11) Lahu yo la tu ve. he come fut nmlz ‘He will come.’/‘It is the case that he will come.’/‘It is a he-will-come case.’ (Matisoff 1972: 246–247) In fact, Matisoff identifies Lahu sentence-final ve constructions, such as (11) above, with focus readings which are comparable to English cleft constructions. There is thus some intimation here of a robust nominalizer > cleft > stance development among the Tibeto-Burman languages. As already hinted in Noonan (1997) and as we will further elaborate in Section 3.2, the link between the periphrastic verb and off-norm1 main verb constructions is an intimate one, involving issues of focus and speaker orientation and stance. Figure 1 captures some of the grammaticalization pathways undergone by the Chantyal nominalizer -wa, particularly those involving event and proposition perspectivization via focus as well as stance constructions. Similar developments can also be observed in other Tibeto-Burman languages to varying degrees. Figure 2 (where TAM stands for tense-aspect-modality mark-

Relativizer -wa + N Lexical source unknown

Nominalizer -wa

Cleft-like -wa + ‘be’ (in periphrastic verb construction) Stance -wa (in main verb construction)

Figure 1. Grammaticalization pathways of the Chantyal nominalizer -wa (based on Noonan 1997) 1. The vivid term off-norm main verb predicates comes from Shinzato (2005). Watters (forthcoming) refers to these constructions as stand alone or free-standing nominalized clauses.

314 Foong Ha Yap and Stephen Matthews

Ergative case kyis/gyis/gis/-s Lexical source unknown

Genitive ‘i (< kyi/gyi/gi)

Lexical source unknown

Nominalizer -pa

Relativizer -pa + N Complementizer -pa Cleft-like -pa + TAM complex Stance -pa + TAM complex

Figure 2. Grammaticalization pathways of the Classical and Lhasa Tibetan nominalizer -pa (based on DeLancey 1986, 1999; Agha 1993)

er) captures parallel developments in Classical and Lhasa Tibetan. Note that Lhasa Tibetan (DeLancey 1986, 1999; Agha 1993) has also developed additional, more specific nominalizers, such as mkhan, sa and yag, whose grammaticalization pathways await further investigation.2 Examples of complementizer, cleft and stance constructions in Tibetan are shown in (12) to (15). In (12), the nominalizer suffix -pa marks a complement clause that precedes a verb of perception (‘know’). In (13), we see structurally a cleft-like construction; however, in Contemporary Tibetan, such constructions typically do not yield a strong focus interpretation. In (14) and (15), we see the nominalizer -pa in a mirative expression conveying surprise or counterexpectation. (12) Lhasa Tibetan nga bod-la phyin-pa kho-s shes=kyi. I Tibet-loc go-pst.nmlz he-erg know-ipfv/disjunct ‘He knows (that) I went to Tibet.’ (DeLancey 2003b:284) (13) Lhasa Tibetan kho-s las=ka ma-byed-pa red. he-erg work neg-do-nmlz pfv/disjunct ‘He did not do the work.’ (lit. ‘It is that he did not do the work.’)3

2. In relative clauses, Lhasa Tibetan also uses the nominalizers mkhan (for actor head nouns), sa (for locative/dative head nouns) and yag (for patient/instrument head nouns in non-perfective contexts). The nominalizer -pa is used for non-actor head nouns in perfective contexts. 3. We thank Zuzana Vokurkova for helpful discussions pertaining to the non-focus readings of these cleft-like constructions. Whether focus readings were available at an earlier period is unknown and deserves further investigation.

Nominalizers in East Asian and Tibeto-Burman languages

(14) Classical Tibetan de-s rtsig-pa zhig gi steng nas mchongs pa. that-erg wall a gen upper.surface abl leapt nmlz (=stance) ‘He jumped off a wall.’ (DeLancey 2003a:264) (15) Lhasa Tibetan kho-s kha=lags zas-byas phyin-pa red. he-erg meal ate-nfin go-pst pfv.disjunct4 ‘He ate and left.’ (DeLancey 2003b: 283) As shown in (13) and (15) above, the nominalizer -pa in cleft (or focus) constructions often appears with the particle red, which marks complex tense/aspect/evidentiality (or mood) information. It is worth noting that this intimate association between nominalizer, tense/aspect marker and speaker stance (or mood) particle is a frequently recurring phenomenon in Tibeto-Burman as well as in a number of other Asian languages (e.g. Korean). We will return to this topic in Sections 2.2.2 and 3.1 below.

2.2 East Asian perspectives 2.2.1 Japanese nominalizers The Japanese nominalizer no has long received extensive discussion in the literature. From a diachronic perspective, no was attested as a genitive and possessive pronominal as early as the eighth century in Kojiki, an anthology of old tales written during the Nara period. By contrast, uses of no as a nominalizer did not emerge until the sixteenth century, when complementizer, (pseudo-)cleft and headless relative clause constructions appeared around the same time (cf. Horie 1998a). Examples of these constructions in Modern Japanese are given in (16) to (20) respectively.5 (16) Modern Japanese Hanako no kimono Hanako gen kimono ‘Hanako’s kimono’

4. Delancey (2003b) treats -pa red as a TAM complex, without glossing -pa as a nominalizer (see also Agha 1993). The link between nominalizer and TAM marking has frequently been observed in the literature, though in-depth discussions are still lacking. 5. We thank Yumi Inoue for the Modern Japanese examples used in this section, unless otherwise specified.

315

316 Foong Ha Yap and Stephen Matthews

(17) Modern Japanese Hanako no Hanako gen ‘Hanako’s (one)’ (18) Modern Japanese Watashi wa Hanako ga heya ni hairu no o mita. I top Hanako nom room loc enter nmlz acc see.pst ‘I saw Hanako enter(ing) the room.’ (19) Modern Japanese Eziputo de mita no wa piramiddo desu. Egypt loc see.pst nmlz top pyramid cop.npst ‘What I saw in Egypt was the pyramids.’/‘It was pyramids that I saw in Egypt.’ (20) Modern Japanese Eziputo de katta no wa hontooni kirei desu. Egypt loc buy.pst nmlz top really pretty cop.npst ‘What/The thing I bought in Egypt is really pretty.’ While there is debate in the field over whether the complementizer, cleft and relative clause constructions are genitive in origin (cf. Horie 1998a; Nishi 2004), most accounts recognize a pivotal role for pronominal uses of no in the development of nominalizer/complementizer no. Another major function of no which has received extensive discussion is its use in sentence-final no desu constructions, typically to convey speaker assertion (Aoki 1986; Cook 1990), as illustrated in (21) below. In the absence of copula desu (or contracted da), a wider range of speaker moods is possible, with speaker prosody triggering the intended interpretation, which could include nuances of doubt or incredulity (cf. Yap et al. 2004). Generally speaking, researchers within the discourse (and conversation) analysis tradition have identified the usage of no in sentence-final position as an interactional (i.e. pragmatic) particle, deeply involved with speaker (inter-)subjectivity functions (Cook 1990; Iwasaki 1993; see also Shinzato 2005). (21) Modern Japanese Sonna fukanoo na koto iwanai no! such impossible assoc thing say.neg sfprt ‘Don’t say such an impossible thing!’ (Cook 1990: 422; gloss added) Recent studies have highlighted a diachronic (in addition to semantic-pragmatic) link between focus no constructions and sentence-final no constructions,

Lexical source unknown

Nominalizers in East Asian and Tibeto-Burman languages 317

Nominative no (available in some dialects Topic marker VP no dewa

Genitive no Pronominal no

Concessive subordinator no ni Nominalizer -no

Cause/reason subordinator no de Complementizer no Cleft no desu Stance no (desu)

Figure 3. Grammaticalization pathways of the Japanese nominalizer no (based on Horie 1998a; Yap et al. 2004)

both types sharing an orientation toward speaker stance (cf. Yap et al. 2004; Shinzato 2005). We thus have before us a highly polysemous morpheme which has extended from referential to nominalizer to evidential/perspectival functions, as shown in Figure 3. In Modern Japanese, nominalizer no also combines with other functional particles, such as de and ni, to form cause/reason and concessive subordinators, as seen in (22) and (23) respectively, and with dewa to form a topic marker, as seen in (24) (cf. Section 2.2.2 for crosslinguistic comparison with Korean; see also Horie, this volume). Worth noting is that the emergence of no as a nominalizer coincided with the demise of an older nominalization system which relied on rentaikei (adnominal) forms, both in Old Japanese and Middle Japanese (for detailed discussions, see Onoe 1982; Yanagida 1985; Nishi 2004; Shinzato 2005; Wrona 2005). (22) Modern Japanese undo shita no de taiju ga herakatta. exercise do.pst nmlz.prt(> conn) weight nom reduce.npst ‘Because I exercised, I lost weight.’ (23) Modern Japanese undo shita no ni taiju ga heranakatta. exercise do.pst nmlz.prt(> conn) weight nom reduce.neg.npst ‘Although I exercised, I did not lose weight.’ (24) Modern Japanese kare iku no dewa, arimasen. he go.npst nmlz.prt (> top) exist.neg.npst ‘As for his going (there), he did not do it.’ (lit. ‘As for his going (there), it did not happen.’)

318 Foong Ha Yap and Stephen Matthews

Lexical source unknown

Genitive nu Pronominal nu

Nominative case nu

Lexical source unknown

Nominalizer si

Complementizer si

Old rentaikei focus system New rentaikei focus system (kakari musubi constructions)

Cleft and stance constructions

Figure 4. Grammaticalization pathways of Okinawan genitive nu, nominalizer si and the new rentaikei system (based on Shinzato 2005)

On this point, evidence from Okinawan, a sister language of Japanese, is highly instructive. Shinzato (2005) shows that, in Okinawan, genitive nu has evolved nominative case marking functions. However, pronominal nu has not extended into nominalizer contexts to yield relativizer, complementizer nor cleft functions, arguably because Okinawan opted for a renewal of its old rentaikei system. An important point also noted in Shinzato (2005) is that the cleft-like kakari musubi constructions regularly encode speaker stance as well (see also Shinzato 1998; Serafim and Shinzato 2005). This is consistent with the robust cleft > stance development which we observe crosslinguistically. Figure 4 summarizes the grammaticalization pathways for Okinawan nu, si and the rentaikei system.6

2.2.2 Korean nominalizers Like Japanese, Korean also exhibits a pronominal > nominalizer > stance development, involving a complex -n kes pronominal construction (for a similar view, see, for example, Park 1999; Shin 2005; Rhee 2007). The etymology of kes is generally traced to a bound (or formal) noun meaning ‘thing.’ Its lexical sense still appears in a few fossilized forms such as ku-kes (‘that thing’) and na-uy kes (‘my thing, mine’), but otherwise kes is no longer used as a lexical morpheme. It is worth further noting that na-uy kes is often contracted to nay kes or nay kke, while most other possessive forms are used without genitive -uy and often appear in phonologically changed forms, such as John kke ‘John’s thing,’ ku salam kke ‘his (thing)’ and ku salamtul kke (‘theirs’).7 6. Shinzato (2005) notes that nominalizer si constructions replaced the older rentaikei system within the relative clause and complementizer domains, while a new rentaikei system was recruited for the cleft and stance constructions. 7. We thank Seongha Rhee for these examples of native speaker’s contracted forms, which highlight the pronominal extension of Korean kes.

Nominalizers in East Asian and Tibeto-Burman languages 319

It is generally agreed that the path from possessive pronominal to nominalizer is not attested in (Modern) Korean. This is true in the sense that we do not find evidence of genitive uy evolving pronominal and nominalizer functions. Instead, we see the language employing two separate adnominal or linking morphemes, namely -uy and -n, to distinguish between possessive modification and attributive modification, as shown in (25a–b) below.8 The latter type (involving -n/-nun/-ul adnominal linkage) is essentially an ordinary-type (or standard) headed relative clause construction. (25) Korean a. na-uy kes I-gen thing ‘my thing’/‘mine’ b. heyemchi-koiss-nun oli swim-prog-prs.adn duck ‘(the) duck which is swimming’ (Shin 2005: 25) While the genitive -uy has not extended much beyond the sense of ‘marker of agentive possessor’ in contexts like English John’s graduation, the attributive linker -n has evolved numerous nominalizer functions via semantic extension of pronominal kes (< ‘thing’), as illustrated in (26a–c) below (examples from Shin 2005: 25; romanization and glosses revised). (26) Korean a. heyemchi-nun oli swim-prs.adn duck ‘(the) swimming duck’ b. heyemchi-nun kes swim-prs.adn thing ‘(the) swimming thing’ (lit. ‘(the) thing that swims’) / ‘swimming’ c. heyemchi-nun kes-un caymiiss-ta. swim-prs.adn nmlz-top be.fun-decl ‘Swimming is fun.’ Example (26b) in particular emphasizes the very close relationship between relativization and nominalization in Korean. This phenomenon is, in fact, very 8. Kes is often bound not only with the attributive adnominal linker -n, but also with its allomorph -nun, the distinction being conditioned by tense-aspect-mood considerations. It is not clear to us whether the adnominal linker -l is in fact an allomorphic variant, since Old Korean also had an erstwhile nominalizer -l whose linkage to nominalizer -n is unknown. Whitman (1997) identifies -n as an anterior adnominal, -nun as a non-past adnominal and -(u)l as a posterior adnominal (cf. also Lee 1993; Rhee 2005).

320 Foong Ha Yap and Stephen Matthews

similar to that noted earlier in Tibeto-Burman languages (cf. Matisoff 1972; DeLancey 1986, 1999; Noonan 1997). Arguably, pronominal kes (also referred to as formal noun or defective noun kes) yields a headless relative clause construction which paves the way for the emergence of nominalizer -n kes.9 Constructions such as (26b) are thus bridge constructions, whose indeterminate readings between agent/patient nominal (e.g. ‘the swimming thing’) and gerundive-type noun complement (e.g. ‘swimming’ as an activity noun) are dependent on context for appropriate semantic-pragmatic interpretation. Rhee (personal communication) notes that the nominalizer -n kes became increasingly productive from Middle Korean onwards (fifteenth century) and, from the earliest available written records, was not restricted only to attributive or relative clause constructions (be these headed, headless or internally headed forms).10 As illustrated in examples (27), (28) and (29) from Modern Korean, -n kes appears in complementizer, focus (or cleft-like) and stance constructions as well (examples from Shin 2005: 71, 77, 80 respectively; romanization and glosses revised). (27) Korean Kunye-nun [wuli-ka cip kwuha-n kes]-ul a(l)-n-ta. she-top [we-nom house get-adn.pst nmlz]-acc know-prs-decl ‘She knows that we have rented a house.’ (28) Korean Ku-nun ecey [o-n kes]-i-a. he-top yesterday [come-adn.pst nmlz]-cop-decl ‘It was yesterday that he came.’ (29) Korean Nay-ka [mikwuk-ey ka-yaha-n-ta-nun kes]-i-a. I-nom [America-dir go-must-prs-decl-prs.adn nmlz]-cop-decl ‘(The fact is) I need to go to America.’ The emergence of -ul kes-ul as a concessive subordinator is noted in Horie (1998b:155), while recent developments of -n kes-i-ni and -n kes-i-nikka as cause/ reason subordinators are noted in Rhee (2002). In this regard, it is worth recalling that Japanese has also developed the cause/reason subordinators no ni and no de as well as the topic marker no dewa (cf. Figure 3 above). A common characteristic of these subordinators and topic markers, as noted earlier for Japanese (cf. Section 2.2.1), is the recruitment of a nominalizer in combination with additional, more specific functional particles. 9. On kes as a formal or defective noun, see e.g. Simpson and Wu (2001) and Shin (2005). 10. Nominalizer -n kes constructions began replacing the older -(u)m nominalizer constructions in earnest towards the end of the nineteenth century (Modern Korean).

Nominalizers in East Asian and Tibeto-Burman languages 321

Lexical source unknown

Adnominal -uy (possessive, i.e. genitive) Possessive pronominal -uy kes

Lexical source unknown

Adnominal -n (non-possessive, headed relative clause) Cause/reason subordinator -n kes-i-ni, -n kes-i-nikka, etc. Complementizer -n kes

Lexical noun kes ‘thing’

Nominalizer -n kes

Cleft -n ke(s) ia Stance -n ke(s) ia

Figure 5. Grammaticalization pathways of the Korean nominalizer -n kes (based on Park 1999; Shin 2005)

Figure 5 summarizes the relevant grammaticalization pathways for Korean -n kes. Note in particular the dissociation between the possessive adnominal (-uy) and the attributive adnominal (-n) discussed earlier. This dissociation allows us to clearly see the pivotal role of pronominal -n kes in the emergence of nominalizer -n kes, and ultimately in the development of cleft and focus -n ke(s) ia as well. Note also that ia is derived from a fusion of the copula i‑ and the sentence-final particle a. In this respect, the stance marker -n ke(s) ia shares a strong resemblance to the stance marker no desu constructions in Modern Japanese: in both languages we see evidence of a nominalizer-plus-copula complex. Recall that Chantyal also makes frequent use of cleft constructions, in which clauses nominalized by -wa are immediately accompanied by the copula hin ‘be’ (cf. example (9) in Section 2.1). This phenomenon may be a common characteristic of verb-final languages. Close examination of other verb-final languages will therefore be useful. At least two other Korean nominalizers, namely -(u)m and -ci, have also evolved stance marking functions (Park 1999; Rhee 2007). These nominalizers are also found in complementizer and focus constructions, but have not extended to headed relative clause contexts.11 Among the focus constructions are the sentence-final -m (ia) and -ci (*ia) constructions, an equative variant of the Korean -n ke(s) ia and Japanese no (desu) cleft/stance constructions in terms of pragmatic effect.12 Figure 6 highlights the polysemy networks of the nominalizers -(u)m 11. The Korean nominalizers -(u)m and -ci resemble Classical Chinese zhe in that they prefer headless clause environments. 12. It is worth pointing out here that cleft and stance constructions are not easily distinguishable in verb-final languages, such as Japanese and Korean.

322 Foong Ha Yap and Stephen Matthews

Lexical source unknown

Nominalizers -(u)m/-ci

Cause/reason subordinator (u)m-ulo-se Focus constructions …-m, …-m/…-ci, …-ci (both restricted usage in Modern Korean) Cleft -m ia/-ci (*ia) (restricted usage in Modern Korean) Stance -m (ia)/-ci (*ia) (restricted usage in Modern Korean)

Figure 6. Grammaticalization pathways of the Korean nominalizers -m and -ci (based on Rhee 2007)

and -ci.13 Note the parallels with -n kes as well as with nominalizers in other languages which have also evolved stance functions. It should further be noted that diachronic accounts of the development of these two older nominalizers are still sketchy and await further investigation. Rhee (this volume, citing Kim 1978; Ryu 1990; Kang 1993) suggests the possibility of a similar grammaticalization pathway for an even older nominalizer from Old Korean, namely -i. This nominalizer is arguably derived from a proximal demonstrative and is closely linked to the pronoun i, as well as to nominative -i and copula i‑ (cf. also Frellesvig 2001:n 31, citing Martin 1997). According to Rhee, the relationship between nominative and copula usage is one of parallelism, with nominative -i marking a nominal argument and copula i‑ marking a predicate argument (i.e. a nominalized clause).

2.2.3 Chinese nominalizers There is also some evidence from Chinese of a similar nominalizer > stance development. Here we focus on the development of nominalizer kai within the Chaozhou dialect (Southern Min) based on examples from Matthews and Xu (2002). The morpheme kai is highly versatile; it is used in classifier, possessive pronominal, cleft and sentence-final stance functions, as well as topic-marker and non-verbal copula. As seen in (30), classifier kai can assume a bound pronominal function, in this case tsi kai ‘this (one).’ 14 In the examples which follow, kai is used as a genitive (31), a possessive pronominal (32), a relative clause marker (33) and a nominalizer 13. The nominalizer -m is used in a wide range of contexts (including prohibitive ones); the nominalizer -ci is used in negative, retrospective and strong assertive contexts. 14. This is reminiscent of the Korean bound noun/pronominal kes discussed earlier in Section 2.2.2.

Nominalizers in East Asian and Tibeto-Burman languages 323

in a headless relative clause (34) (arguably the relative clause marker of (33) reinterpreted as the nominalizer of (34)). Note that this constitutes a classifier > genitive and pronominal > relative clause and nominalizer development. This pathway is common to other Chinese dialects such as Cantonese, where the classifier go3 grammaticalized into the genitive and the possessive pronominal ge3 (note the phonological reduction), and then further into the nominalizer and relative clause marker ge3 as well. (See Figures 7 and 8 for a summary of the grammaticalization of Chaozhou kai and of Cantonese ge3 respectively). It is worth noting that this classifier > genitive and pronominal > relative clause and nominalizer pathway closely parallels the general noun (e.g. person, place, thing) > nominalizer > relative clause marker development in Tibeto-Burman languages (e.g. LaPolla 2003). (30) Chaozhou ̃lɯ tã i ai tsi kai me? 2sg say 3sg like this clf q ‘Do you think she will like this (one)?’ (31) Chaozhou i kai haʔseng 3sg poss student ‘his/her student’ (32) Chaozhou tsia kai si ua kai this clf cop 1sg clf ‘this (one) is mine’ (33) Chaozhou lai tsio p’aʔkang kai naŋ loŋ si ɡuaseñ-kiã. come here work rel people all cop outside.province-suf ‘Those people who came here to work are all from other provinces.’ (34) Chaozhou lai tsio p’aʔkang kai loŋ si ɡuaseñ-kiã. come here work rel all cop outside.province-suf ‘Those who came here to work are all from other provinces.’ In (35), we see kai used in a subordinate clause construction.15 In such structures, kai combines with ue (‘word, speech’) to form a conditional subordinator 15. I owe this example to Huiling Xu (personal communication). See also Jiang (2004) for a discussion of a similar phenomenon involving the Mandarin Chinese conditional subordinator de hua.

324 Foong Ha Yap and Stephen Matthews

(‘if, supposing that, say that’). Notice the structural similarity between subordinate clause [VP kai ue] and relative clause [VP kai N]. Note in particular that the two constructions involve nominalizer kai and that both of them serve specificational or modificational functions, the relative clause construction being used for nominal modification and the subordinate clause construction for clausal modification (cf. also Xu and Matthews, forthcoming). This suggests a close relationship between nominalization, relativization and subordination. (35) Chaozhou (na si) lok hou kai ue tsu mai k’ɯ. (cond fall rain cond (< nmlz + saying) then not.want go ‘If it rains, then don’t go.’ In (36) we see kai used in two types of cleft constructions, kai ... kai and si ... kai, which are equivalent to Mandarin Chinese shi ... de and Cantonese hai2 ... ge3. These cleft constructions rely on a copula (either kai or si in Chaozhou, shi in Mandarin and hai2 in Cantonese) to introduce and highlight a discourse-prominent entity, which often takes the form of a nominalized clause (marked by the nominalizers kai in Chaozhou, de in Mandarin and ge3 in Cantonese). In (36) we see the first kai being used as a copula and the second kai as a nominalizer.16 In turn, in (37) kai in sentence-final position serves as a nominalizing particle as well as a stance (or speaker perspective) marker. (36) Chaozhou Kai ua kio i k’ɯ kai, m si i kio i k’ɯ kai. cop 1sg call 3sg go nmlz neg cop 3sg call 3sg go nmlz ‘It was me who asked her to go, not him who asked her to go.’ (37) Chaozhou si/kai oi t’iã kai. cop able hurt sfprt ‘It is painful.’ Chaozhou kai also provides clear evidence of a close relationship between topic marking and non-verbal copula usage. In (38) kai is used as a topic marker.17 In 16. As noted in Matthews and Xu (2002), copula kai is a pronominal (i.e. non-verbal) copula (cf. Stassen 1997), whereas si is a fully grammaticalized verbal copula, given that kai does not pass the Chinese ‘V-not-V’ test for verbhood (i.e. *kai m kai, but si m si ‘is or is not’). For example, when asking the question ‘Is s/he your child?’ in Chaozhou, we cannot say * i kai m kai lɯ ̃ kai kia, whereas i si m si lɯ ̃ kai kia is grammatical. That kai is a copula is evidenced by its availability in constructions such as i kai ua kai kia ‘s/he is my child.’ 17. Matthews and Xu (2002) glossed kai as a particle in this particular example. Here we enter-

Nominalizers in East Asian and Tibeto-Burman languages 325

(39a), with the copula si present, kai is also more likely to be interpreted as a topic marker. However, in the absence of the copula si, as in (39b), a bridging context (Evans and Wilkins 2000) emerges in which kai can be interpreted as either a topic marker or a non-verbal copula, though more likely the latter.18 Finally, in (40), with temporal adverbial ĩ-tsaĩ (‘before’) intervening between kai and the preceding noun phrase tsi kai kaʔsiu (‘this professor’), kai is more naturally interpreted as a copula. Matthews and Xu (2002) identify non-verbal copula kai in Chaozhou with Stassen’s (1997) pro-copulas (or pronominal copulas), an analysis which is highly consistent with the classifier and pronominal origin of kai. (38) Chaozhou no nang kai [IP long si siau lauhou]. two person top [IP all cop small tiger ‘They are both real little tigers.’ (i.e. very brave) (lit. ‘These two boys, all/ completely are little tigers.’) (39) Chaozhou a. i kai si lausɯ, m si haʔseng. 3sg top cop teacher neg cop student ‘He is a teacher, not a student.’ b. i kai lausɯ, m si haʔseng. 3sg top/cop teacher neg cop student ‘He is a teacher, not a student.’ (40) Chaozhou [topic tsi kai kaʔsiu] [comment ĩ-tsaĩ kai i kai haʔseng]. before cop 3sg poss student [topic this clf professor [CP ‘This professor used to be his student.’ Figure 7 provides a sketch of the grammaticalization pathway of Chaozhou kai.19 What is additionally interesting about the Chaozhou data is that the topic tain the possibility that kai could be more specifically interpreted as a topic marker. Example (38) would then serve as an intermediate construction between a topic marking interpretation and a non-verbal copula usage. 18. See Evans and Wilkins (2000) for a more detailed discussion of bridging contexts and in particular also Heine (2003), especially for extended discussion on switch contexts. Both contexts serve as intermediate stages along the pathways of grammaticalization. In essence, a lexical/grammatical category with function A may extend its usage into more contexts such that we sometimes see ambiguous interpretations A/B (referred to in the literature as a potential bridge context). In cases where we later obtain clear cases in which only function B is acceptable, this final stage constitutes the switch context. 19. Dotted lines indicate either unknown source or influence from other grammatical constructions.

326 Foong Ha Yap and Stephen Matthews Lexical source unknown

Classifier kai

Topic marker kai

Genitive kai Pronominal kai

Non-verbal copula kai

Conditional subordinator kai ue

Nominalizer kai

Relativizer kai Cleft kai/si … kai Stance kai

Figure 7. Grammaticalization pathways of the Chaozhou nominalizer kai (based on Matthews and Xu 2002)

marker kai and the non-verbal copula kai appear to have evolved from the classifier/pronominal pathway, unlike the case for topic markers such as [CP[IP gong2dou3 NP] ge3waa2] (‘speaking of NP’) in Cantonese (cf. Figure 8), which appears to have evolved under the influence of the nominalizer constructions, much like the headed relative clause. Recognition of an earlier pronominal and later nominalizer pathways as related yet independent pathways may provide a clue to the dissociative relationship between nominalization and copula verb formation. In other words, as highlighted in Figures 7 and 8, both pronominal and nominalizer constructions can contribute to the rise of topic markers; however, the pronominal construction appears to more readily evolve non-verbal (i.e. pronominal) copula usage. Here it is worth recalling from Figure 3 (Section 2.2.1 above) that Modern Japanese also has a no dewa topic marker (e.g. kare iku no dewa, arimasen ‘as for his going (there), he did not do it;’ lit. ‘it did not happen’), which is evidently a topic marker of nominalizer origin. The pronominal > nominalizer > stance development appears to be very robust among the Chinese dialects; indeed, this phenomenon may even be pan-Chinese. Evidence from Classical and Contemporary Chinese (Lu 1943; Wang 1980; Li and Thompson 1981; Jiang 1999; Shi and Li 2002; Yap et al. 2008) suggests a very similar grammaticalization trajectory for the nominalizer di/de in Mandarin Chinese (cf. Figure 9). A language-specific difference worth noting is the relatively late emergence of Mandarin genitive di/de. Diachronically, we see that both Archaic and Middle Chinese for a long time relied on genitive zhi,20 and it was not until the South20. The earliest known Chinese dynasty was the Xia dynasty (twenty-first to sixteenth centuries BC), though the earliest available written records came from bone oracles from the Shang period (sixteenth to eleventh centuries bc). Old Chinese, based on extant inscriptions and writings, covers at least the Shang dynasty and continued into the Western Han period (206 BC to 24 AD). Middle Chinese emerged during the Easatern Han period (25 AD to 220 ad) and contin-

Nominalizers in East Asian and Tibeto-Burman languages 327

Lexical source unknown

Topic marker [gong2dou3 NP] ge3waa2 Conditional subordinator ge3waa2

Classifier go3

Genitive ge3 Pronominal ge3

Nominalizer ge3

Relativizer ge3 + N (headed) Cleft hai6 … ge3 Stance -ge3 (+SFP)

Figure 8. Grammaticalization pathways of the Cantonese nominalizer ge3 (based on Yap 2007)

Influence of zhi and zhe

Lexical noun di (‘foundation, base, bottom’)

Genitive di/de Pronominal di

Conditional subordinator (ruguo) … de hua Relativizer di/de + N (headed)

Nominalizer di/de

Cleft shi … de Stance de (+SFP)

Figure 9. Grammaticalization pathways of the Mandarin nominalizer de (based on Jiang 1999; Shi and Li 2002; Yap et al. 2008)

ern Song period (eleventh to twelfth centuries AD) that genitive di/de gradually came to increasingly compete with genitive zhi (Jiang 1999; Shi and Li 2002). A point worth highlighting here is that the pronominal function (independently of the genitive one) can give rise to the development of nominalizers which further grammaticalize into stance markers.

2.2.4 Summary We have seen that a number of nominalizers in East Asian languages have developed stance functions, which indicates that the path from referential to evidential/mood marking is a fairly robust development crosslinguistically. Also evident ued through the golden age of the Tang dynasty (618 to 907 AD), then giving way to Early Modern Chinese during the Song period (960 to 1279 AD). See Norman (1988:xi), among others, for chronological details.

328 Foong Ha Yap and Stephen Matthews

is that this grammaticalization process is recursive, in the sense of being cyclical, with new morphemes often recruited to assist, and often eventually replace, old nominalizers. A pivotal intermediate step appears to be the availability of pronominal usage for each potential nominalizer candidate. Once a nominalizer, if productive, the morpheme can generate clausal constructions which contribute to referent-identification (e.g. agent/patient nominals, gerundive or infinitival type noun complements, relative clause constructions) and event/proposition perspectivization (e.g. focus, cleft or stance constructions, subordinate clause constructions). Several related yet independent pathways were also identified. One pathway gave rise to topic markers and case markers closely affiliated with referent (including semantic role) identification. The other pathway gave rise to nominalizers, and with them often the wide range of nominalized constructions discussed in this paper. Another frequent bifurcation was found within the nominalizer pathway itself. One involved the evidentiality/modality functions (focus, cleft and stance marking), while the other involved subordinators and topic markers performing focalizing or mental space building functions (cf. Fauconnier 1985). The former set of nominalized constructions is often further realized as sentence-final particles. The latter set is also frequently clause-final, but appears at the end of dependent and non-terminative clauses, frequently functioning as situational specifiers for upcoming events in the matrix clause. Figure 10 summarizes these grammaticalization pathways which are frequently observed among nominalizers in East Asian languages.

Nominative case marker Topic marker Non-verbal copula verb Lexical source (often unknown)

Genitive Pronominal

Concessive subordinator Conditional subordinator Cause/reason subordinator Relative clause (headed)

Nominalizer

Complementizer Cleft construction Stance construction

Figure 10. Some frequent grammaticalization pathways of nominalizers in East Asian languages

Nominalizers in East Asian and Tibeto-Burman languages 329

3. Issues related to the grammaticalization of East Asian and Tibeto-Burman nominalizers We have thus far seen clear evidence of a robust nominalizer-to-stance development among East Asian and Tibeto-Burman languages. A number of other frequently recurring issues deserve further consideration, among them the following: (i) the interaction between nominalization, tense/aspect and evidentiality/mood interpretation (cf. Section 3.1); (ii) the relationship between nominalization and copula usage (cf. Section 3.2); and (iii) the investigation of forgotten nominalizers (cf. Section 3.3).

3.1 The relationship between nominalization, tense/aspect and evidentiality/ mood An important recurring theme related to nominalization phenomena in a number of East Asian and Tibeto-Burman languages is the frequent interaction between nominalization, tense/aspect and evidentiality/mood interpretation. Recall Noonan’s (1997) identification of nominalization constructions involving periphrastic verbs in Chantyal, where we find many -wa nominalized clauses accompanied by the copula verb hin (‘be’) to form cleft-like focus constructions (cf. Section 2.1 above). Similarly, Japanese and Korean also use nominalizer-plus-copula verb constructions to form focus and stance constructions, i.e. via n(o) desu and -n ke(s) ia constructions respectively (cf. Sections 2.2.1 and 2.2.2). Chinese dialects frequently use discontinuous-type ‘copula … nominalizer’ constructions to form cleft-like focus constructions. Examples include Mandarin shi … de, Cantonese hai2 ... ge3 and Chaozhou si … kai and kai … kai constructions (cf. Section 2.2.3). Thus far, the link between nominalization and focus/stance has been clearly established in this paper and elsewhere (e.g. Yap et al. 2004; Shinzato 2005; see also the papers in Yap and Wrona, forthcoming). Much less well understood, however, is the link between nominalization and tense/aspect. Scholars working on Classical and Lhasa Tibetan have examined to some extent the intimate interaction between nominalizers and complex perfective/imperfective suffixal elements which give rise to conjunctive/disjunctive perspectives closely intertwined with the illocutionary intent of speech act participants (e.g. Agha 1993; DeLancey 2003a, 2003b). This line of research offers a fresh perspective on how researchers working on other East Asian languages might proceed with their investigation of the intimate link between nominalization and tense/aspect marking. Potentially relevant for closer scrutiny is the relationship between anterior adnominal -n and nominalizer kes in Korean. Also worth further investigation is the question of whether adnominal -n may be linked to the Old Korean

330 Foong Ha Yap and Stephen Matthews

nominalizer -n often discussed in the literature (cf. Rhee, this volume). If such a link could be established, it would be possible to strongly posit that a realis interpretation frequently associated with nominalization may have given rise to anterior/past readings for adnominal -n. Once a tense/aspect reading becomes associated with -n, other aspectual associations could also emerge in the language, among them the non-past reading with -(nu)n and the prospective reading with -(u)l. If this account holds for Korean, it would be worth further investigating the relationship between erstwhile nominalizers and tense/aspect functions in other neighbouring languages as well, in particular Manchu and Mongolian, both of which are known to use -n as a nominalizer. It is also noteworthy that Korean scholars have also identified -l as a nominalizer in Old Korean (as well as in Mongolian, among other languages). That -(u)l also participates in the tense-aspect system in adnominal constructions (which, as we have seen, are strongly linked to nominalization) clearly prompts some serious investigation. Understanding the relationship between tense/aspect and nominalization is clearly important because even slight variations in their interactions often yield meaningful subtle differences in evidential/mood interpretation. Little is currently known about the link between nominalization and the use of adnominals as tense/ aspect markers in Japanese or about the implications of their relationship for evidentiality/mood interpretation.21 Even less is known of this type of relationship from extant literature on Chinese nominalization. Any small step in understanding how subtle variations in mood may have emerged from interactions between tense/aspect and nominalization in languages such as Tibetan, Korean, Manchu and Mongolian, among others, is bound to contribute to a better understanding of possibly covert interactions between tense/aspect and nominalization in tenseless languages such as Chinese.

3.2 The relationship between nominalization and copula usage Another frequently recurring issue across the languages which we have examined in Section 2 above is the relationship between nominalization and copula usage. There appear to be at least three functional domains in which copulas and nominalizers are intimately linked together. As noted earlier in the case of Chaozhou, demonstrative pronominals can be used as topic markers, which, in turn, can give rise to pronominal (or non-verbal) copulas. In such cases, nominalizers may share a common source morpheme with the copula, though there is no direct, patrifilial relationship between them.22 21. But see Frellesvig (2001) for a recent discussion. 22. In Old Chinese, shi was used as a demonstrative (Li and Thompson 1977: 424–425). Since

Nominalizers in East Asian and Tibeto-Burman languages 331

Another functional domain in which copula usage may sometimes be linked to nominalization is that of adnominal constructions. As discussed in Frellesvig (2001), in Old Japanese, no was used as the adnominal form of the copula, conveying the sense ‘which is.’23 Frellesvig posits that copula no gave rise to genitive no, through the use of [NP1 no NP2] constructions, initially with the meaning ‘NP2 which is NP1’ and then later with the meaning ‘NP2 of NP1’ and ‘NP1’s NP2.’24 In time, genitive no further developed a highly productive nominalizing function (as discussed earlier in Section 2.2.1). Recall that this new nominalizer no further combined with a new copula form desu to create new focus and stance constructions (i.e. the contemporary no desu type). We see therefore a pattern emerging where adnominal copulas, genitives, nominalizers and other grammatical categories can intimately be linked to each other. Frellesvig (2001) also posits a close connection between the copula and the adnominal system in Korean. Following Martin (1997), he suggests that the Modern Korean copula i‑ could have developed from the Old Korean demonstrative -i (2001:n 31).25 Like Lehmann (1995: 26–27), Frellesvig (2001: 21–23) claims that adnominal copulas often emerge when anaphoric demonstratives are recruited to reinforce already existing copula verbs. As Rhee notes in this volume, Korean scholars (cf. Kim 1978; Ryu 1990; Kang 1993) have also claimed a proximal demonstrative origin for nominalizer -i. Taken together, these two views suggest a possible link between copula i‑ and nominalizer -i, possibly along the lines of the development demonstrative pronoun > copula adnominal > nominalizer, in tandem with that of demonstrative pronoun > nominative noted earlier in Section 2.2.2. Much more transparent is the relationship between nominalizer and copula in the formation of cleft constructions in contemporary languages. In Chantyal, for example, we see evidence of nominalizer -wa + hin (‘be’) verb constructions, which Noonan (1997) refers to as periphrastic verb constructions (cf. SecMiddle Chinese, shi has functioned as a copula. Hashimoto (1969: 88) identified a fascinating illogical copula construction, which we here suggest is a topic marking construction. Note that a topic marker > copula verb development has been observed for Chaozhou kai (cf. Section 2.2.3), and crosslinguistic evidence (Heine and Kuteva 2002: 95–96) shows that a copula > focus development is quite common in the languages of the world. In the case of shi, there may have been a topic > copula > focus development. A focus interpretation could also naturally arise from contrastive topics. 23. Interestingly, this adnominal usage leaves the possibility of a pronominal origin open. 24. Note that this analysis becomes more transparent if we bear in mind that Japanese is SOV and head-final, unlike English, which is SVO and head-initial. 25. See Pustet (2003: 54–61) for crosslinguistic evidence of the origin of copulas, including pronoun > pronominal copula > verbal copula > copular affix.

332 Foong Ha Yap and Stephen Matthews

tion 2.1 above). We have suggested that structures of this kind are a type of focus construction, albeit one which has come to saturate the language such that it does not always yield strong affective readings and other more salient focus constructions may sometimes be needed to achieve marked affect. Such saturation is not uncommon; in fact, besides Tibeto-Burman languages, such as Chantyal, other languages, such as Korean (with its near-ubiquitous sentence-final -n ke(s) ia constructions), Japanese (with its own near-ubiquitous sentence-final no desu structures) and Mandarin (also with its own near-ubiquitous sentence-final de constructions) likewise show saturation of stance structures which could be traced to cleft-constructions. These, as we have discussed earlier in this paper, are manifestations of nominalization constructions. While all cleft structures are focus constructions by definition, not all stance structures retain strong focus effects.26 Of particular relevance for our current discussion is that many of the focus/ stance constructions we describe here are nominalizer-plus-copula verb combinations. This is the case with Korean -n ke(s) ia constructions (< nominalizer -n ke(s) + copula i‑ + sentence-final particle -a) and with Japanese no desu constructions (< nominalizer no + copula desu). This is also true with the discontinuous Mandarin shi … de focus structures (where shi is the copula), though not with the sentence-final stance de constructions, which appear without the copula. Elision of the copula (shi) may have paved the way for sentence-final de constructions to be in a position to either express speaker perspective emphatically (often with the aid of other more specific sentence-final particles, in addition to marked prosody in spoken discourse) or to express speaker perspective more subtly, i.e. without overt indication (as specified by the addition of sentence-final particles) but simply by intimation. Crucially, what we often see is a strong presence of nominalizer-plus-copula verb pattern among focus constructions in these East Asian languages. Also worth noting is the transition from the rentaikei nominalization system in Old Japanese into the new periphrastic no (desu) nominalization system in Modern Japanese. While it has often been noted that the latter clearly involves use of the copula (desu), recent attention is now also being drawn to a possible relationship between the rentaikei nominalization system and copula forms (cf. Frellesvig 2001; Wrona 2005). A similar situation appears to hold also with Okinawan. Recall that the old rentaikei system in Old Okinawan (a sister language of Old Japan26. Horie (personal communication), expanding on Iwasaki (2000: 243), further emphasizes that conflation of the adnominal (rentaikei) and conclusive (shushikei) verb forms also gave rise to a strong tendency in Modern Japanese to use nominalized constructions in order to convey indirectness and non-assertion, which Horie argues reflects a cultural tendency among the Japanese to favour conflict avoidance.

Nominalizers in East Asian and Tibeto-Burman languages 333

ese which is believed to have branched off from Proto-Japanese around the sixth century)27 also gave way to a newer rentaikei system which was periphrastically reinforced by the use of copula verb yoru, as seen in (41) below, which shows both the old and the new rentaikei forms (furu and furi-yoru). Little has been said about the possibility of a copula link with the nominalization system in Old Okinawan. However, given that the nominalization system in Old Okinawan parallels that of Old Japanese, the same possibility emerges for a strong link between adnominal copula and nominalizer in Old Okinawan as well. (41) Old Okinawan Shiyori furu ame ya sudemidu do furi-yoru. Shuri fall rain top purified.water kprt fall ‘Speaking of the rain falling in Shuri, it is purified water that falls.’ (OS 370; Shinzato 2005: 4–5) Korean scholars have also long recognized that the older nominalizers -um and -ci are being overtaken by the more contemporary -n ke(s) ia constructions in focus/stance contexts (see Rhee 2007 for a summary). Little is currently known about whether copula usage is linked with the -um and -ci nominalizer constructions; future research will need to investigate this issue. A concurrent possible agenda is to examine the extent to which other neighbouring languages exhibit similar affinities between nominalization and copula usage.

3.3 Forgotten or residual nominalizers The preceding discussions bring us naturally to another significant topic in this paper, namely the issue of forgotten nominalizers. In looking at a wide array of nominalizers across East Asian and Tibeto-Burman languages, diachronically, crosslinguistically and even cross-dialectally, we see recurring intimations of morphemes which might in fact be nominalizers. For both Old Korean and Old Japanese, forgotten nominalizers are likely to lurk among the adnominal constructions. Potential candidates include -ki and possibly -shi in Old Japanese. Consider (42) and (43) below. In (42), note that -ki is identified as an adjectival adnominal and occupies a syntactic position within a headed relative clause very much like the adnominal morphemes -n and -l discussed earlier with respect to Korean, and also quite similar to nominalizers in relative constructions in Tibeto-Burman and Chinese discussed in Sections 2.1 and 2.2.3 respectively. In (43), in turn, we see both -ki and -shi occurring in structures very similar to the cleft focus constructions in other languages, many of which clearly implicate a pivotal role for nomin27. Here, Shinzato (2005) cites Hattori (1976) and Hokama (1977).

334 Foong Ha Yap and Stephen Matthews

alizers. The likelihood that -ki and -shi could have been nominalizers in Old Japanese is quite high. What is also interesting to investigate is the possibility that these potential forgotten nominalizers may have traversed a similar grammaticalization trajectory as the nominalizers already identified in this paper. (42) Old Japanese yononaka-pa munasi-ki mono to aramu. this.world-top empty-adj.adn thing cop exist.conjc ‘This world is empty.’ (MYS 3.442; Frellesvig 2001: 4) (43) Old Japanese minato no ashi no uraha wo tare ka tawori-shi (ki) estuary gen rush gen tip acc who kprt cut-pst […] ware zo tawori-shi (ki). I kprt cut-pst ‘Who is it that cut the tip of the rushes at the estuary? […] It is I who cut them.’ (MYS 7.1288; Shinzato 2005: 5) It is also worth noting that -ki and -shi in Old Japanese, as seen in (43) above, were associated with tense/aspect marking, more specifically, with perfect(ive)/ past readings (i.e. essentially involved as markers of anteriority). We have already briefly discussed the frequent and intimate link between nominalization and tense/aspect marking (cf. Section 3.1), but the issue is worth revisiting here, since such interactions may serve as probes in the search for forgotten nominalizers. In this regard, it is interesting to contemplate the possibility that tense/aspect markers are sometimes relic nominalizers, such that reinforcement from and suffixation by newer nominalizers gives rise to tense-plus-nominalizer complexes. Noonan’s observation (this volume; Section 4) of sequential converbal suffix *si in Bodic languages combining with nominalizer *pa readily comes to mind. In Chantyal, this phenomenon takes the form of -si-wa constructions, while in Korean comparable structures would be the -n kes constructions. In Old Japanese, Wrona (2005) has suggested that some answers lie in the adnominal (rentaikei) constructions. Potential forgotten nominalizers in Classical Chinese include the ubiquitous morpheme ye. As noted in Yue (1998), ye often appears in clause/sentence-final position as a stance particle and is also often used postnominally as a topic marker. As seen in (44) below, when it appears in a non-terminative clause, i.e. shi ke ren ye ‘if that can be tolerated,’ ye contributes to an implicit conditional subordinator reading, much as the nominalizers zhi and zhe sometimes did in pre-Qin Chinese (cf. He 1988; Huang and Yap 2006; Yap et al. 2008). While the precise type of

Nominalizers in East Asian and Tibeto-Burman languages 335

subordination (causal, conditional, concessive, etc.) is not inherently encoded in the meaning of ye (nor in that of zhi and zhe), a dependency relation to the subsequent clause readily emerges. Worth noting is that such a dependency is often linked to notions of non-finiteness in tensed languages (cf. Haspelmath 1995). That ye contributes to the formation of dependent (possibly non-finite) clauses suggests that it may be intimately tied to nominalization. Future diachronic work could investigate the extent to which a nominalizer > stance development holds for this highly ubiquitous and versatile morpheme ye.28 (44) Classical Chinese Kongzi wei Ji shi ba yi wu yu ting Confucius said Ji person eight team dance in court.garden shi ke ren ye shu bu ke ren ye. pron can tolerate prt (
4. Conclusion Although it is not always possible to trace all the pathways for erstwhile nominalizers, we have shown in this paper that certain pathways for the grammaticalization of nominalizers are highly robust. These pathways may provide some typological road maps to guide us to potential mining strikes as we trace further back in time and as we try to understand newly emerging constructions. Ultimately, this whole exercise is meant to provide us with a means to better understand how our mind recruits nominalization phenomena to package our thoughts and viewpoints as arguments and stances within each unfolding discourse with fellow interlocutors. Four other papers from the New Reflections on Grammaticalization 3 workshop on The Grammaticalization of Nominalizers: East Asian Perspectives appear 28. Yue (2004) recently identified ye as a focus marker in Old Chinese. Whether this analysis precludes a nominalizer role for ye is an issue still requiring further investigation. 29. In this example, Confucius was expressing disapproval of the conduct of one Mr Ji, who contravened the rules of propriety in allowing eight teams (instead of just four) to dance at the ceremony which he was officiating, when social norms reserved the privilege of officiating an event of this size only for the king. The Analects is a compilation of the teachings of Confucius and is translated in many versions.

336 Foong Ha Yap and Stephen Matthews

in this volume to further elaborate on the themes outlined in this paper. One is Michael Noonan’s work on Bodic languages, expanding on his earlier and highly insightful work on Chantyal nominalization, which has inspired this series of investigations. Another is Andrew Simpson’s tightly-knitted and well-argued prosodic and morphosyntactic analysis of grammaticalization phenomena involving nominalizers from Colloquial Burmese and Literary Burmese. A third contribution is Seongha Rhee’s comprehensive discussion on the rise and fall of nominalizers in the Korean language (all the way from Old Korean through Middle to Modern Korean). Finally, Kaoru Horie offers a detailed contrastive analysis of nominalization constructions in two closely related languages, Japanese and Korean, in particular noting differential degrees of grammaticalization between the contemporary nominalizers of both languages, with Modern Japanese nominalizers instantiating in general “a higher degree of polyfunctionality” than their Modern Korean counterparts. Horie’s contrastive analysis captures an interesting phenomenon in which we witness differential rates of language change across two closely related civilizations, suggesting perhaps that Japanese may have been shifting from an older language paradigm to a newer one more quickly than Korean. Further work on how these differential rates and degrees of language change might impact sociolinguistic behaviour is certainly needed. Comparative work across more languages from a grammaticalization perspective can contribute not only to a better understanding of language universals; it will also help us to understand better where, when and why we sometimes differ socioculturally from others.30

Abbreviations 1 2 3 abl acc

first person second person third person ablative accusative

adj adn ant assoc clf

adjective adnominal anterior associative classifier

30. Renewed interest in nominalization phenomena has given rise to a series of extensive discussions, both in workshops and in upcoming special issues and edited volumes. Prominent among these is the Workshop on Nominalization and its Discontents at La Trobe University (see Coupe 2006, 2007; LaPolla 2006; Morey 2006; van Breugel 2006, among others). A workshop on Nominalizers and Copulas in East Asian and Neighboring Languages was also recently held at the Chinese University of Hong Kong, January 8–11 2007 (see Lichtenberk, forthcoming; Liu and Gu 2007; Liu et al. 2007; Sio 2007; Tang 2007; Noonan, forthcoming; Wrona, forthcoming; Xu and Matthews, forthcoming, among others). Other recent works on nominalization in languages within the East Asian and Tibeto-Burman region include Herring (1991), Bickel (1999) and Lahaussois (2003).

Nominalizers in East Asian and Tibeto-Burman languages 337

cond conjc conn cop dat decl dir disjunct erg fut gen ipfv kprt loc neg nfin nmlz

conditional conjectural connective copula dative declarative directional disjunctive mood ergative future genitive imperfective kakari (focus) particle locative negation non-finite nominalizer

nom npst pfv pl poss prog pron prs prt pst q rel sfprt sg suf top

nominative non-past perfective plural possessive progressive pronoun present particle past question marker relative sentence-final particle singular suffix topic

References Agha, Asif. 1993. Structural form and utterance context in Lhasa Tibetan: Grammar and indexicality in a non-configurational language. Frankfurt: Peter Lang. Aoki, Haruo. 1986. Evidentials in Japanese. In Wallace Chafe & Johanna Nichols (eds.), Evidentiality: The linguistic coding of epistemology, 223–238. Norwood NJ: Ablex. Bickel, Balthasar. 1999. Nominalization and focus constructions in some Kiranti languages. In Yogendra P. Yadava & Warren W. Glover (eds.), Topics in Nepalese linguistics, 271–296. Kathmandu: Royal Nepal Academy. Cook, Haruko Minegishi. 1990. An indexical account of the Japanese sentence-final particle no. Discourse Processes 13: 401–439. Coupe, Alec R. 2006. Nominalization, relativization and genitivization in Mongsen Ao and Chang. Paper presented at the Workshop on Nominalization and its Discontents. La Trobe University, August 2006. Coupe, Alec R. 2007. A grammar of Mongsen Ao [Mouton Grammar Library 39]. Berlin: Mouton de Gruyter. DeLancey, Scott. 1986. Relativization as nominalization in Tibetan and Newari. Paper presented at the 19th International Conference on Sino-Tibetan Languages and Linguistics. Ohio State University, Columbus, September 1986. DeLancey, Scott. 1999. Relativization in Tibetan. In Yogendra P. Yadava & Warren W. Glover (eds.), Topics in Nepalese linguistics, 231–249. Kathmandu: Royal Nepal Academy. DeLancey, Scott. 2003a. Classical Tibetan. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 255–269. London: Routledge. DeLancey, Scott. 2003b. Lhasa Tibetan. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 270–288. London: Routledge. Evans, Nicholas & David Wilkins. 2000. In the mind’s ear: The semantic extensions of perception verbs in Australian languages. Language 76(3): 546–592.

338 Foong Ha Yap and Stephen Matthews

Fauconnier, Gilles. 1985. Mental spaces: Aspects of meaning construction in natural language. Cambridge MA: The MIT Press. Frellesvig, Bjarke. 2001. A common Korean and Japanese copula. Journal of East Asian Linguistics 10: 1–35. Glover, Warren W. 1974. Sememic and grammatical structures in Gurung [Summer Institute of Linguistics Publications in Linguistics and Related Fields 49]. Norman OK: Summer Institute of Linguistics of the University of Oklahoma. Hashimoto, Anne Yue. 1969. The verb ‘to be’ in Modern Chinese. In John W. M. Verhaar (ed.), The verb ‘be’ and its synonyms, Vol. 9, 72–111. Dordrecht: Reidel. Haspelmath, Martin. 1995. The converb as a cross-linguistically valid category. In Martin Haspelmath & Ekkehard König (eds.), Converbs in cross-linguistic perspective: Structure and meaning of adverbial verb forms - adverbial participles, gerunds [Empirical Approaches to Language Typology 13], 1–55. Berlin: Mouton de Gruyter. Hattori, Shiro. 1976. Ryūkyū hōgen to hondo hōgen (The Ryukyu dialects and the mainland dialects). In Ifa Fuyū Seitan Hyakunen Kinenkai (eds.), Okinawagaku no reimei: Ifa Fuyū seitan hyakunen kinenshi (The dawning of Okinawan studies: A volume commemorating the hundredth anniversary of the birth of Ifa Fuyu). Tokyo: Okinawa Bunka Kyōkai. He, Jin-song. 1988. Zhuyu weiyu zhijian zhi de ciyi ji qi yuanliu (The origin and semantics of zhi between subject and predicate). Zhong Nan Minzu Xueyuan Xuebao: Zhe She Ban (Wuhan) (Journal of the South Central University for Nationalities (Philosophy and Social Sciences Edition)) 6: 106–117. Heine, Bernd. 2003. On the role of context in grammaticalization. In Ilse Wischer & Gabriele Diewald (eds.), New reflections on grammaticalization [Typological Studies in Language 49], 83–101. Amsterdam: John Benjamins. Heine, Bernd & Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge: CUP. Herring, Susan C. 1991. Nominalization, relativization, and attribution in Lotha, Angami, and Burmese. Linguistics of the Tibeto-Burman Area 14(1): 55–72. Hokama, Shuzen. 1977. Okinawa no Gengo to Rekishi (The language and history of Okinawa). Iwanami Kōza 11 (Iwanami Lecture Series 11). Tokyo: Iwanami Shoten. Horie, Kaoru. 1993. From zero to overt nominalizer no: A syntactic change in Japanese. In Soonja Choi (ed.), Japanese/Korean Linguistics 3, 305–321. Stanford CA: CSLI. Horie, Kaoru. 1998a. On the polyfunctionality of the Japanese particle no: From the perspectives of ontology and grammaticalization. In Toshio Ohori (ed.), Studies in Japanese grammaticalization: Cognitive and discourse perspectives, 169–192. Tokyo: Kurosio. Horie, Kaoru. 1998b. Functional duality of case-marking particles in Japanese and its implications for grammaticalization: A contrastive study with Korean. In David J. Silva (ed.), Japanese/Korean Linguistics 8, 147–159. Stanford CA: CSLI. Huang, Kwan Joan & Foong Ha Yap. 2006. On the grammaticalization of zhe in Classical Chinese. Paper presented at the 2006 Annual Research Forum of the Linguistic Society of Hong Kong. Hong Kong Polytechnic University, December 2006. Iwasaki, Shoichi. 1993. Functional transfer in the history of Japanese language. Japanese/Korean Linguistics 2, 20–32. Stanford CA: CSLI. Iwasaki, Shoichi. 2000. Suppressed assertion and the functions of the final-attributive in prose and poetry of Heian Japanese. In Susan C. Herring, Pieter Van Reneen & Lene Schøsler (eds.), Textual parameters in older languages, 237–272. Amsterdam: John Benjamins. Jiang, Lansheng. 1999. Chusuoci de lingge yongfa yu jiegou zhuci di3 de youlai. (The genitive use of locative words and the origin of the structural particle di3). Zhongguo Yuwen 2: 83–93.

Nominalizers in East Asian and Tibeto-Burman languages 339

Jiang Lansheng. 2004. Kua ceng fei duan yu jiegou ‘de hua’ de ci hui hua (About the lexicalization of the cross-structural combination de hua (的话)). Zhongguo Zhongguo Yuwen 5: 387–400. Kang, Eun-Kook. 1993. Cosene cepmisauy thongsicek yenkwu. (A diachronic study of Korean suffixes). Seoul: Sekwanghakswulcalyo Publishing. Kim, Seung-Gon. 1978. Hankwuke cosauy thongsicek yenkwu (A historical study of Korean particles). Seoul: Tayceykak Publishing. LaPolla, Randy. 2003. A grammar of Qiang, with annotated texts and glossary. With Chenglong Huang. Berlin: Mouton de Gruyter. LaPolla, Randy. 2006. Nominalization in Rawang. Invited keynote address presented at the 39th International Conference on Sino-Tibetan Languages and Linguistics. University of Washington, Seattle, September 2006. Lahaussois, Aimée. 2003. Nominalizations and its various uses in Thulung Rai. Linguistics of the Tibeto-Burman Area 26(1): 33–57. Lee, Hyo-Sang. 1993. The temporal system of noun-modifying (attributive) clauses in Korean from a typological perspective. Studies in Language 17: 75–110. Lehmann, Christian. 1995. Thoughts on grammaticalization: A programmatic sketch. 2nd edn. Munich: Lincom. Li, Charles N. & Sandra A. Thompson. 1977. A mechanism for the development of copula morphemes. In Charles N. Li (ed.), Mechanisms of syntactic change, 419–445. Austin TX: Texas University Press. Li, Charles N. & Sandra A. Thompson. 1981. Mandarin Chinese: A functional reference grammar. Berkeley CA: University of California Press. Lichtenberk, Frank. Forthcoming. Nominalization in Toqabaqita and other Cristobal-Malaitan languages. In Foong Ha Yap & Janick Wrona (eds.), Nominalization: Diachronic and typological perspectives [Typological Studies in Language]. Amsterdam: John Benjamins. Liu, Hongyong & Yang Gu. 2007. Nominalizers in Nuosu Yi. Ms. Department of Linguistics and Modern Languages, Chinese University of Hong Kong. Liu, Hongyong, Da Wu & Yang Gu. 2007. Determiner or nominalizer: The morpheme suu in Yi. Paper presented at the International Workshop on Nominalizers and Copulas in East Asian and Neighboring Languages. Chinese University of Hong Kong, January 2007. Lu, Shuxiang. 1943. Lun di3, di4 zhi bian ji di3 zi de youlai (A discussion on the controversy of di3 and di4 and the origin of di3). Jingling Qilu Hanxi Zhongguo Wenhua Huikan 3. (Published by Beijing Commercial Press, 1984). Martin, Samuel E. 1997. Copula structures in Korean and Japanese. Paper presented at the University of Oslo, September 1997. Matisoff, James A. 1972. Lahu nominalization, relativization, and genitivization. In John Kimball (ed.), Syntax and semantics. Vol. I, 237–258. New York NY: Seminar Press. Matthews, Stephen & Huiling Xu. 2002. The pronominal copula in Chaozhou. Paper presented at the 11th Annual Conference of the International Association of Chinese Linguistics. Nagoya, Japan, 2002. Morey, Stephen. 2006. Nominalisation and its discontents: Clausal nominalisation in Nimhpuk Singpho. Paper presented at the Workshop on Nominalization and its Discontents. La Trobe University, Melbourne, August 2006. Nishi, Yumiko. 2004. The emergence of the complementizer no in Japanese revisited. In Timothy Vance & Kimberly Jones (eds.), Japanese/Korean Linguistics 14, 127–137. Stanford, CA: CSLI.

340 Foong Ha Yap and Stephen Matthews

Noonan, Michael. 1997. Versatile nominalizations. In Joan Bybee, John Haiman & Sandra A. Thompson (eds.), Essays on language function and language type. In honor of Talmy Givón, 373–394. Amsterdam: John Benjamins. Noonan, Michael. Forthcoming. Aspects of the historical development of nominalizers in Tamangic languages. In Foong Ha Yap & Janick Wrona (eds.), Nominalization: Diachronic and typological perspectives [Typological Studies in Language]. Amsterdam: John Benjamins. Norman, Jerry. 1988. Chinese. Cambridge: CUP. Onoe, Keisuke. 1982. Bun no kihon kōsei – shiteki tenkai. (Basic sentence constructions: A historical development). In Kenji Morioka (ed.), Kōza Nihongo-gaku (Lecture Series in Japanese Linguistics). Tokyo: Meiji Shoin: 1–19. Park, Joo Young. 1999. ‘Kut’-uy moonbeobhwa yeongu (A study of the grammaticalization of kut). MA thesis, Sang Myung University, Seoul. Pustet, Regina. 2003. Copulas: Universals in the categorization of the lexicon. Oxford: OUP. Rhee, Seongha. 2002. From silence to grammar: Grammaticalization of ellipsis in Korean. Paper presented at New Reflections on Grammaticalization 2. University of Amsterdam, April 2002. Rhee, Seongha. 2005. Variability of formal reduction in grammaticalization. Journal of Linguistic Sciences 32: 251–268. Rhee, Seongha. 2007. Nominalization and stance marking in Korean. Keynote presentation at the International Workshop on Nominalizers and Copulas in East Asian and Neighboring Languages. Chinese University of Hong Kong, January 2007. Ryu, Ryul. 1990. Cosenmallyeksa 1. (A history of Korean 1). Pyongyang: Sahoykwahak Publishing. Serafim, Leon A. & Rumiko Shinzato. 2005. On the Old Japanese kakari (focus) particle koso: Its origin and structure. Gengo Kenkyu 127: 1–49. Shi, Yuzhi & Charles N. Li. 2002. The establishment of the classifier system and the grammaticalization of the morphosyntactic particle de in Chinese. Language Sciences 24:1–15. Shin, Mikyong. 2005. Development of the adnominal system in some East Asian languages. MPhil. thesis, Department of Linguistics and Modern Languages, Chinese University of Hong Kong. Shinzato, Rumiko. 1998. Kakari Musubi revisited: Its functions and development. Japanese/ Korean linguistics 8, 203–216. Stanford CA: CSLI. Shinzato, Rumiko. 2005. Nominalization in Okinawan: A diachronic and comparative perspective. Ms. Georgia Institute of Technology. Simpson, Andrew & Zoe Wu. 2001. The grammaticalization of formal nouns and nominalizers in Chinese, Japanese and Korean. In Thomas E. McAuley (ed.), Language change in East Asia, 250–283. London: RoutledgeCurzon. Sio, Ut-Seong Joanna. 2007. The adnominal and sentence-final ge3 in Cantonese. Paper presented at the International Workshop on Nominalizers and Copulas in East Asian Languages. Chinese University of Hong Kong, January 2007. Stassen, Leon. 1997. Intransitive predication. Oxford: OUP. Tang, Sze-wing. 2007. Nominalization in Mandarin and Cantonese: Variations under unified approach. Paper presented at the International Workshop on Nominalizers and Copulas in East Asian Languages. Chinese University of Hong Kong, January 2007. Van Breugel, Seino. 2006. Nominalisation in Atong. Paper presented at the Workshop on Nominalization and its Discontents. La Trobe University, Melbourne, November 2006.

Nominalizers in East Asian and Tibeto-Burman languages 341

Wang, Li. 1980. Hanyu Shigao. (History of the Chinese Language). Beijing: Zhonghua Shuju. Watters, David E. Forthcoming. Nominalization in Himalayish languages. In An encyclopedia of Nepal’s languages. Whitman, John. 1997. Kakarimusubi from a comparative perspective. Japanese/Korean Linguistics 6, 161–178. Stanford CA: CSLI. Wrona, Janick. 2005. The adnominal form in Old Japanese: Grammaticalization and degrammaticalization of an inflected form? Paper presented at the Workshop on Grammaticalization of Nominalizers: East Asian Perspectives, 3rd International Conference New Reflections on Grammaticalization (NRG 3). University of Santiago de Compostela, July 2005. Wrona, Janick. Forthcoming. A case of non-derived stand-alone nominalization: Evidence from Japanese. In Foong Ha Yap & Janick Wrona (eds.), Nominalization: Diachronic and typological perspectives [Typological Studies in Language]. Amsterdam: John Benjamins. Xu, Huiling & Stephen Matthews. Forthcoming. The ubiquitous morpheme kai in Chaozhou: Its adnominal and copula functions and possible grammaticalization pathways. Ms. La Trobe University and University of Hong Kong. Yanagida, Seiji. 1985. Muromachi Jidai no Kokugo (Language of the Muromachi period). Tokyo: Tōkyōdō. Yap, Foong Ha. 2007. On native and contact-induced grammaticalization: A diachronic analysis of Malay empunya. Ms. Department of Linguistics and Modern Languages, Chinese University of Hong Kong. Yap, Foong Ha, Pik-ling Choi & Kam-Siu Cheung. 2008. Delexicalizing di: How a Chinese noun evolves into a nominalizer, with attitude. Ms. Department of Linguistics and Modern Languages, Chinese University of Hong Kong. Yap, Foong Ha, Stephen Matthews & Kaoru Horie. 2004. From pronominalizer to pragmatic marker: Implications for unidirectionality from a crosslinguistic perspective. In Olga Fischer, Muriel Norde & Harry Perridon (eds.), Up and down the cline: The nature of grammaticalization [Typological Studies in Language 59], 137–168. Amsterdam: John Benjamins. Yap, Foong Ha & Janick Wrona (eds.), Nominalization: Diachronic and typological perspectives [Typological Studies in Language]. Amsterdam: John Benjamins. Yue, Anne O. 1998. Zhi in pre-Qin Chinese. T’oung Pao LXXXIV: 239–292. Yue, Anne O. 2004. Two focus markers in the Zhongshan Bronze Inscriptions. In Kenichi Takashima & Shiaoyu Jiang (eds.), Meaning and form: Essays in Pre-Modern Chinese grammar, 241–265. Munich: Lincom.

Index of names A Aaron, Uche 59 Abe, Kooboo 170 Abraham, Werner 137, 141, 143 Agha, Asif 314–15, 329 Aikhenvald, Alexandra Y. 21, 68, 98 Alber, Birgit 113 Alexandre, R. P. 58 Al-Hassan, Shahir A. 41, 59 Allan, Keith 21 Allott, Anna 272, 277 Alsagoff, Lubna 198–9, 201 Andersen, Henning 107–8 Anderson, Gregory 223 Andvik, Eric 230 Ansaldo, Umberto 16, 32 Aoki, Haruo 316 Arnold, Werner 57 B Baker, Philip 246 Beavon, Keith H. 58 Becker, Neele 23 Behaghel, Otto 137, 152 Benacerraf, Paul 292 Benveniste, Émile 133, 137, 142, 160 Bernini, Giuliano 130, 150 Berretta, Monica 147 Bertinetto, Pier Marco 141–2 Beyer, Stephan 227, 229 Bickel, Balthasar 224, 226, 336 Binnick, Robert 222 Bisang, Walter 1–4, 8, 10, 17–19, 23–4, 29, 32, 45, 121–2, 131–2, 161, 193 Blevins, Juliette 75 Bonfante, Giuliano 140 Boretzky, Norbert 46, 57 Bourdin, Philippe 2, 10–11, 44–5, 51, 59 Bourquin, Walther 58 Boyeldieu, Pascal 58 Braune, Wilhelm 109, 118 Breu, Walter 148, 202 Brustad, Kristen E. 41

Bucholz, Oda 143 Bybee, Joan L. 1, 3, 16–17, 27–30, 33, 143, 181–2, 246, 248, 255 C Campbell, Lyle 1, 67, 70, 99, 129–31, 174 Carey, Kathleen 135 Carlson, Robert 42, 58 Casad, Eugene 246 Caubet, Dominique 57 Cennamo, Michela 154, 157 Chae, Wan 244 Chang, Suk-Jin 183 Chhangte, Lalnunthangi 59 Choe, Hyon-Bae 244 Choi, Kyu-Il 240, 257 Chomsky, Noam 73 Chumbow, Beban 189 Cinque, Guglielmo 151 Claudi, Ulrike 15, 26 Comrie, Bernard 131–2, 148–50, 189–90, 192, 297 Cook, Haruko Minegishi 316 Corbett, Greville G. 104 Correll, Christoph 57 Coupe, Alec R. 336 Creissels, Denis 106–7, 124 Cristofaro, Sonia 132 Croft, William 64, 193 Crystal, David 43, 213 Cuyckens, Hubert 1 D Dahl, Östen 131–2, 134, 144, 157, 193, 195, 204–6 Dasher, Richard B. 246 Davidse, Kristin 1 Davis, Karen 57 Day, Christopher 44, 57 De Jong, Rudolf 44, 57 DeLancey, Scott 7, 226–30, 232, 254–5, 311, 314, 320, 329 Denison, David 138 Denny, Peter J. 21 Denwood, Philip 227

344 Rethinking Grammaticalization: New Perspectives Detges, Ulrich 31, 41, 59 Dez, Jacques 57 Dieninghoff, J. 138 Diessel, Holger 246 Dieterlen, Hermann 45 Diewald, Gabriele 1 Disterheft, Dorothy 57 Dixon, Robert M. W. 98, 306 Doke, Clement Martyn 45 Donaldson, Bruce 159 Drame, Mallafé 58 Drinka, Bridget 139–41, 143 Dryer, Matthews S. 144 E Ebert, Karen 38, 45–7, 57, 59, 225, 227, 231, 233 Ebert, Robert 119–20, 141 Edelman, Džoj I. 302 Egner, Inge 59 Eisenberg, Peter 103–4, 116 Elbert, Samuel H. 45 Eliade, Mircea 221 Elsness, Johan 139 Enfield, Nicholas James 16, 18 Evans, Nicholas 325 Everett, Daniel L. 44, 57 F Fauconnier, Gilles 328 Fedson, Vijayarani Jotimuttu 59 Feghali, Michel 42 Fici, Francesca 157–8 Fiedler, Wilfried 143 Fiorentino, Giuliana 150 Fischer, Olga 1, 10, 40 Fleischman, Suzanne 246 Folárìn Schleicher, Antonia 58 Frajzyngier, Zygmunt 2–3, 50, 57, 63–4, 66, 68, 71–4, 78, 80–1, 84–5, 95 Francis, Elaine J. 10, 103–4, 124 Frellesvig, Bjarke 309, 322, 330–2 Friedman, Victor A. 140–1 Fuss, Eric 285 G Gadet, Françoise 150 Gaeta, Livio 2, 9–10, 107, 111, 115, 118–19 Gamkrelidze, Thomas Valerianovic 303–5 Gardiner, Alan 246–7 Garvin, Paul L. 139 Genetti, Carol 255 Geniušienė, Emma 154–5, 158

Georg, Stefan 229 Giacalone Ramat, Anna 1–2, 8, 64, 129–30, 132, 151 Gil, David 291 Givón, Talmy 64, 246, 255 Glover, Warren W. 228, 230 Goldenberg, Gideon 79, 84 Greenberg, Joseph H. 20, 68, 80, 144, 293, 296–9, 301–2 Grice, Herbert Paul 27 Grinevald, Colette 21 Grønvik, Ottar 117 Grosjean, François 212 Grunow-Harsta, Karen 228 Gu, Yang 336 H Haase, Martin 137, 139, 142 Hahn, Reinhard F. 59 Haiman, John 9, 64, 122, 189, 191–2, 195, 213 Hale, Kenneth 305–6 Hanke, Thomas 289–90, 297, 302, 306 Härd, John E. 115 Hari, Annemarie 230 Harris, Alice C. 118, 138, 142, 174 Hashimoto, Anne Yue 331 Haspelmath, Martin 8, 66–7, 106–7, 111–12, 129, 133, 137, 144, 148, 154–6, 161, 213, 246, 335 Hattori, Shiro 333 Haugen, Einar 139 He, Jin-song 7, 19, 148, 191, 209, 334 Heath, Jeffrey 191 Heine, Bernd 1, 3, 9, 11, 16–17, 26–30, 40, 46–7, 49–50, 57–8, 64–6, 72, 131–2, 139, 141, 144–6, 153, 161–2, 177, 193–4, 197, 203, 209–10, 242–3, 246, 289, 298, 305–6, 325, 331 Heller, Daphna 76 Hennesy, Pelin 255 Herring, Susan C. 270, 275, 336 Hildebrandt, Kristine 230 Himmelmann, Nikolaus P. 1, 107–8, 112, 146 Hinderling, Robert 211 Hinrichs, Uwe 143 Hitchcock, John 221 Hokama, Shuzen 333 Holvoet, Axel 159 Hong, Jongseon 242, 244, 248 Hong, Ki-Moon 242 Hopper, Paul J. 1, 3, 7, 17, 29–30, 64, 66, 104–5, 107, 110, 130, 135, 209, 246, 250 Hopple, Paulette M. 272, 274 Horie, Kaoru 4–5, 169, 174, 177–80, 246, 270, 310, 315–17, 320, 332, 336

Horrocks, Geoffrey 140 Huang, Kwan Joan 334 Hünnemeyer, Friederike 15, 26 Hurford, James R. 297 I Ifrah, Georges 298 Im, Hong-Pin 244 Ivanov, Vjaceslav Vsevo’lodovic 303–5 Iwasaki, Shoichi 172–3, 316, 332 J Jakobson, Roman 71, 221 Janda, Richard D. 67, 99 Jang, Seok-Jin 244, 257 Jiang, Lansheng 323, 326–7 Jirsa, Bill 72, 74 Johnston, Erik 61, 85, 87 Jones, Ross McCallum 58 Joseph, Brian D. 67, 99, 192–3 Judaxin, Konstantin Kuzmič 59 Jung, Howan 239, 245 K Kang, Eun-Kook 241–2, 249, 253, 322, 331 Kang, Kil-Won 240 Kassevitch, Vadim 273 Kazenin, Konstantin I. 153, 158 Keenan, Edward 149, 151 Keesing, Roger M. 57 Kemmer, Suzanne 153–5, 157–9 Kern, Barbara 44, 57 Kihm, Alain 49 Kim, Eon-Ju 245 Kim, Heung-Soo 255 Kim, Ki-Bok 169, 244 Kim, Seung-Gon 241, 322, 331 Kim, Tae-Yeop 258 King, Gareth 299 Klein, Wolfgang 115–16 Kluge, F. 152 Ko, Yong-Kun 257 Koch, Peter 133 Koelle, Sigismund Wilhelm 247 König, Ekkehard 111 Koptjevskaja-Tamm, Maria 131, 157–9 Kuno, Susumu 171 Kuryłowicz, Jerzy 31, 63, 66, 68, 80 Kuteva, Tania 2, 8–9, 11, 40, 50, 58, 64–5, 72, 131–2, 139, 141, 144–6, 153, 161–2, 177, 189–90, 192–4, 197, 203, 209–10, 246, 289, 305, 331 Kwon, Jae-Il 240, 244

Index of names 345 L Labov, William 161 Lafon, Michel 58 Lahaussois, Aimée 336 Langacker, Ronald W. 118, 291 Lanthaler, Franz 113 LaPolla, Randy 220, 227, 232, 275, 323, 336 Lass, Roger 306 Laury, Ritva 147 Lee, Deuk-Choon 244, 253 Lee, Hyo-Sang 319 Lee, Ki-Moon 242, 255 Lee, Soong-Nyung 255 Lehmann, Christian 1, 3, 15–17, 25–6, 30, 33, 40, 42, 64, 109, 130, 245, 247, 331 Leiss, Elisabeth 117–18 Lessau, Donald A. 246 Lewis, Geoffrey L. 246 Li, Charles N. 135, 247, 326–7, 330 Lichtenberk, Frank 336 Lick, Ho Chee 198–9, 201 Lim, Lisa 16, 32 Liner, E. 190 Liu, Hongyong 336 Longacre, Robert E. 41, 58–9 López-Couso, María José 2 Lötzsch, Ronald 145, 147 Lu, Shuxiang 326 Lühr, Rosemarie 191 Luján Martínez, Eugenio Ramón 305 Luraghi, Silvia 57 M Mabille, Adolphe 45 Maibaum, Anita 230 Majewicz, Alfred Franciszek 298 Marchese, Lynell 40, 49, 58 Marra, Antonieta 148 Martin, Samuel E. 173, 257, 322, 331 Masica, Colin P. 221 Maslova, Elena 222 Matisoff, James A. 191, 282, 311, 313, 320 Matras, Yaron 193 Matthews, Peter 65 Matthews, Stephen 4–5, 177, 254, 257, 270, 322, 324–5, 336 Maurer, Philippe 59 Maynard, Senko K. 170–1 Mazaudon, Martine 230 Meillet, Antoine 1, 12, 63, 137 Menninger, Karl 298 Michaelis, Laura A. 10, 103–4, 124

346 Rethinking Grammaticalization: New Perspectives Mitchell, Bruce 138 Mitchell, Terence F. 59 Mofokeng, S. Machabe 45 Möllendorff, Paul Georg von 240 Morey, Stephen 336 Morphy, Frances 57 Mortelmans, Tanja 118 Moser, Amalia 140 Mustanoja, Tauno F. 152 Mycielski, Jan 64 Myers-Scotton, Carol 131 N Nadkarni, Mangesh V. 196–7, 212 Nau, Nicole 193 N’Diaye-Correard, Geneviève 58 Nedjalkov, Vladimir P. 135 Newmeyer, Frederick J. 1, 17, 33, 67, 99, 129 Nichols, Johanna 224 Nicolle, Steve 58 Nishi, Yumiko 174, 254, 309, 316–17 Noonan, Michael 4–7, 169, 220, 224, 227, 229, 231–2, 254–5, 275, 311–13, 320, 329, 331, 334, 336 Norde, Muriel 1 Norman, Jerry 327 Noss, Philip A. 58 Nougayrol, Pierre 49, 58 O Okell, John 272–4, 277 Onoe, Keisuke 317 P Pagliuca, William 1 Park, Joo Young 245, 248, 318, 321 Paul, Hermann 119–21, 152 Perkins, Revere 1 Perridon, Harry 1 Pfaff, Meike 206 Pinkster, Harm 135–6 Plank, Frans 115 Poppe, Nicholas 59, 240 Prost, André 58 Pustet, Regina 331 Q Quesada, Juan D. 59 R Ramat, Paolo 109, 130, 135–6 Reh, Mechthild 58, 246

Rhee, Seongha 4–6, 177, 242, 249, 254, 256–8, 270, 318–22, 330–1, 333, 336 Rijkhoff, Jan 75 Rodegem, Firmin M. 58 Romaine, Suzanne 152 Rosenbach, Anette 1 Rosenhouse, Judith 42, 57 Rothstein, Robert A. 139 Rowlands, Evan C. 39, 59 Ryu, Ryul 241, 322, 331 S Sachnine, Michka 58 Samarin, William J. 57 Samuels, Michael L. 63, 75 Santandrea, Stefano 58 Sassa, Yuko 177, 180 Saunders, Ernest Dale 170 Saw Tun 273 Scatton, Ernest A. 144 Schachter, Paul 22 Schirmunski, Victor M. 113 Schmid, Tanja 10, 115 Schmidt, P. Wilhelm 306 Schütze, Carson 273 Schwegler, Arnim 143 Sebba, Mark 48 Seebold, E. 152 Seiler, Hansjako 293 Seoane, Elena 2 Serafim, Leon A. 318 Shay, Erin 61, 68, 72–3, 78, 80–1, 84 Shi, Yuzhi 326–7 Shibatani, Masayoshi 72 Shim, Jae-Kee 244 Shin, Mikyong 254, 318–320 Shinzato, Rumiko 309, 313, 316–18, 329, 333 Silva-Corvalán, Carmen 212 Simpson, Andrew 4, 7, 169, 270, 276, 281, 284–5, 320, 336 Sio, Ut-Seong Joanna 336 Sitoe, Bento 47, 58 Small, Priscilla C. 47–8 Smith, Holly 230 Soares, Carlos 212 Sohn, Choon-Sup 245 Sohn, Sung-Ock 184 Song, Cheol-Eui 244, 251 Song, Seok-Choong 253 Squartini, Mario 141–2 Stampe, David 306 Stassen, Leon 324–5

Index of names 347

Stein, Dieter 1 Stevens, Christopher M. 211 Stilo, Donald 210–11, 221 Stolz, Christel 130, 132, 193 Strauss, Susan 184 Suh, Jung-Soo 257 Szemerényi, Oswald 148, 295, 303–4 T Taira, Kaori 177 Tang, Sze-wing 327, 336 Taylor, Doreen 230 Taylor, John R. 42, 58 Thierhoff, Rolf 141 Thomason, Sarah Grey 131–2, 207 Thompson, Katrina Daly 58 Thompson, Sandra A. 247, 326, 330 Thurgood, Graham 275 Timofeeva, V. T. 59 Trask, R. Larry 140 Traugott, Elizabeth Closs 1, 3, 7, 17, 29–30, 40, 64, 66, 104–5, 107, 110, 130, 135, 138, 209, 246, 250, 255 Trovesi, Andrea 144–6 Trudgill, Peter 195 V Van Breugel, Seino 336 Van der Auwera, Johan 111 Van Driem, George 232 Van Gelderen, Elly 276 Vandelanotte, Lieven 1 Varenkamp, Bryan 230 Velázquez-Castillo, Maura 59 Verstraete, Jean-Christophe 1 Visser, Fredericus Th. 138 Von Humboldt, Wilhelm 63

Von Mengden, Ferdinand 2, 10–11 W Wako, Masakazu 183 Wälchli, Bernhard 131, 157–9 Waltereit, Richard 31 Wang, Li 326 Wang, Moon-Yong 245 Watters, David E. 221, 226, 313 Weinreich, Uriel 194 Weiss, Daniel 147 Westermann, Dietrich 58 Whitman, John 319 Wiemer, Björn 1, 161 Wierzbicka, Anna 71 Wiese, Heike 290, 292, 297 Wilkins, David 32 Wilson, William A. A. 57 Wischer, Ilse 1, 138 Wolff, Ekkehard 57 Wright, Roger 161 Wrona, Janick 317, 329, 332, 334, 336 Wu, Zoe 169, 270, 276, 281, 320 Wurzel, Wolfgang U. 110 X Xu, Huiling 322–5, 336 Y Yanagida, Seiji 317 Yap, Foong Ha 4–5, 174, 177, 246, 252, 254–5, 257, 270, 316–17, 326, 329, 334 Yoon, Kyung-Mi 245 Yue, Anne O. 334–5 Z Ziervogel, D. 58–9

Index of languages A African languages 40, 43 Afrikaans 111 Afro-Asiatic languages 38, 41, 47, 57, 59 Albanian 140, 143–4, 147, 149, 153, 156, 159 Arabic 41–2, 44, 52, 57, 59 Armenian 210–11 Arrarnta 299 B Bagiro 11, 43, 58 Balkan languages 140, 147, 149 Baltic languages 134, 158 Bantu 42, 45, 52, 58–9 Basque 10, 106–7, 124, 133, 139, 142, 144 Bodic languages 6, 219–34 Bodish 219–20, 226 Boko 44, 58 Burmese Colloquial Burmese 7, 265–86 Literary Burmese 7, 265–86 Burushaski 221, 223 C Catalan 11, 41, 52, 59 Celtic 8, 130, 133–4, 139, 144, 148, 159 Chadic languages 2–3, 61–99 Chantyal 4, 6–7, 220, 223–5, 227–33, 310–13, 321, 329, 331–2, 334, 336 Chinese 4, 10, 20, 24–5, 32, 121, 123, 198–201, 321–4, 326–7, 329–31, 333–6 Cantonese 4, 310, 323–4, 326, 329 Chaozhou 4, 310, 322–5, 329–31 Classical Chinese 321, 334 Hokkien Chinese 198–9 Mandarin Chinese 4, 80, 198–200, 324, 326–7, 329, 332 Croatian 201–3 Czech 8, 133, 139, 142, 144–6 D Danish 204–5 Dutch 10, 115, 120

E East Asian languages 4, 169, 177, 310, 327–9, 332 English 11, 63, 65–6, 73–5, 77, 81, 130, 132–3, 135, 138–9, 141–2, 147–8, 152–3, 292–5, 300 Old English 135, 138–9, 143, 303 Ewe 209 F Finnish 134, 144, 147, 154–5 French 8, 130, 133, 139, 141, 148–9, 152, 154–6 G German 8, 10, 103, 110–21, 133, 137–9, 141–3, 145, 147–8, 152–3, 155–6, 159, 161 Lotharingian 113 Middle High German 110, 115, 119 New High German 110–11 Old German 137 Old High German 110–13, 118–19, 138 Ripuarian 112–13 Upper Vinschgau 113 Germanic languages 8, 10, 109, 111–12, 115, 117, 120, 130, 133–4, 136–40, 142–3, 148–9, 152, 154, 156, 158–9, 161 Gidar 61, 71, 79 Giziga 61, 76 Godié 49, 58 Gothic 109, 136, 143 Greek 8, 109, 111 Classical Greek 111 Tsakonian Greek 111 Guinea Bissau Crioulo 57 Gula 49–50, 58 Gurung 4, 228, 230, 311

H Hdi 3, 50, 57, 61, 77, 85–8, 90, 92, 94, 97 Himalayish 7, 219–20, 223, 227, 231–3 Hona 3, 61, 85–7, 90, 92, 96–7 I Indo-European 133, 140, 142, 145, 148–9, 152–3,

Index of languages 349

156–7, 160, 296, 300, 303–5 Italian 8, 117, 130, 133–5, 141–3, 145–51, 155–6, 201–3 Old Italian 146, 151, 304 Izi 11, 41, 58 J Japanese 4–5, 169–74, 177–85, 309–10, 315, 317–18, 320–1, 326, 329–34, 336 Classical Japanese 171–3 Modern Japanese 170–7, 315–17, 321, 326, 332 Old Japanese 331–4 K Kafa 79 Kannada 196–7, 212 Kera 38, 45, 57 Khmer 3, 10, 16, 18–20, 24, 121–2, 192, 195–6 Konkani 196–7, 212 Korean 4–6, 177–85, 239–60, 318–22, 329–34, 336 Early Modern Korean 240, 244, 255 Middle Korean 6, 240, 245, 251, 253–5, 320 Modern Korean 6, 240–2, 244, 248–60, 320, 331 Old Korean 5, 244, 247, 319, 322, 329–31, 333 Kxoe 46, 49, 57 L Lahu 4, 311, 313 Latin 8, 32, 105–7, 135–40, 142–3, 147–52, 154, 156, 160–1, 209 Lithuanian 300 M Mainland Southeast Asian languages 3, 16–17, 19, 21, 24, 26, 28, 30–4 Margi 47–8 Mari 11, 43, 59 Mina 76, 81–3 Mixtec 48 Mongolian 222 Mutsuvan 85–7, 90, 95–7 N Nar-Phu 227, 230–1 Neo-Aramaic 11, 41–2, 57 Ngemba 189–90 Norwegian 204–5 O Okinawan 4, 318, 332–3

P Polish 133–4, 139, 142, 144, 146 R rGyalrong 219–20, 226 Romance 8, 130, 133–4, 139–43, 148–52, 154–6, 158–61 Russian 106–7, 144, 157–8 S Saramaccan 46, 48, 57 Scandinavian languages 133, 139, 141, 153, 156, 158, 204–6, 211 Semitic 2, 76, 84, 132, 144, 210 Siberian Yupic Eskimo 207–8 Singlish 198–201 Sino-Tibetan 6, 219, 221, 232 Slavic languages 8, 134, 139, 144–8, 154, 156, 158–60, 202–3 Slovenian 8, 144–6 Sorbian 8, 104, 144–7 Spanish 141, 155 Standard Average European 8, 133–4, 137, 139, 144, 148–50, 153, 159–61 Surselvan 158 Swahili 42, 58, 190 Swedish 204–5 T Thai 3, 10, 20–6, 121–3 Tibetan Classical Tibetan 227, 229, 311 Lhasa Tibetan 4, 227, 231, 311, 314, 329 Tibeto-Burman languages 4, 219, 221, 223, 275, 309–36 Tsonga 48, 58 V Vietnamese 10, 121–3 W Wandala 3, 78, 83–7, 89–93, 97 Wari 37, 44, 57 Warlpiri 305–6 Welsh 130, 299 Y Yakoma 48 Yukaghir 222–3

Index of subjects A abduction 29 abstraction 247, 252 accretion 9, 189–214 morphemic 204 phonological 191 acquisition first language 292 second language 130–1, 195 actualization 107 adjective 198–200, 202–3 adnominal 4–5, 234, 251–2, 317, 319, 321, 329–34 adnominalizer 6, 240, 243, 253–5, 258–9 adnominalizer vs. nominalizer 254–9 prospective adnominalizer 240, 251, 254, 258, 330 simultaneous adnominalizer 240, 251, 254 affix 11–12, 181–2 agglutinating language 181 agreement 63, 104, 118, 131, 136, 139, 257 Aktionsart 116 amalgamation 201 ambiguity 40, 325 analogy 26, 29, 107 analytic construction 202, 204 animacy 153 inanimate 154 anterior 46, 141, 181–2, 240, 251, 253–5, 319, 329–30 anticausative function 154 areal contact 144 context 130, 220–1 convergence 132, 143 diffusion 6, 8, 130–1, 140–3, 149, 152–3, 156, 159, 231 feature 131, 133, 142, 149, 160 interpretation 143 relationships 132 type 4, 131–2 typology 131 areality 2–6, 8–9, 17, 31, 130–3, 140–4, 147, 159–61, 224, 240 article 1, 4–5, 10, 12, 149, 152, 160–1, 206

definite 8, 130, 144–8, 204–5, 211 indefinite 139, 144, 147 suffixed 204–6 aspect 3–5, 7, 10, 122, 312–13, 315, 319, 329–30 perfect 334 ‘have’-perfect 8, 68, 115, 133–43, 159–62, 181–4 perfective 68, 85, 88, 90, 106, 114, 117, 135, 139, 141, 181–4 perfects as pasts 141–3 progressive 10, 44, 51, 68, 106 resultative construction 135, 137–40, 142–3 aspectual construction 5, 174, 176, 179–85 completive 181–2, 184 marker 180 periphrastic 5, 181–5 attributive 117, 171–3, 179, 220, 311, 319–21 function 116 nominal forms 171 non-relative attributive 233 participle 11 attrition 191, 206, 302, 305 autonomy 3, 25–6, 30 syntactic autonomy 11, 245, 303–4 auxiliary see verb auxiliation 42, 140, 142 B bilingualism 130, 212 bleaching (semantic) 30, 65, 67, 143, 156, 175, 182, 268, 272 body part (noun) 11, 12, 290, 298–9, 301 bondedness 25–6 morphosyntactic 252–3 borrowing 10, 105, 123–4, 219, 231, 233 bridge construction 320 bulking see accretion C calquing 140, 161 cardinality 292, 297–8, 303, 306 case marker 241–3, 246, 255 cause/reason 317, 320 centre of innovation 8, 142, 149, 153

classifier 3, 16, 18, 20–6, 267, 269–71, 322–3, 325–6 clause fusion 118 clause-linkage marker 104 cleft 5, 174–5, 178, 312–16, 318, 320–2, 324, 328–9, 331–3 code-switching 131 cohesion 11, 25, 151 ‘come’ 37, 38–59 ‘come to have’ 16, 18–21, 23–4, 26, 29–32 comitative 203–4, 207 comment 84 comparative construction 203 complementizer 5, 174, 177–8, 180, 189, 253–4, 256–7, 267, 270, 314–16, 318, 320–1 compounding 272 concessive 317, 320, 335 conclusive form 173 conjunction 5, 174–5, 177, 179–80 connective 39, 41, 43, 45–6, 49, 51–2, 58–9 construct state 76 content question 71 context expansion 107, 112–13 control 153–4 conventionalization 16, 26, 30, 64, 174, 292, 298, 301, 303, 306 copula see verb correlative construction 196 coverb 3, 24, 26 creole 130 cultural influence 231 cyclicity 143, 146 D dative-locative 230–1 decategorialization 194, 208 defectiveness defective noun 245 lexical defectiveness 109 definiteness marking 9, 204 deixis deictic force 146 demonstrative 8, 144–9, 160, 199–200, 204, 241–2, 246–7, 267–8, 270–1, 275–8, 280, 286, 305 directional deixis 11, 38, 47 spatial deixis 52 temporal deixis 52 demonstrative see deixis desemanticization 156, 194, 208 determination 204, 211 determiner 189, 270 diachronic overlap stage 207–9

Index of subjects 351 diffusion 6, 8, 130–1, 141–3, 147, 149, 152–3, 156, 159, 161, 231 directionality 50–1, 162, 306 disambiguating context 145 discreteness of syllable boundaries 17, 32 divergence 104 downgrading analysis 33 E economicity 160 elaborateness of marking 195 elaboration 7, 231 emergent grammar 64, 70 erosion 16–17, 110, 194, 207–8, 210, 212 evidentiality 315, 317, 328–30 exaptation 306 extension 7–8, 153–4, 194, 199, 208, 231–2 functional extension 5–6, 174, 177–8, 240 external factors 160, 162 F factuality 19, 31 feature strength 160 focus 2, 4, 21, 78–9, 97, 312–16, 320–2, 328–9, 331–3, 335 contrastive focus 22, 63 form renewal 108–9, 111 form-function mapping 103 form-meaning co-evolution 15, 25–6, 28, 30, 32 fossilization 11, 43–4, 190, 303 frequency 33, 132, 139, 258 future see tense G gender 68–9, 71, 80, 214 generalization 5, 7, 10, 65, 213 semantic generalization 174, 177–8, 248–9 genitive 6, 174, 198, 267, 274–5, 277–8, 282, 284–6, 311, 315–16, 318–19, 322–3, 326–7, 331 genitive construction 199, 200, 221–3, 228–32 ‘go’ 37, 38–59 grammaticalization passim canonical grammaticalization 7, 129, 162 contact-induced grammaticalization 8–9, 133, 144, 147, 152, 156, 159–62, 193–7, 201–4, 206–8, 210, 212–14 deconstructing grammaticalization 129 degree of grammaticalization 16, 25–6, 180–1, 184–5 expanded notion of grammaticalization 67 generalization of grammaticalization 106–8 grammaticalization area 8, 132–3, 157, 159–60

352 Rethinking Grammaticalization: New Perspectives grammaticalization (cont.) grammaticalization chain 47, 129 grammaticalization cline 10–11, 16, 20, 25, 28–9, 40–2, 155, 290, 292, 298–9, 304, 306 grammaticalization extension 107, 109, 113, 118 grammaticalization of grammatical relations 3, 63, 67, 75, 84–7, 93, 98 grammaticalization of lexical categories 2, 63, 69, 80, 83 grammaticalization of linear order 75 grammaticalization of phonological means 68 grammaticalization of verbal elements 181 grammaticalization pathway 4–5, 38, 40–2, 49–51, 174, 177, 180–2, 184, 209, 248, 265, 292, 306, 313–14, 318, 321–2, 325, 328 grammaticalization pattern 5–6, 9, 130, 170, 177, 181–2, 184–5, 206, 211, 214 grammaticalization process 3, 7–11, 30–1, 105–6, 112, 117, 121, 130–3, 137, 139–40, 142, 144–5, 147–50, 152–3, 156, 158–62, 169, 194, 197, 202–3, 205–6, 208–9, 211, 214, 290, 292, 294–6, 299, 304, 328 grammaticalization wave 105, 111 incipient grammaticalization 142, 145 initial stage of grammaticalization 130, 145, 161, 184 innovating grammaticalization 109, 113 language internal grammaticalization 9, 162, 194, 207–8, 210, 214 limits of grammaticalization 130 motivations for grammaticalization 75 morphophonological motivation 113, 2, 66, 70–4 renovating grammaticalization 109 scope of grammaticalization 3, 61–2, 70–1, 99 traditional approach to grammaticalization 62, 64, 66–7 two-step grammaticalization change 143 unchanged residue of grammaticalization 10, 105, 107–9, 113, 124 H hermit crab theory 9, 191 high-contact variety 194–5, 197, 202–3, 206 I iconic repetition 192 impersonal construction 8, 153 implicature conventional 26–7, 29–30 conversational 26–7, 29–30 incipient categories 146

indefinite marker 305 indicative see mood individuation 21 inference 3, 17–20, 28, 30 abductive inference 118 pragmatic inference 15–16, 26–7, 31, 34, 258 infinitive 10, 114–20 substitute infinitive 114–15, 120 infix 190 inflection 67, 105, 107, 109, 112 initial state 2, 63, 72, 74–5, 84–6 innovating mutation 160 integrity 25–6 interrogative structure 197 introflection 105 invariable marker 132, 149–50, 152, 160 irrealis see mood isolating languages 181 iteration 11, 45–9 K kakari musubi constructions 318 L language contact 7–10, 105, 121–4, 130–1, 140, 152, 156–9, 161, 193–6, 201, 204–5, 211–13 lateral shift 192–3 layering 104, 109, 209, 250, 299 lexical inertia 10, 108–9, 113 lexical lacuna 113 lexicalization 64, 66, 130, 251–2 linguistic area 131–3 linker particle 198–200 locative predicator 81–3 M main verb construction 312–13 metaphor 26–30, 249 metonymy 26–7, 29–30 middle forms 153–5 middle voice 153 middle-passive voice 153 mismatch 9, 103–24 complexity mismatch 10, 103, 105, 109 content mismatch 10, 103–5, 114, 119, 124 modal + infinitive periphrasis 119 modality 74, 97, 181, 328 ability 7, 18–19, 31 epistemic 249, 257 obligation 19, 31, 176 permission 18–19, 31 mood 327, 329–30

Index of subjects 353

indicative 111 irrealis 277–9, 284–6 realis 277–9, 284–6 subjunctive 111

overlap of function 6, 209, 240 of meaning 209 stage 207–10, 214

N negation 253 discontinuous negation 130 nominal agent nominal 224–6 attributive nominal 220, 233 eventive nominal 243 patient nominal 6, 220, 224–6, 229, 234, 311, 320, 328 propositional nominal 243 nominalization 4–7, 169–85, 219–34, 239–60, 265–86, 309–36 direct nominalization 173 end-marking function 256–7 gerundival nominalization 251 nominalization-attribution syncretism 221–2 nominalization-relativization syncretism 6–7, 219–34 nominalized clause 171, 222, 224, 258, 275, 283, 311, 313, 324, 329 non-overt nominalization 173 nominalizer derivational nominalizer 240 development of nominalizer 316, 322 residual nominalizer 5, 310, 333–5 sentential nominalizer 5, 7, 170–1, 174, 185, 246–7, 265, 273–86 source of nominalizer 241–2, 276 nominative case marker 241–3, 246–7, 275, 328 non-canonical patterns 105–7 non-systemic expression 292, 294, 297–8 number 20 number expression see numeral numeral 21, 267–70, 289–306 arbitrary 300 cardinal 2, 11, 290–9, 304–6 complex 11–12, 290, 292–4, 303–4, 306 simple 11, 292–3, 295, 303, 305 numeralization 290

P paradigm 3, 7–8, 15, 17, 25–6, 32–3 defective paradigm 109, 111 suppletive paradigm 111 paradigm layering 109 paradigmaticization 25–6, 42, 109 participial complement 137 past see tense past time 42, 139 past participle (passive) 10, 111–21, 133, 135–40, 142 perfect see aspect perfective see aspect phenomenological dimension 213 phonological means 67–9, 75–7, 97, 99 phonotactic restraint 17, 32 pidgin 130 polysemy 20, 194, 321 possessive 104, 210–11, 318, 321 possessive construction 106, 135, 137 possessive meaning 138 possessive pronominal 315, 319, 322–3 postfix 8, 158–60 pragmatic marker 177 pragmatics 3, 30–1, 33 preterite see tense preterite-present see verb principle of functional transparency 2, 63, 72–5, 83–4, 89, 92, 95–6 principle of indirect means 2, 63, 72–5, 99 productivity 240, 252, 259 progressive see aspect pronoun 39, 49, 104, 149, 242, 247, 305, 322, 331 anaphoric 150 indefinite 306 interrogative 148, 152–3 personal 150, 241, 246 reflexive 8, 153–9 relative see relative pronoun purposiveness 48–9

O object 75, 79, 83–98, 118, 149–52, 273–4, 277, 279–81 accusative 138 direct 135–8 obligatory categories 3, 15–16, 18, 20, 22, 25–6, 33 one-form languages 158 onomatopoeic word 243, 250

Q quantification 252, 290–2 quantifier 11, 290–2, 295, 299 R realis see mood reanalysis 29, 34, 107, 113, 135–6, 138, 142–3, 265–6, 272

354 Rethinking Grammaticalization: New Perspectives reanalysis (cont.) semantic reanalysis 51 recursive principle 292–5, 300 reduction (phonetic) 3, 16, 28, 33–4, 65, 67, 76, 98, 160, 184, 248–9, 323 redundancy of marking 9, 193–5 regrammaticalization 72, 265, 284, 286 regularization 120 relative appositive relative 151–2 relative clause 9, 83–4, 132, 175, 178, 180, 189–90, 196–201, 206, 212, 220, 222–4, 255, 274–5, 311–12, 314–16, 318–24, 326, 328, 333 relative marker 149–50, 153, 180, 196, 198, 322–3 non-restrictive relative 149, 151 relative pronoun 8, 148–53, 160, 190, 197–8, 201, 233 relativization strategy 148, 189–90, 198, 201 restrictive relative 149, 151 relativization with the genitive 228–31 rentaikei system 318, 332–3 repetition of lexical items and phrases 63, 68–9, 78–9, 83, 192 residual quirks theory 9, 190–1, 206, 208 right-heading 10, 123 routinization 52 S saturation 332 semantic change 20, 33–4, 40, 157, 246–50 semantic map 154–5 sentence-final particle 329, 337, 340 sentence-final position 324, 332 sequentiality 37–52 serialization 11, 42–3, 114–15, 120, 300–2 singulative 21 Spec-head reduction 276 specialization 6, 104, 240, 250–9 affirmative vs. negative 253 stance 5, 312–35 structural scope 25–6 subitizing 297–8 subject 75, 78, 83–98, 267, 273–7, 279–80 absolutive subject 106 non-nominative subject 106 subjectification 118–19 (inter-)subjectivity 316 subjunctive see mood subordinator 317, 320–8, 334 suffix comitative 44, 207 comparative 202

instrumental case 203–4 nominalizing 6–7, 226–7, 231–3, 314, 334 superordinate motivation 109 superordinate restriction 105, 124 switch-reference function 63 synchronic buffer zone 9, 207, 210–12 syntactic rigidification 44 synthetic construction 201–4 system-external motivation see language contact T temporal interval 116–19 tense 19, 41, 63, 74, 80, 97, 181, 228, 329–30 future 2–4, 10–12, 42, 44, 47, 49–52, 109, 111–12, 120 past 11, 44, 52, 142–3, 182, 230, 253 preterite 139–43, 191 tense-aspect distinctions 231, 255 tense-aspect-modality 20, 32 tense-aspect-modality marker 3, 23–4, 26, 122, 313–14 tone 67, 75, 77 topic marker 317, 320, 322, 324–31, 334 transfer 132, 161, 207 transnumerality 20 two-form languages 158 U unidirectionality 1, 8, 66–7, 110–11, 129, 148, 161–2, 306 V variability 25–6, 30, 245, 260 verb auxiliary verb 11, 107, 117, 119, 135–6, 138–40, 142, 160 copula 5, 242, 246–7, 257, 271, 281, 316, 321–2, 324–6, 329–33 copula construction 117, 331 directional verb 3, 24, 26 intransitive verb 86, 89, 93, 138–40, 155, 183 light verb 271 of change of state 139 of cognition 135, 279, 282 of motion 10, 38, 40–1, 44–6, 48, 81, 139, 155 of motion in space 37–8, 40–1, 44–8 of motion in time 37–8 of possession 133–4, 138, 142, 160 of saying 135 preterite-present verb 138 pro-copula 325 strong/weak verb 109–11

telic verb 112–13 transitive verb 106, 133, 135, 137–9 verb-initial languages 78, 85–6 verbalizer 271–2

Index of subjects 355 W weight 25–6 word order 3, 15–16, 18, 24–6, 30, 67, 84–6, 99, 138–9, 276

Typological Studies in Language A complete list of titles in this series can be found on the publishers’ website, www.benjamins.com 80 Laury, Ritva (ed.): Crosslinguistic Studies of Clause Combining. The multifunctionality of conjunctions. xiv, 246 pp. + index. Expected August 2008 79 Abraham, Werner and Elisabeth Leiss (eds.): Modality–Aspect Interfaces. Implications and typological solutions. 2008. xxiv, 422 pp. 78 Harrison, K. David, David S. Rood and Arienne Dwyer (eds.): Lessons from Documented Endangered Languages. v, 366 pp. + index. Expected August 2008 77 Seoane, Elena and María José López-Couso (eds.): Theoretical and Empirical Issues in Grammaticalization. In collaboration with Teresa Fanego. 2008. x, 354 pp. + index. 76 López-Couso, María José and Elena Seoane (eds.): Rethinking Grammaticalization. New perspectives. In collaboration with Teresa Fanego. 2008. x, 355 pp. 75 Frajzyngier, Zygmunt and Erin Shay (eds.): Interaction of Morphology and Syntax. Case studies in Afroasiatic. 2008. v, 234 pp. 74 Kurzon, Dennis and Silvia Adler (eds.): Adpositions. Pragmatic, semantic and syntactic perspectives. 2008. viii, 307 pp. 73 Ansaldo, Umberto, Stephen Matthews and Lisa Lim (eds.): Deconstructing Creole. 2007. xii, 292 pp. 72 Næss, Åshild: Prototypical Transitivity. 2007. x, 240 pp. 71 Nedjalkov, Vladimir P. (ed.): Reciprocal Constructions. With the assistance of Emma Š. Geniušienė and Zlatka Guentchéva. 2007. xxiii, 2219 pp. (5 vols.). 70 Zúñiga, Fernando: Deixis and Alignment. Inverse systems in indigenous languages of the Americas. 2006. xii, 309 pp. 69 Aranovich, Raúl (ed.): Split Auxiliary Systems. A cross-linguistic perspective. 2007. vii, 277 pp. 68 Abraham, Werner and Larisa Leisiö (eds.): Passivization and Typology. Form and function. 2006. x, 553 pp. 67 Veselinova, Ljuba N.: Suppletion in Verb Paradigms. Bits and pieces of the puzzle. 2006. xviii, 236 pp. 66 Hickmann, Maya and Stéphane Robert (eds.): Space in Languages. Linguistic Systems and Cognitive Categories. 2006. x, 362 pp. 65 Tsunoda, Tasaku and Taro Kageyama (eds.): Voice and Grammatical Relations. In Honor of Masayoshi Shibatani. 2006. xviii, 342 pp. 64 Voeltz, F. K. Erhard (ed.): Studies in African Linguistic Typology. 2006. xiv, 426 pp. 63 Filimonova, Elena (ed.): Clusivity. Typology and case studies of the inclusive–exclusive distinction. 2005. xii, 436 pp. 62 Couper-Kuhlen, Elizabeth and Cecilia E. Ford (eds.): Sound Patterns in Interaction. Crosslinguistic studies from conversation. 2004. viii, 406 pp. 61 Bhaskararao, Peri and Karumuri Venkata Subbarao (eds.): Non-nominative Subjects. Volume 2. 2004. xii, 319 pp. 60 Bhaskararao, Peri and Karumuri Venkata Subbarao (eds.): Non-nominative Subjects. Volume 1. 2004. xii, 325 pp. 59 Fischer, Olga, Muriel Norde and Harry Perridon (eds.): Up and down the Cline – The Nature of Grammaticalization. 2004. viii, 406 pp. 58 Haspelmath, Martin (ed.): Coordinating Constructions. 2004. xcv, 578 pp. 57 Mattissen, Johanna: Dependent-Head Synthesis in Nivkh. A contribution to a typology of polysynthesis. 2003. x, 350 pp. 56 Shay, Erin and Uwe Seibert (eds.): Motion, Direction and Location in Languages. In honor of Zygmunt Frajzyngier. 2003. xvi, 305 pp. 55 Frajzyngier, Zygmunt and Erin Shay: Explaining Language Structure through Systems Interaction. 2003. xviii, 309 pp. 54 Aikhenvald, Alexandra Y. and R.M.W. Dixon (eds.): Studies in Evidentiality. 2003. xiv, 349 pp. 53 Givón, T. and Bertram F. Malle (eds.): The Evolution of Language out of Pre-language. 2002. x, 394 pp. 52 Güldemann, Tom and Manfred von Roncador (eds.): Reported Discourse. A meeting ground for different linguistic domains. 2002. xii, 425 pp.

51 Newman, John (ed.): The Linguistics of Sitting, Standing and Lying. 2002. xii, 409 pp. 50 Feigenbaum, Susanne and Dennis Kurzon (eds.): Prepositions in their Syntactic, Semantic and Pragmatic Context. 2002. vi, 304 pp. 49 Wischer, Ilse and Gabriele Diewald (eds.): New Reflections on Grammaticalization. 2002. xiv, 437 pp. 48 Shibatani, Masayoshi (ed.): The Grammar of Causation and Interpersonal Manipulation. 2002. xviii, 551 pp. 47 Baron, Irène, Michael Herslund and Finn Sørensen (eds.): Dimensions of Possession. 2001. vi, 337 pp. 46 Aikhenvald, Alexandra Y., R.M.W. Dixon and Masayuki Onishi (eds.): Non-canonical Marking of Subjects and Objects. 2001. xii, 364 pp. 45 Bybee, Joan and Paul J. Hopper (eds.): Frequency and the Emergence of Linguistic Structure. 2001. vii, 492 pp. 44 Voeltz, F. K. Erhard and Christa Kilian-Hatz (eds.): Ideophones. 2001. x, 436 pp. 43 Gildea, Spike (ed.): Reconstructing Grammar. Comparative Linguistics and Grammaticalization. 2000. xiv, 269 pp. 42 Diessel, Holger: Demonstratives. Form, function and grammaticalization. 1999. xii, 205 pp. 41 Frajzyngier, Zygmunt and Traci S. Walker-Curl (eds.): Reciprocals. Forms and functions. Volume 2. 2000. xii, 201 pp. 40 Frajzyngier, Zygmunt and Traci S. Walker-Curl (eds.): Reflexives. Forms and functions. Volume 1. 2000. xiv, 286 pp. 39 Payne, Doris L. and Immanuel Barshi (eds.): External Possession. 1999. ix, 573 pp. 38 Siewierska, Anna and Jae Jung Song (eds.): Case, Typology and Grammar. In honor of Barry J. Blake. 1998. 395 pp. 37 Giacalone-Ramat, Anna and Paul J. Hopper (eds.): The Limits of Grammaticalization. 1998. vi, 307 pp. 36 Newman, John (ed.): The Linguistics of Giving. 1998. xv, 373 pp. 35 Givón, T. (ed.): Grammatical Relations. A functionalist perspective. 1997. viii, 350 pp. 34 Givón, T. (ed.): Conversation. Cognitive, communicative and social perspectives. 1997. viii, 302 pp. 33 Fox, Barbara A. (ed.): Studies in Anaphora. 1996. xii, 518 pp. 32 Bybee, Joan and Suzanne Fleischman (eds.): Modality in Grammar and Discourse. 1995. viii, 575 pp. 31 Gernsbacher, Morton Ann and T. Givón (eds.): Coherence in Spontaneous Text. 1995. x, 267 pp. 30 Downing, Pamela A. and Michael Noonan (eds.): Word Order in Discourse. 1995. x, 595 pp. 29 Kahrel, Peter (PJK) and René van den Berg (eds.): Typological Studies in Negation. 1994. x, 385 pp. 28 Givón, T. (ed.): Voice and Inversion. 1994. viii, 402 pp. 27 Fox, Barbara A. and Paul J. Hopper (eds.): Voice: Form and Function. 1994. xiii, 377 pp. 26 Lord, Carol: Historical Change in Serial Verb Constructions. 1993. x, 273 pp. 25 Svorou, Soteria: The Grammar of Space. 1994. xiv, 290 pp. 24 Perkins, Revere D.: Deixis, Grammar, and Culture. 1992. x, 245 pp. 23 Kemmer, Suzanne: The Middle Voice. 1993. xii, 300 pp. 22 Payne, Doris L. (ed.): Pragmatics of Word Order Flexibility. 1992. viii, 320 pp. 21 Downing, Pamela A., Susan D. Lima and Michael Noonan (eds.): The Linguistics of Literacy. 1992. xx, 334 pp. 20 Croft, William, Suzanne Kemmer and Keith Denning (eds.): Studies in Typology and Diachrony. Papers presented to Joseph H. Greenberg on his 75th birthday. 1990. xxxiv, 243 pp. 19:2 Closs Traugott, Elizabeth and Bernd Heine (eds.): Approaches to Grammaticalization. Volume II. Types of grammatical markers. 1991. xii, 558 pp. 19:1 Closs Traugott, Elizabeth and Bernd Heine (eds.): Approaches to Grammaticalization. Volume I. Theoretical and methodological issues. 1991. xii, 360 pp. 18 Haiman, John and Sandra A. Thompson (eds.): Clause Combining in Grammar and Discourse. 1988. xiii, 428 pp. 17 Hammond, Michael, Edith A. Moravcsik and Jessica Wirth (eds.): Studies in Syntactic Typology. 1988. xiv, 380 pp. 16 Shibatani, Masayoshi (ed.): Passive and Voice. 1988. xi, 706 pp.

15 Austin, Peter (ed.): Complex Sentence Constructions in Australian Languages. 1988. vii, 289 pp. 14 Hinds, John, Shoichi Iwasaki and Senko K. Maynard (eds.): Perspectives on Topicalization. The case of Japanese WA. 1987. xi, 307 pp. 13 Never published. 12 Nedjalkov, Vladimir P. (ed.): Typology of Resultative Constructions. Translated from the original Russian edition (1983). Translation edited by Bernard Comrie. 1988. xx, 573 pp. 11 Tomlin, Russell S.: Coherence and Grounding in Discourse. Outcome of a Symposium, Eugene, Oregon, June 1984. 1987. viii, 512 pp. 10 Ransom, Evelyn N.: Complementation: its Meaning and Forms. 1986. xii, 226 pp. 9 Bybee, Joan: Morphology. A Study of the Relation between Meaning and Form. 1985. xii, 235 pp. 8 Slobin, Dan I. and Karl Zimmer (eds.): Studies in Turkish Linguistics. 1986. vi, 294 pp. 7 Craig, Colette G. (ed.): Noun Classes and Categorization. Proceedings of a symposium on categorization and noun classification, Eugene, Oregon, October 1983. 1986. vii, 481 pp. 6 Haiman, John (ed.): Iconicity in Syntax. Proceedings of a symposium on iconicity in syntax, Stanford, June 24–26, 1983. 1985. vi, 402 pp. 5 Rutherford, William E. (ed.): Language Universals and Second Language Acquisition. 1984. ix, 264 pp. 4 Chisholm, William, Louis T. Milic and John A.C. Greppin (eds.): Interrogativity. A colloquium on the grammar, typology and pragmatics of questions in seven diverse languages, Cleveland, Ohio, October 5th 1981-May 3rd 1982. 1984. v, 302 pp. 3 Givón, T.: Topic Continuity in Discourse. A quantitative cross-language study. 1983. vi, 492 pp. 2 Haiman, John and Pamela Munro (eds.): Switch Reference and Universal Grammar. Proceedings of a symposium on switch reference and universal grammar, Winnipeg, May 1981. 1983. xv, 337 pp. 1 Hopper, Paul J. (ed.): Tense-Aspect. Between semantics & pragmatics. 1982. x, 350 pp.

Rethinking Grammaticalization: New perspectives (Typological Studies in Language)

New Reflections on Grammaticalization Vol 49 (Typological Studies in Language)

Theoretical and Empirical Issues in Grammaticalization (Typological Studies in Language)

Benefactives and Malefactives: Typological Perspectives and Case Studies (Typological Studies in Language)

Adpositions: Pragmatic, Semantic and Syntactic Perspectives (Typological Studies in Language)

Coordinating Constructions (Typological Studies in Language)

Prototypical Transitivity (Typological Studies in Language)

Ideophones (Typological Studies in Language, 44)

Coordinating Constructions (Typological Studies in Language)

Dimensions of Possession (Typological Studies in Language)

Ideophones (Typological Studies in Language, 44)

Deconstructing Creole (Typological Studies in Language)

Coordinating Constructions (Typological Studies in Language)

Studies in African Linguistic Typology (Typological Studies in Language)

Creoles, their Substrates, and Language Typology (Typological Studies in Language)

Studies in Evidentiality: v. 54 (Typological Studies in Language)

Language Contact: New perspectives

Pathways of Change: Grammaticalization in English (Studies in Language Companion)

New Perspectives in Language, Discourse and Translation Studies

A Typology of Purpose Clauses (Typological Studies in Language)

Reflexives: Forms and Functions (Typological Studies in Language)

Case, Typology and Grammar (Typological Studies in Language)

The Linguistics of Giving (Typological Studies in Language)

Rethinking Documentary: New Perspectives and Practices

Language Attrition: Theoretical perspectives (Studies in Bilingualism)

Passivization And Typology: Form And Function (Typological Studies in Language)

Rethinking Evil: Contemporary Perspectives

Comparative Fascist Studies: New Perspectives

Modality-Aspect Interfaces: Implications and typological solutions (Typological Studies in Language)

New Perspectives in Language, Discourse and Translation Studies (Second Language Learning and Teaching)

Interaction of Morphology and Syntax: Case studies in Afroasiatic (Typological Studies in Language)

Rethinking Grammaticalization: New perspectives (Typological Studies in Language)

New Reflections on Grammaticalization Vol 49 (Typological Studies in Language)

Theoretical and Empirical Issues in Grammaticalization (Typological Studies in Language)

Benefactives and Malefactives: Typological Perspectives and Case Studies (Typological Studies in Language)

Adpositions: Pragmatic, Semantic and Syntactic Perspectives (Typological Studies in Language)

Coordinating Constructions (Typological Studies in Language)

Prototypical Transitivity (Typological Studies in Language)

Ideophones (Typological Studies in Language, 44)

Coordinating Constructions (Typological Studies in Language)

Dimensions of Possession (Typological Studies in Language)

Ideophones (Typological Studies in Language, 44)

Deconstructing Creole (Typological Studies in Language)

Coordinating Constructions (Typological Studies in Language)

Studies in African Linguistic Typology (Typological Studies in Language)

Creoles, their Substrates, and Language Typology (Typological Studies in Language)

Studies in Evidentiality: v. 54 (Typological Studies in Language)

Language Contact: New perspectives

Pathways of Change: Grammaticalization in English (Studies in Language Companion)

New Perspectives in Language, Discourse and Translation Studies

A Typology of Purpose Clauses (Typological Studies in Language)

Reflexives: Forms and Functions (Typological Studies in Language)

Case, Typology and Grammar (Typological Studies in Language)

The Linguistics of Giving (Typological Studies in Language)

Rethinking Documentary: New Perspectives and Practices

Language Attrition: Theoretical perspectives (Studies in Bilingualism)

Passivization And Typology: Form And Function (Typological Studies in Language)

Rethinking Evil: Contemporary Perspectives

Comparative Fascist Studies: New Perspectives

Modality-Aspect Interfaces: Implications and typological solutions (Typological Studies in Language)

New Perspectives in Language, Discourse and Translation Studies (Second Language Learning and Teaching)

Interaction of Morphology and Syntax: Case studies in Afroasiatic (Typological Studies in Language)

Recommend Documents