Voicing in Japanese


Author: Jeroen van de Weijer | Kensuke Nanjo | Tetsuo Nishihara


Voicing in Japanese


Studies in Generative Grammar 84

Editors

Harry van der Hulst Jan Koster Henk van Riemsdijk

Mouton de Gruyter Berlin · New York

Voicing in Japanese

Edited by

Jeroen van de Weijer Kensuke Nanjo Tetsuo Nishihara

Mouton de Gruyter Berlin · New York

Mouton de Gruyter (formerly Mouton, The Hague) is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.

The series Studies in Generative Grammar was formerly published by Foris Publications Holland.

Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.

Library of Congress Cataloging-in-Publication Data
Voicing in Japanese / edited by Jeroen van de Weijer, Kensuke Nanjo, Tetsuo Nishihara. p. cm. – (Studies in generative grammar ; 84) Includes bibliographical references and index. ISBN-13: 978-3-11-018600-0 (cloth : alk. paper) ISBN-10: 3-11-018600-4 (cloth : alk. paper) 1. Japanese language – Phonetics. I. Weijer, Jeroen Maarten van de, 1965– . II. Nanjo, Kensuke. III. Nishihara, Tetsuo, 1961– . IV. Series. PL541.V65 2005 495.61158–dc22 2005031093

Bibliographic information published by Die Deutsche Bibliothek
Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the Internet at <http://dnb.ddb.de>.

ISBN-13: 978-3-11-018600-0 ISBN-10: 3-11-018600-4 ISSN 0167-4331
© Copyright 2005 by Walter de Gruyter GmbH & Co. KG, D-10785 Berlin. All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Cover design: Christopher Schneider, Berlin. Printed in Germany.

Preface

Most of the work on this book was done while the first editor was a Research Fellow at the Netherlands Institute for Advanced Study in the Humanities and Social Sciences (NIAS) in Wassenaar in the period 2002–2003. We are extremely grateful to NIAS for the tranquil yet productive environment in which the ideas expressed in this book could be conceived and reflected upon. First versions of some of the papers in this volume were presented at a workshop at the Linguistics and Phonetics 2002 (LP2002) conference, held from September 2–6, 2002 at Meikai University in Urayasu, Japan. We are grateful to the organisers for giving us the opportunity to have this workshop, and to the audience for helpful discussion and suggestions.

Jeroen van de Weijer, Kensuke Nanjo and Tetsuo Nishihara

Leiden, Summer 2005

Contents

Preface

Voicing in Japanese
    Jeroen van de Weijer, Kensuke Nanjo and Tetsuo Nishihara

Part I – Consonant voice

Rendaku: Its domain and linguistic conditions
    Haruo Kubozono
Sequential voicing, postnasal voicing, and Lyman’s Law revisited
    Keren Rice
Sei-daku: diachronic developments in the writing system
    Kazutoshi Ohno
The representation of laryngeal-source contrasts in Japanese
    Kuniya Nasukawa
Rendaku in inflected words
    Timothy J. Vance
Ranking paradoxes in consonant voicing in Japanese
    Haruka Fukazawa and Mafuyu Kitahara
The implicational distribution of prenasalized stops in Japanese
    Noriko Yamane-Tanaka
The correlation between accentuation and Rendaku in Japanese surnames: a morphological account
    Hideki Zamma
A survey of Rendaku in loanwords
    Tomoaki Takayama
Recognizing Japanese numeral-classifier combinations
    Keiichiro Suzuki

Part II – Vowel voice

Corpus-based analysis of vowel devoicing in spontaneous Japanese: an interim report
    Kikuo Maekawa and Hideaki Kikuchi
Syllable structure and its acoustic effects on vowels in devoicing environments
    Mariko Kondo
The effect of speech rate on devoiced accented vowels in Osaka Japanese
    Miyoko Sugito
Where voicing and accent meet: their function, interaction, and opacity problems in phonological prominence
    Shin-ichi Tanaka

Bibliography
Index of authors
Index of languages
Index of subjects

Voicing in Japanese Jeroen van de Weijer, Kensuke Nanjo and Tetsuo Nishihara

This book presents a number of studies which focus on the [voice] grammar of Japanese, paying particular attention to historical background, dialectal diversity, phonetic experiment, and phonological analysis. Voicing processes in both consonants (such as Sequential Voicing, henceforth Rendaku) and vowels (such as vowel devoicing) are examined. The book presents a number of new analyses of well-known data that have been controversial in phonological debate in the past, but it also presents new (or rediscovered) data, partly through the work of Japanese scholars that hitherto went mostly unnoticed, partly through new database research, and partly through phonetic experiment. In this introduction, we will briefly introduce the different contributions and point out their respective interests.

There are two parts to the book: (1) consonant voice, (2) vowel voice. In the consonant part, the contribution by Kubozono presents a point of departure by introducing many of the voicing phenomena in Japanese, and also pointing out some of the relevant dialectal differences. Let us briefly review the most important of these in very general terms. For details and refinements, we refer to the contributions that follow.

Rendaku is a rule of Japanese which voices the initial consonant of the second member of a compound, if certain phonological and syntactic conditions are satisfied. Consider the following examples (taken from various standard sources):

(1)  shima ‘island’ + kuni ‘country’ → shima-guni ‘island country’
     maki ‘roll’ + sushi ‘sushi’ → maki-zushi ‘rolled sushi’
     oo ‘large’ + tanuki ‘badger’ → oo-danuki ‘large badger’

However, if the second member of the compound already contains a voiced obstruent, Rendaku is not allowed to occur, as the following examples show (again, we will not go into various important issues, but only illustrate the “received” wisdom):

(2)  doku ‘poison’ + tokage ‘lizard’ → doku-tokage ‘poisonous lizard’
     oo ‘large’ + kaze ‘wind’ → oo-kaze ‘big wind’ (Vance 1987)

The first non-Japanese researcher who wrote about this blocking effect was Lyman (1894), which is why the condition is commonly referred to as “Lyman’s Law”. In other literature, the condition is referred to as Motoori Norinaga’s Law. A point of controversy in the literature is whether this “Law” has exceptions or not. In recent work, Haraguchi (2003) points out that the exceptions can all be analysed by making reference to independently motivated principles of grammar, such as morphological constituency.

A number of issues are distinguished with respect to Rendaku: is it an exceptionless rule? Tamamura (1989) points out that Rendaku actually occurs in only 60% of the noun-noun compounds in which it could occur. If Rendaku is not a regular rule, how should the exceptions be accounted for? How should the phonological and syntactic conditions be formalized? To what part(s) of the lexicon does the rule apply? How long has it been part of the grammar of Japanese? Do loanwords undergo it? What does the rule tell us about the specification of voicing on obstruents and on sonorants in the phonology of Japanese? All these issues are dealt with at length in the contributions that follow, but let us pick out three major issues here.

First, it appears to be the case that some lexical items that would at first glance be expected to undergo Rendaku do not undergo it, or undergo the rule in some compounds but not in others. This “variable” or “unpredictable” behaviour can be approached from a number of viewpoints. Has the rule been completely lexicalized? Are there still subregularities? Does analogy play a role? How are new formations treated? These questions play a major role in the contributions by Kubozono and Ohno, while certain syntactic conditions that were assumed up to now are scrutinized and dismissed in the contribution by Vance. The fact that a rule is variable presents obvious problems for speech recognition software.
This problem is dealt with in the paper by Suzuki.

A second issue concerns the stratification of the Japanese lexicon. As is well known, the Japanese language incorporates a number of layers, or vocabulary from different sources, which are subject to partly different restrictions. An obvious question is how many layers there are, about which there is some controversy in the literature. A first layer comes from native Japanese, and is known as Yamato. A second layer comes from early Chinese well-incorporated loans, known as Sino-Japanese. Other borrowings may be either fairly well incorporated (in which case they constitute the “Loan” stratum) or not (in which case they constitute the “Foreign” stratum). Finally, onomatopoeic forms constitute a class of their own, which is usually referred to as the Mimetic stratum. The question is whether Rendaku applies in all of these strata, or rather, since it does not, how the difference in application among the strata should be formalized, especially in the light of the idea that the grammars of languages consist of a single constraint hierarchy. This issue is discussed in detail in the contribution by Fukazawa and Kitahara, while the question whether Rendaku applies in loanwords is explored by Takayama.

A final issue touched upon here is the specification of the distinctive feature [voice]. The nature of this feature has been the topic of fierce debate in the literature: is it binary or unary? Is it specified for obstruents and sonorants in the same way? Recall that Rendaku voices obstruents in compounds under certain conditions. Nasals (or other sonorants) do not play a role, which suggests that they are not (underlyingly) specified for voice. In another voicing process (postnasal voicing), however, nasals seem to impose their voicedness on a following stop, so that sequences such as …nt… are not allowed (again, making provisions for stratumhood).
In this respect, therefore, nasals do seem to bear a specification for voicing, which presents an interesting paradox, especially, again, in the light of the idea that the grammars of languages consist of a single constraint hierarchy, which maps inputs directly onto outputs, without intermediary stages. This paradox is the primary focus of the contribution by Rice.

Of particular interest are the historical papers in the volume. Two articles shed light on the historical dimension: where does Rendaku come from and was it always an irregular process? Ohno investigates this question using the earliest sources and finds, among other things, that Rendaku has always been irregular. Yamane-Tanaka investigates the close relation between voicing and prenasalization (which is still evident in some varieties that have prenasalized stops or nasal vowels; cf. Nishihara 2002) and offers an OT-style analysis both of the dialect-geographical situation and the historical development.

It is well known that Japanese is a pitch-accent language, and this raises the question whether there is a relation between accent on the one hand, and Rendaku on the other. Zamma investigates precisely this relation in personal names and in place names, giving an Optimality account which again bears on the stratification of the lexicon.

The second part of the book deals with vowel voice, and here vowel devoicing, the second process which Japanese is well known for, is the main topic. Again, let us briefly illustrate without going into details. In most Japanese dialects, there is a phonological rule of high vowel devoicing. If the high vowels /i, u/ appear between voiceless sounds and/or word boundary, these vowels are devoiced. Furthermore, Maekawa (1988) points out that devoicing of the low vowel /a/ and the mid vowel /o/ sometimes takes place, although not as frequently as that of the high vowels. Both the facts, which differ greatly between dialects, speech styles, etc., and their phonological interpretation are topics of debate in the contributions that follow.

The stage is set by the article by Maekawa and Kikuchi, who present a great deal of factual data concerning vowel devoicing, including information on which vowels are devoiced, in which environments, whether consecutive vowels can be devoiced, etc. These facts are taken from new database research. The contribution by Kondo looks at vowel devoicing from a syllable/mora perspective: does vowel devoicing affect syllable structure, or, alternatively, does syllable structure constrain the process? The paper by Tanaka looks at the concept of “prominence” in general, which bears on the voiced-voiceless distinction, which is relevant to consonants as well as vowels, but also on accent. Sugito presents the results of an experiment on vowel devoicing: how well do speakers recognise the accent that was present on the devoiced vowel?

We hope that this volume succeeds in putting some of the received wisdom with respect to voicing in Japanese to the test from both a phonological and a phonetic perspective.

Rendaku: Its domain and linguistic conditions Haruo Kubozono

1. Two kinds of rendaku voicing

The main purpose of this paper is to review past work on rendaku, or sequential voicing, with a main focus on its domain and linguistic conditions, and to summarize remaining questions for future work. One of the most fundamental questions regarding Japanese rendaku concerns its linguistic nature: is it a productive process or is it no more than a property of specific lexical entries? The first hypothesis emphasizes the productivity of rendaku and defines it as a productive phonological (or morphophonological) process of voicing that permits lexical exceptions (Otsu 1980, Itô and Mester 1986, 2003; cf. Kuroda 2002). On the other hand, the second hypothesis focuses on the extremely large number of lexical exceptions and attributes rendaku to a lexical property of certain words (Ohno 2000).

Let us consider the pair hiragana and katakana (the two types of kana letters), for example. Etymologically, these words are made up of two morphemes, /hira+kana/ and /kata+kana/. In the course of history, the first word underwent voicing as in (1), while the second did not.

(1)  hira + kana → hiragana

According to the first hypothesis mentioned above, this historical process of voicing remains a productive process in modern Japanese by which /hira+kana/ turns into /hiragana/. The word katakana is regarded as an exception to this synchronic process. The second hypothesis, in contrast, posits hiragana and katakana as underlyingly /hiragana/ and /katakana/, respectively: the presence of voicing in the first word and the lack of voicing in the second are lexical properties of the respective words. These two hypotheses are difficult to assess because rendaku is extremely productive in modern Japanese, on the one hand, and, on the other, it admits an extremely large number of exceptions whose exceptionality is difficult to explain.

One important study that has tackled this difficult issue is the experimental work by Shinji and Suzy Fukuda (Fukuda and Fukuda 1999). They looked at children with a language disorder called ‘specific language impairment’ (henceforth ‘SLI’). People with SLI are linguistically normal in every respect except that they cannot apply productive grammatical rules to morphological/syntactic strings. For example, native English speakers with SLI are unable to produce plural forms for countable nouns (2a) or to add the ending /s/ to a verb to mark the third person singular (2b).

(2)  a. I have three apple.
     b. Mary walk in the yard.

Fukuda and Fukuda (1999) examined how native Japanese speakers with the same language impairment produce Japanese utterances. Specifically, they looked at the way their eight- to twelve-year-old subjects produced voicing in compound nouns. If the subjects should fail to produce voicing in words like /hiragana/, then it would mean that voicing is a productive rule in modern Japanese, hence supporting the first hypothesis mentioned above. If, on the other hand, the subjects should produce voicing in words like /hiragana/ just as normal native speakers do, then it would suggest that the phonological form with voicing is a lexical form of the word, namely, that voicing has been lexicalized and is not produced by rule in the synchronic grammar.

What Fukuda and Fukuda (1999) found is a compromise between the two predictions. On the one hand, their subjects showed voicing in some basic compound nouns like nagagutu ‘long + shoes; boots’, suggesting that voicing in these words is part of their underlying representation. On the other hand, they also showed lack of voicing in non-frequent and novel compounds such as those in (3a), which were pronounced with voicing by normal native speakers of the same age group as shown in (3b).

(3)  a. kotoba + tukai → kotoba-tukai ‘language use’
        kotoba + hon → kotoba-hon ‘language book’
     b. kotoba + tukai → kotoba-dzukai ‘language use’
        kotoba + hon → kotoba-bon ‘language book’

This latter result reveals a contrast between normal speakers and speakers with SLI, with the former but not the latter group of speakers being able to produce voicing in non-frequent and novel compounds. This suggests that voicing in non-frequent and novel compounds should be attributed to a productive rule and, hence, that there exists a productive process of voicing in normal speakers’ grammars.


Fukuda and Fukuda’s experimental data are interesting in that they reveal that some instances of rendaku voicing are lexicalized while others are due to a productive rule. Native speakers of Japanese deal with the first type of voicing by memorizing the form with voicing as a lexical entry. In contrast, they deal with the second type of voicing by acquiring a voicing rule, or rendaku rule, and applying it to unfamiliar and novel compounds. What remains unclear is the boundary between the two kinds of voicing, more specifically, between ‘frequent’ and ‘non-frequent’ compounds. This will be an intriguing empirical question for future research.

2. Lyman’s Law revisited

2.1. Original version

One of the best-known conditions concerning the domain of rendaku is the so-called ‘Lyman’s Law’, which can be defined as in (4). Representative examples are given in (5) in contrast to those that are not subject to the condition.

(4)  Rendaku is blocked in a compound word [AB] if B already contains a dakuon, or a voiced obstruent.
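The condition in (4) can be stated procedurally. The following Python sketch is only an illustration under simplifying assumptions that are not part of the chapter: morphemes are given in the romanization used here, the four pairs k/g, s/z, t/d and h/b stand in for the full set of alternations (affricated outputs such as dz in kotoba-dzukai are ignored), and lexical exceptions such as katakana are not modelled. The names `RENDAKU_MAP`, `contains_dakuon` and `apply_rendaku` are hypothetical.

```python
# Hypothetical sketch of Lyman's Law as defined in (4).
# Assumption: one letter per consonant in the chapter's romanization.
VOICED_OBSTRUENTS = set("bdgz")
RENDAKU_MAP = {"k": "g", "s": "z", "t": "d", "h": "b"}


def contains_dakuon(morpheme: str) -> bool:
    """True if the morpheme already contains a voiced obstruent."""
    return any(ch in VOICED_OBSTRUENTS for ch in morpheme)


def apply_rendaku(first: str, second: str) -> str:
    """Combine two members; voice the initial consonant of the second
    member unless Lyman's Law blocks it."""
    initial = second[0]
    if initial in RENDAKU_MAP and not contains_dakuon(second):
        second = RENDAKU_MAP[initial] + second[1:]
    return f"{first}-{second}"


if __name__ == "__main__":
    for a, b in [("aka", "huda"), ("uwa", "huta"), ("ai", "kagi"), ("ama", "kaki")]:
        print(a, "+", b, "->", apply_rendaku(a, b))
```

Under these assumptions the second member stays unvoiced whenever it already contains b, d, g or z, and is voiced otherwise; a voiced obstruent in the first member, as in temuzu, plays no role in this original version of the condition.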

(5)  a. aka + huda → aka-huda, *aka-buda ‘red tag’
        cf. uwa + huta → uwa-buta ‘top lid’
            roten + huro → roten-buro ‘outdoor, bath; outdoor bath’
     b. ai + kagi → ai-kagi, *ai-gagi ‘duplicate key’
        cf. ama + kaki → ama-gaki ‘sweet, persimmon; sweet persimmon’
            umi + kame → umi-game ‘sea, turtle; sea turtle’
     c. yama + kazi → yama-kazi, *yama-gazi ‘forest fire’
        cf. wa + kasi → wa-gasi ‘Japanese cake’
            temuzu + kawa → temuzu-gawa ‘Thames, river; River Thames’

As these examples indicate, Lyman’s Law represents a case of the OCP (Obligatory Contour Principle) by which an identical element or feature is prohibited from occurring more than once within a certain domain. In rendaku, the feature in question is [+voice, +obstruent], with the relevant domain of OCP being the morpheme or the second member of a compound. While this is a well-known fact in Japanese phonology, there are certain cases where Lyman’s Law requires a larger domain. This can be seen rather clearly in the data provided by Sugito (1965), which we will consider in detail in the next section.

2.2. Sugito’s data and Lyman’s Law

Sugito (1965) looked at the alternation between /ta/ and /da/ shown by the morpheme ta ‘rice field’ as it is combined with a bimoraic morpheme to form a personal name: e.g. /siba-ta/ vs. /ima-da/. This particular morpheme exhibits a rather clear pattern of alternation, which is more or less predictable from the consonant in the immediately preceding mora.1 The results of Sugito’s analysis can be summed up as follows.

(6)  a. The morpheme is usually realized as [da] when it is immediately preceded by a mora containing either /s/, /m/, /n/, /t/, or /k/, as well as when it is preceded by a heavy syllable (except a syllable containing a moraic obstruent).
     b. The morpheme is invariably realized as [ta] when it is preceded by a mora containing either /d/, /b/, /g/, /z/ or /y/, or when it is preceded by a moraic obstruent.
     c. The morpheme is predominantly realized as [ta] but permits [da] occasionally when it is preceded by a mora containing either /r/ or /w/.

Representative examples are given below.

(7)  a. asa-da, hama-da, sana-da, kata-da, huku-da; soo-da, sai-da, kan-da
     b. kubo-ta, kado-ta, naga-ta, mizu-ta, haya-ta
     c. ari-ta vs. hara-da, iwa-ta vs. sawa-da

We can develop Sugito’s analysis one step further and reinterpret the data in terms of natural classes.2 This reanalysis leads to the generalization in (8). The contrast between (8a) and (8b) is illustrated in (9).

(8)  a. /da/ is preferred after voiceless obstruents and nasals.
     b. /da/ is prohibited after voiced obstruents.
     c. /ta/ is preferred after approximants.
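The generalization in (8) amounts to a lookup on the consonant of the mora that immediately precedes the morpheme ta. The sketch below is a hypothetical illustration, not Sugito’s or the author’s procedure. It assumes the chapter’s romanization with one letter per consonant, treats a first member that ends in a moraic nasal or in a second vowel (kan, soo, sai) as ending in a heavy syllable, as in (6a), and does not handle moraic obstruents. The labels "categorical" and "tendency" are introduced here only to separate (8b) from the preferences in (8a) and (8c).

```python
# Hypothetical sketch of generalization (8) for the /ta/-/da/ alternation.
VOWELS = set("aiueo")
VOICED_OBSTRUENTS = set("bdgz")
VOICELESS_OBSTRUENTS = set("kst")  # the voiceless obstruents listed in (6a)
NASALS = set("mn")
APPROXIMANTS = set("rwy")


def final_mora_onset(morpheme: str):
    """Onset consonant of the final mora, or None when that mora has no
    onset (moraic nasal as in 'kan', second vowel as in 'soo' or 'sai')."""
    if morpheme[-1] not in VOWELS:
        return None
    if len(morpheme) >= 2 and morpheme[-2] not in VOWELS:
        return morpheme[-2]
    return None


def predict_ta_da(first: str):
    """Return (form, strength) for the morpheme 'ta' after `first`."""
    onset = final_mora_onset(first)
    if onset is None:
        return ("da", "tendency")       # heavy syllable, cf. (6a)
    if onset in VOICED_OBSTRUENTS:
        return ("ta", "categorical")    # (8b)
    if onset in APPROXIMANTS:
        return ("ta", "tendency")       # (8c)
    if onset in VOICELESS_OBSTRUENTS or onset in NASALS:
        return ("da", "tendency")       # (8a)
    raise ValueError(f"consonant {onset!r} is not covered by (8)")


if __name__ == "__main__":
    for name in ["huku", "hama", "kubo", "ari", "kan"]:
        form, strength = predict_ta_da(name)
        print(f"{name}-{form} ({strength})")
```

Because (8a) and (8c) are preferences rather than rules, the sketch returns the majority form for such contexts; names like hara-da or saka-ta fall outside what it predicts.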

(9)  a. huku-da, kasi-da, kusu-da, asi-da, kase-da, kaku-da, sima-da, naka-da (or naka-ta), kata-da
     b. hugu-ta, kazi-ta, kuzu-ta, azi-ta, kaze-ta, kagu-ta, siba-ta, kubo-ta, naga-ta, sugi-ta, kado-ta

In terms of the markedness of voicing, this generalization means that /ta/ is chosen if the consonant in the immediately preceding mora has the feature [+voice], whereas /da/ may be chosen if the consonant in question is [-voice] or unspecified with respect to voicing (as in nasals). Here the variation between /ta/ and /da/ for some words like naka + ta (/nakata/~/nakada/) does not directly concern us. What is of interest is the fact that /da/ is never permitted if the immediately preceding mora already contains a voiced obstruent. This is a clear case of application of the OCP, or an extension of Lyman’s Law (see (4)). In (5), rendaku voicing is blocked if the second element of the compound already contains a voiced obstruent. In (9), the same process is blocked if the first element ends in a mora containing a voiced obstruent. The similarity between the two cases is obvious: presence of a voiced obstruent in the neighborhood prevents rendaku from creating another voiced obstruent. The OCP effect in (9b) is particularly interesting because voiced obstruents block rendaku across a morpheme boundary. This extended effect of OCP in rendaku is not a new finding, however. Kindaichi et al. (1988: 264), citing examples like /maga-tama/ ‘ancient accessory’ and /mizu-tama/ ‘polka dot’, note that this ‘law’ has existed in Japanese since the ancient period. According to Itô and Mester (2003), this was originally reported by Tatsumaro Ishizuka in 1801. Itô and Mester claim that the domain of Lyman’s Law has narrowed from the word (prosodic word) in Old Japanese to the morpheme in modern Japanese. It remains unclear why this domain change has taken place and how the old effect of Lyman’s Law leaves its trace in modern Japanese. These are very interesting questions for future work. Returning to Sugito’s data regarding the /ta/-/da/ alternation, there are several additional facts that are worthy of special attention. 
One is the fact that approximants (/r/, /w/ and /y/) pattern more or less with voiced obstruents, while nasals (/m/ and /n/) pattern with voiceless obstruents. These two groups of sounds form a natural class in Japanese phonology in that they are all voiced and lack voiceless counterparts. It is puzzling that they pattern differently with respect to the /ta/-/da/ alternation. Particularly mysterious is the behavior of approximants which, as summarized in (10), tend to show the same behavior as voiced obstruents.

(10)            -/ta/   -/da/
     /rV/-       31       3
     /wV/-        7       2
     /yV/-        8       0

In terms of the markedness of voicing, approximants should pattern with voiceless obstruents and nasals since they involve an unmarked value of voice. This ‘unmarkedness’ shows itself very clearly in the general cases of Lyman’s Law which we saw in (5) above. Namely, unlike voiced obstruents, approximants do not block the voicing process when they occur in the second member of compound nouns, and they pattern exactly with voiceless obstruents and nasals in this respect. It is very strange to find that approximants display the same pattern as voiceless obstruents and nasals with respect to Lyman’s Law in the original sense, while they pattern with voiced obstruents in the extended version of the same law.

Another interesting fact about the /ta/-/da/ alternation concerns the peculiar behavior of /r/. As shown in (10), /r/ predominantly prefers /ta/ rather than /da/. The three exceptions to this in Sugito’s data are /hara-da/, /tera-da/ and /tora-da/, in all of which /da/ is preceded by the low vowel /a/.3 This suggests that the choice between /ta/ and /da/ after /r/ is also influenced by the quality (or height, to be more exact) of the immediately preceding vowel, i.e. the final vowel of the preceding morpheme. This possibility is also worth exploring.

A final noteworthy fact about Sugito’s data is that /k/ behaves somewhat differently from other voiceless obstruents. While /t/ and /s/ invariably choose /da/, /k/ admits quite a few exceptions as the following statistics and examples show.

(11)            -/ta/   -/da/
     /sV/-        0      35
     /tV/-        0      26
     /kV/-       13      31

(12) a. /ta/: iku-ta, aki-ta, oki-ta, kaki-ta, maki-ta, saka-ta
     b. /da/: huku-da, oka-da, taka-da, toku-da, oku-da, kaku-da, ike-da, take-da, huka-da

It is true that /k/ prefers /da/ rather than /ta/, but it is obviously different from other voiceless obstruents in the extent to which it tolerates /ta/. The reason for this peculiar behavior of /k/ remains unclear.
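The counts in (10) and (11) can be turned into proportions to make the contrast between the consonant classes easier to compare. The short Python sketch below does only that arithmetic; the dictionary `COUNTS` copies the figures from the two tables, and the function name `da_rate` is a hypothetical label.

```python
# Counts from (10) and (11): consonant -> (names with -ta, names with -da).
COUNTS = {
    "r": (31, 3),
    "w": (7, 2),
    "y": (8, 0),
    "s": (0, 35),
    "t": (0, 26),
    "k": (13, 31),
}


def da_rate(consonant: str) -> float:
    """Proportion of names realized with /da/ after the given consonant."""
    ta, da = COUNTS[consonant]
    return da / (ta + da)


if __name__ == "__main__":
    for consonant, (ta, da) in COUNTS.items():
        total = ta + da
        print(f"/{consonant}V/-: {da}/{total} = {da_rate(consonant):.1%} with /da/")
```

On these figures /da/ occurs in 3 of 34 names after /r/ (about 9%), 2 of 9 after /w/ (about 22%) and none after /y/, against all names after /s/ and /t/ and 31 of 44 (about 70%) after /k/.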

2.3. Summary

In the preceding section we have seen Sugito’s data concerning the /ta/-/da/ alternation in personal names consisting of three moras. It should be clear now that in this particular type of compound noun, Lyman’s Law exerts its effect in a wider domain than is usually assumed. The same effect is found in many pairs of personal names including /naga-sima/ vs. /naka-zima/, /naga-sawa/ vs. /naka-zawa/ and /naga-saki/ vs. /naka-zaki/, whose second members fluctuate between the two forms.

This being said, it is also important to point out that not all morphemes or personal names exhibit the same extended effect of Lyman’s Law. Restricting ourselves to personal names, we find that some morphemes invariably undergo rendaku even when they are preceded by a voiced obstruent. The morphemes sono ‘garden’ and huti ‘the depth, an abyss’, for example, are voiced regardless of what morpheme they are combined with. Indeed, these morphemes invariably undergo rendaku as long as they are in a non-initial position of a compound.

(13) a. hoka-zono, mae-zono, kubo-zono, azi-zono, naka-zono, naga-zono, sugi-zono, eno-ki-zono
     b. naka-buti, naga-buti, sugi-buti

On the other hand, some morphemes tend to resist rendaku voicing in any context. The morphemes hara ‘field’ and saka ‘slope’ may be such morphemes, which are realized as /hara/ and /saka/, respectively, in most cases:

(14) a. oo-hara, o-hara, naga-hara, naka-hara, saka-hara; cf. kanbara
     b. e-saka, oo-saka, naka-saka, no-saka, ta-saka

Most morphemes, including ta discussed in the preceding subsection, fall between these two extremes. A closer examination of compound nouns may reveal a more general nature of the extended effect of Lyman’s Law sketched in (8)–(9) as well as the degree to which rendaku voicing is morpheme-dependent.

3. Branching constraint

A second major condition on rendaku voicing in Japanese is the so-called ‘branching constraint’ (Otsu 1980; Kubozono 1988). This constraint can be defined as follows.

(15) Rendaku is blocked in the second member of a right-branching compound.

Otsu (1980) gives the following pairs to illustrate the effect of this constraint.

(16) a. Right-branching compounds
        nuri + [hasi + ire] → nuri-hasi-ire ‘lacquered, chopstick, case; chopstick case which is lacquered’
        nise + [tanuki + siru] → nise-tanuki-ziru ‘pseudo, raccoon dog, soup; raccoon dog soup that is not authentic’
     b. Left-branching compounds
        [nuri + hasi] + ire → nuri-basi-ire ‘lacquered, chopstick, case; case for lacquered chopsticks’
        [nise + tanuki] + siru → nise-danuki-ziru ‘pseudo, raccoon dog, soup; soup made from a pseudo raccoon dog’

In (16a), the second member forms a constituent with the third rather than the first member. Corresponding to this morphosyntactic structure, rendaku voicing is blocked between the first and second members although it is not blocked between the second and third. In contrast, rendaku is not blocked in (16b), where the second as well as the third member can undergo the process. Sato (1989) adds the following pair to illustrate the same effect:

(17) a. mon + [siro + tyoo] → mon-siro-tyoo, *mon-ziro-tyoo ‘armorial bearing, white, butterfly; white cabbage butterfly’
     b. [o + siro] + wasi → o-ziro-wasi ‘tail, white, eagle; white-tailed eagle’

The status of the branching constraint may be questioned, however, despite the examples in (16) and (17). For one, it is difficult to find clear cases showing its effect. The compound nouns in (16) are novel compounds for many native speakers of Japanese, who do not necessarily have clear-cut intuitions about the presence or absence of voicing in the pairs of expressions. The compounds in (17) are existing expressions, but it is difficult to find more examples showing a similar effect. Moreover, the branching constraint may be questioned by the existence of expressions that apparently defy its effect. Some of these counterexamples are given in (18).
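The contrast in (16) can be made concrete by treating a compound as a binary tree. The Python sketch below is a hypothetical illustration and not Otsu’s formalization: a compound is a morpheme (a string) or a pair `(left, right)`, and the initial consonant of the right-hand constituent is voiced only when that constituent is a single morpheme, which is one way of encoding (15). Voicing itself follows the simplified mapping k/g, s/z, t/d, h/b together with the blocking condition in (4); the names `voice`, `realize` and `compound` are assumptions made for this example.

```python
# Hypothetical sketch of the branching constraint (15) over binary trees.
VOICED_OBSTRUENTS = set("bdgz")
RENDAKU_MAP = {"k": "g", "s": "z", "t": "d", "h": "b"}


def voice(morpheme: str) -> str:
    """Voice the initial consonant unless Lyman's Law (4) blocks it."""
    has_dakuon = any(ch in VOICED_OBSTRUENTS for ch in morpheme)
    if morpheme[0] in RENDAKU_MAP and not has_dakuon:
        return RENDAKU_MAP[morpheme[0]] + morpheme[1:]
    return morpheme


def realize(node):
    """Return the surface morphemes of a morpheme or a (left, right) pair."""
    if isinstance(node, str):
        return [node]
    left, right = node
    left_part = realize(left)
    if isinstance(right, str):
        # Non-branching right-hand member: rendaku may apply.
        right_part = [voice(right)]
    else:
        # Branching right-hand member: its first morpheme stays unvoiced.
        right_part = realize(right)
    return left_part + right_part


def compound(node) -> str:
    return "-".join(realize(node))


if __name__ == "__main__":
    print(compound(("nuri", ("hasi", "ire"))))   # right-branching
    print(compound((("nuri", "hasi"), "ire")))   # left-branching
```

With these assumptions the four structures in (16) and the left-branching form in (17b) come out as cited. The sketch has no way of marking lexical exceptions or morphemes that never voice, so it is not expected to reproduce forms that defy the constraint.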

(18) oo + [huro + siki] → oo-buro-siki ‘big, bath, carpet; big talk’
     mati + [hi + kesi] → mati-bi-kesi ‘town, fire, to extinguish; fire brigade for common people’

While the status of the branching constraint may thus be questioned, it can be supported by more general phonological considerations. Kubozono (1988) provided evidence that the process of accentual phrasing characteristic of compound formation exhibits essentially the same prosodic asymmetry between right-branching and left-branching structures. This is illustrated in (19), where /’/ denotes a lexical accent and is placed immediately after the accented mora. Words without this mark are so-called unaccented words, which involve no abrupt pitch fall at the phonetic output. { } indicates an accentual phrase, or a prosodic word (PrWd).

(19) a. Right-branching compounds
        do’itu + [bu’ngaku + kyookai] → {do’itu}{bungaku-kyo’okai} ‘Germany, literature, association; German Association of literature’
     b. Left-branching compounds
        [do’itu + bu’ngaku] + kyookai → {doitu-bungaku-kyo’okai} ‘Association of German literature’

In right-branching compounds, accentual phrasing is blocked between the first and second members with the result that the first member forms an accentual phrase independent of the second and third members. Their left-branching counterparts, in contrast, do not exhibit such an accentual split and, consequently, constitute one unified accentual unit. This contrast between left-branching and right-branching compounds is equivalent to the situation of rendaku blocking shown in (16) and (17). Unlike the case of rendaku voicing, there are a number of compound nouns in Japanese that exhibit an accentual contrast as shown in (19). More significantly, the right-branching structure is subject to a similar branching constraint at the phrasal level, where intonational phrases called ‘minor phrases’ are formed.
This post-lexical process, too, is blocked in right-branching structure, but not in left-branching structure (Kubozono 1988, 1995). All these observations indicate that a branching constraint of the sort in (15) is independently motivated in Japanese phonology. All we need to do is to define the constraint in a more general form as in (20): (20a) and (20b) are synonymous in descriptive terms.4

(20) Branching constraint
     a. Phonological unification is blocked in the right-branching structure.
     b. Phonological unification is blocked between two constituents, A and B, if B does not c-command A.

An equally interesting fact about the branching constraint thus redefined is that it also applies to phonological processes in languages other than Japanese. In English, for example, compound nouns exhibit an asymmetry between left-branching and right-branching constructions, with the latter but not the former failing to conform to the general strong-weak pattern of compound stress of the language (Chomsky and Halle 1968; Liberman and Prince 1977). Essentially the same asymmetry is observed in Chinese. In this tone language, right-branching phrases fail to undergo the well-known tone sandhi rule whereby a sequence of two tone 3s (falling-rising tone) is converted into a sequence of tone 2 (rising tone) and tone 3 (falling-rising tone) (Hirose et al. 1994). Thus, a string of 3-3-3 tones turns into 2-2-3 via 2-3-3 if it forms a left-branching structure, but the same string tends to yield 3-2-3 in a right-branching structure. A similar effect is observed in the tone sandhi rule in Ewe, a tone language in Africa (Clements 1978). It is also reported that consonant lengthening in Italian is blocked in right-branching constructions (Napoli and Nespor 1976). It is an open empirical question whether this structural constraint is observed in a wider range of languages, but it obviously represents a rather general constraint on phonological processes that has cross-linguistic significance. One last question that remains unanswered is why phonological processes in Japanese and other languages are subject to the structural constraint formulated in (20) – or, equivalently, why right-branching structure exhibits such a marked phonological pattern. Kubozono (1995) proposed two hypotheses. One is that the right-branching structure displays irregular phonological behavior in languages where the right-branching structure is syntactically/morphologically marked. This interpretation is consistent with the fact that the left-branching structure is unmarked, at least statistically, in Japanese compounds and phrases as well as in English compounds. If this interpretation is correct, it is expected that left-branching rather than right-branching structures will show marked, exceptional behavior in right-branching languages. A second hypothesis put forward by Kubozono (1995) is that the right-branching constraint in (20) is universal and applies to compound nouns irrespective of whether the left-branching or right-branching structure is syntactically/morphologically unmarked in a particular
language. These two hypotheses must be compared and evaluated by examining phonological markedness in a wider range of languages. This is certainly another interesting topic for future work that will require detailed cross-linguistic comparisons.
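
The formulation in (20b) can be made concrete with a minimal sketch. It assumes that a compound is represented as a nested pair of uniquely labelled members and that c-command is defined as in note 4; the function names are hypothetical and are introduced here for illustration only. The sketch computes where (20b) predicts blocking and says nothing about lexical exceptions such as those in (18).

```python
# Illustrative sketch only: compounds are nested 2-tuples whose leaves are
# unique strings, e.g. ("nuri", ("hasi", "ire")) for a right-branching compound.

def find_path(tree, leaf, path=()):
    """Return the tuple of child indices leading from the root to `leaf`, or None."""
    if tree == leaf:
        return path
    if isinstance(tree, tuple):
        for i, child in enumerate(tree):
            found = find_path(child, leaf, path + (i,))
            if found is not None:
                return found
    return None


def c_commands(tree, a, b):
    """True if leaf `a` c-commands leaf `b`: the first branching node that
    dominates `a` (its parent in a binary tree) also dominates `b`."""
    pa, pb = find_path(tree, a), find_path(tree, b)
    if pa is None or pb is None or pa == pb:
        return False
    parent = pa[:-1]
    return pb[:len(parent)] == parent


def unification_blocked(tree, left, right):
    """(20b): unification is blocked between A (left) and B (right)
    if B does not c-command A."""
    return not c_commands(tree, right, left)


right_branching = ("nuri", ("hasi", "ire"))
left_branching = (("nuri", "hasi"), "ire")
print(unification_blocked(right_branching, "nuri", "hasi"))
print(unification_blocked(left_branching, "nuri", "hasi"))
```

Under these assumptions the two structures in (16) differ only at the boundary between the first and second members, which is where rendaku is reported to be blocked in the right-branching case.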

4. Mora constraint

4.1. The alternation between /hon/ and /bon/

We have seen in the preceding section that the constituent structure of a compound noun can serve as a condition on rendaku voicing and, accordingly, determine the domain of this productive process. In addition to this, there are cases where the domain of the rule is determined by the phonological length of the compound. One such case is compounds that contain the morpheme hon ‘book’ as their second member. According to Ohno (2000), this morpheme exhibits an alternation between the underlying form /hon/ and the rendaku form /bon/, depending on the phonological length of the element with which it is combined (N1).5 A crucial boundary lies between bimoraic and trimoraic N1. If the N1 is monomoraic or bimoraic, hon fails to undergo rendaku and manifests itself as /hon/. If the N1 is three moras long or longer, on the other hand, hon undergoes the voicing rule and yields /bon/. These two patterns are formulated in (21) and illustrated in (22).6,7

(21) N1+hon undergoes rendaku voicing if N1 is longer than two moras.

(22) a. e-hon ‘picture book’, aka-hon ‘red book, or a brand name of a publisher’s books for entrance examinations’, ero-hon ‘erotic book’, huru-hon ‘secondhand book’
     b. bunko-bon ‘paperback book’, manga-bon ‘comic book’, tyuuko-bon ‘secondhand book’, etti-bon ‘erotic book’, tankoo-bon ‘independent book’, pinku-bon ‘pink, book; pornographic book’, karaa-bon ‘colored book’

Two additional points should be emphasized here. First, the phonological length of N1 must be defined in terms of the mora and not the syllable. This is clearly shown by compounds such as /pinku-bon/ and /karaa-bon/ whose N1 consists of three moras but two syllables. These bisyllabic morphemes do not pattern with bimoraic and bisyllabic morphemes like /aka/ ‘red’ and /ero/ ‘erotic’. Another noteworthy point is that the morphological complexity of N1 does not matter. The monomoraic and bimoraic N1s in (22a) are all monomorphemic while N1 in (22b) consists of more than one morpheme in most cases. This reflects the fact that hon, a Sino-Japanese (SJ) morpheme, tends to be combined with another SJ morpheme (or morphemes) and that each SJ morpheme is up to two moras long. However, the morphological structure of the N1 does not directly concern the boundary between /hon/ and /bon/. This is shown by monomorphemic N1s such as /pinku/ ‘pink’ and /karaa/ ‘color’, which clearly pattern with bimorphemic words like /bunko/ ‘bibliotheca, papeterie’ and /manga/ ‘cartoon’ and not with monomorphemic words like /aka/ ‘red’ and /ero/ ‘erotic’. Having justified the generalization in (21), we must point out that this rule applies specifically to compound nouns with hon, and not to other compounds. Indeed, many morphemes other than hon do not conform to the pattern in (21). We saw in section 2 above that the morpheme ta ‘rice field’ can undergo voicing even when it is combined with a bimoraic noun. Moreover, some morphemes like ha ‘tooth’ and kame ‘turtle’ undergo rendaku even when they are combined with bimoraic nouns as in (23a), while others are not subject to voicing whether they are combined with bimoraic or trimoraic nouns, as shown in (23b).

(23) a. musi + ha → musi-ba ‘a decayed tooth’
        umi + kame → umi-game ‘sea, turtle; turtle’
        mayu + ke → mayu-ge ‘eyebrow, hair; eyebrow’
     b. migi + te → migi-te ‘right, hand; the right hand’
        hidari + te → hidari-te ‘left, hand; the left hand’
        kasegi + te → kasegi-te ‘to earn, hand; bread winner’

While the rule in (21) is not a general constraint on rendaku in Japanese compound nouns, it does not follow that the mora-based generalization represents an idiosyncratic rule in Japanese phonology. The rule in (21) can be reinterpreted as follows if we consider the phonological length of the whole word rather than the length of individual components.

(24) hon undergoes rendaku if the entire word consists of more than four moras.

This generalization means that rendaku does not occur in Noun-hon if this whole word is up to four moras long. In other words, the morpheme hon
preserves its underlying form /hon/ in four-mora or shorter words, while it undergoes some phonological process characteristic of compounds in words of five or more moras. Interestingly, essentially the same contrast between words up to four moras and those composed of five or more moras is observed in several other phonological processes independent of rendaku voicing in Japanese. Let us first consider the process that Itô and Mester (1996) called ‘contraction’ in SJ compounds.
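
Before turning to contraction, the length condition in (21) and (24) can be illustrated with a minimal sketch. The mora counter below is an assumption made for illustration: it relies on the romanization used in this chapter, in which long vowels are written double, geminates are written as doubled consonants, and the moraic nasal is an n that is not followed by a vowel or y. The function names are hypothetical.

```python
VOWELS = set("aeiou")


def count_moras(form):
    """Count moras in a romanized form under the conventions assumed above."""
    s = form.lower().replace("-", "")
    moras = 0
    for i, ch in enumerate(s):
        nxt = s[i + 1] if i + 1 < len(s) else ""
        if ch in VOWELS:
            moras += 1          # every vowel letter counts as one mora
        elif ch == "n" and nxt not in VOWELS and nxt != "y":
            moras += 1          # moraic nasal
        elif ch == nxt:
            moras += 1          # first half of a geminate consonant
    return moras


def hon_compound(n1):
    """Apply (21): /bon/ if N1 is longer than two moras, otherwise /hon/."""
    return n1 + "-" + ("bon" if count_moras(n1) > 2 else "hon")


for n1 in ["e", "aka", "huru", "bunko", "etti", "pinku", "karaa"]:
    print(n1, count_moras(n1), hon_compound(n1))
```

Because hon itself is bimoraic, testing whether N1 exceeds two moras is the same as testing whether the whole word exceeds four moras, which is the restatement given in (24). The sketch does not cover forms such as /bini-bon/ that arise through shortening (note 6).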

4.2. Contraction in SJ compounds

One type of SJ morpheme has a (C)VC structure with an optional onset. Morphemes of this type can only take a voiceless obstruent, /t/ or /k/, in the coda position, and exhibit two phonological patterns in SJ compounds, depending on the initial segment of the following morpheme. In many cases, they undergo vowel epenthesis in order to avoid closed syllables or voiced geminates. This is illustrated in (25), where < > and /./ denote an epenthetic vowel and a syllable boundary, respectively.

(25) gak + bu → ga.k<u>.bu, *gak.bu, *gab.bu ‘learning, part; faculty’
     but + ri → bu.t<u>.ri, *but.ri, *bur.ri ‘substance, law; physics’

On the other hand, (C)VC morphemes do not undergo epenthesis if the following morpheme begins with a voiceless consonant. This is the pattern that Itô and Mester (1996) termed ‘contraction’. The only minor change that the morphemes in question undergo is place assimilation, whereby the morpheme-final consonant becomes homorganic with the initial consonant of the following morpheme.8 This is illustrated in (26).

(26) gak + kai → gak.kai, *ga.k<u>.kai ‘learning, party; academic society’
     but + si → bus.si, *bu.t<u>.si ‘Buddha, teacher; a sculptor of Buddhist images’
     gak + ka → gak.ka, *ga.k<u>.ka ‘learning, department; department’
     but + ka → buk.ka, *bu.t<u>.ka ‘thing, price; commodity prices’

The contraction process in (26) has the effect of combining the two morphemes in a straightforward manner. This process, however, is blocked if there is a word boundary between the two morphemes. In other words, the contraction in (26) occurs only if the two adjacent morphemes form a constituent. This is illustrated in (27a), where the constituency is shown by [ ].

SJ compounds in (27b), in contrast, readily undergo the contraction since they do not involve a word boundary between the second and third elements.

(27) a. [dai + but] + si → dai.bu.t<u>.si, *dai.bus.si ‘great, Buddha, teacher; a sculptor of big Buddhist images’
        [sin + gak] + ka → sin.ga.k<u>.ka, *sin.gak.ka ‘god, learning, department; department of religion’
     b. dai + [but + si] → dai.bus.si, *dai.bu.t<u>.si ‘great, Buddha, teacher; a great sculptor of Buddhist images’
        sin + [gak + ka] → sin.gak.ka, *sin.ga.k<u>.ka ‘new, learning, department; a new department’

Itô and Mester (1996) interpret the constituency effect illustrated in (27) as a constraint on the domain of the contraction process: Contraction occurs within a PrWd, which consists of one or two morphemes. Since every SJ morpheme is at most two moras long, this domain constraint can be reinterpreted as in (28).

(28) Contraction occurs in the domain of up to four moras.

Contraction has taken place in (26) and (27b) since the (C)VC morphemes in question are embedded in a word of up to four moras. In (27a), by contrast, (C)VC morphemes are combined with the following CV(C) morphemes in a larger word. In terms of phonological length, this fact can be reduced to a constraint requiring that the maximal domain of contraction be a constituent consisting of four moras. In other words, two morphemes can be combined without undergoing vowel epenthesis if they form a four-mora or shorter word. This is precisely the same domain constraint that we saw for hon above, which does not undergo the compound rule of rendaku voicing if it is embedded in a four-mora or shorter word.
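
The choice between epenthesis and contraction described in this section can be sketched as follows. This is an illustration under stated assumptions rather than a full analysis: morphemes are given in their underlying romanized forms (with surface /h/ entered as underlying /p/), the epenthetic vowel is taken to be u as in the examples above, the restriction on morpheme-final /k/ follows note 8, and the `same_prwd` flag stands in for the constituency shown by [ ] in (27). The function name is hypothetical.

```python
VOICELESS = set("ptks")  # assumed inventory of voiceless onsets in underlying forms


def combine_sj(m1, m2, same_prwd=True):
    """Join a (C)VC Sino-Japanese morpheme m1 with a following morpheme m2."""
    coda, onset = m1[-1], m2[0]
    if coda not in "tk":
        return m1 + m2                  # no coda obstruent: simple concatenation
    contract = same_prwd and onset in VOICELESS
    if coda == "k" and onset != "k":
        contract = False                # note 8: final /k/ contracts only before /k/
    if contract:
        return m1[:-1] + onset + m2     # coda becomes homorganic with the onset
    return m1 + "u" + m2                # vowel epenthesis (vowel assumed to be u)


print(combine_sj("gak", "bu"))
print(combine_sj("but", "si"))
print(combine_sj("daibut", "si", same_prwd=False))
```

The third call corresponds to the left-branching structure in (27a), where the word boundary before si places the two morphemes in different prosodic words and epenthesis applies instead of contraction.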

4.3. /p/-/h/ alternation in SJ compounds

/p/-/h/ alternation in SJ compounds shows the same domain effect as the process of vowel epenthesis. This is illustrated with the morpheme hitu ‘pencil’ here. It is generally assumed that the underlying form of this alternation is /p/, which alternates with /h/ in a predictable way. Itô and Mester (1996) showed that morphemes involving this alternation preserve the underlying form with /p/ when they follow a morpheme ending in a moraic nasal /N/:

(29) eN + pitu → em-pitu ‘lead, pen; pencil’
     haN + patu → ham-patu ‘opposite, start; rebel’

However, this does not happen in the two environments in (30) even if they are preceded by a moraic nasal. In (30a), the /p/-morpheme is combined with a SJ compound; in (30b), it forms a constituent with the following morpheme before it does with the preceding one. In these two cases, /p/-initial morphemes do not keep their underlying /p/ and, hence, take /h/ instead ([ ] denotes a constituent):

(30) a. [maN + neN] + pitu → mannen-hitu, *mannem-pitu ‘ten thousand, year, pen; fountain pen’
     b. siN + [patu + mei] → sin-hatumei, *sim-patumei ‘new, invention; new invention’

Since SJ morphemes are maximally bimoraic, the boundary between the /p/ pattern in (29) and the /h/ pattern in (30) can be defined as follows:

(31) /p/ is preserved if it is in the non-initial position of words consisting of up to four moras; otherwise, it is realized as /h/.

This domain effect is identical to the one we saw in the preceding section as well as the ones we will see in the next sections.

4.4. Accent of mimetics

Japanese exhibits some accentual processes that are sensitive to the four-mora domain. One of them is the accentuation of reduplicative mimetic expressions. The base form of Japanese mimetic expressions is largely bimoraic with accent on the initial syllable. When these bimoraic bases are reduplicated to form four-mora words, they are usually accented on their initial syllable. In other words, only the first member of the reduplicated form preserves its accent. This is an accent pattern characteristic of reduplicative mimetics (32) and reduplicative nouns (33) as well as dvandva, i.e. coordinate, compound nouns (34) (Nasu 2001).

(32) yu’ra + yu’ra → yu’ra-yura ‘(to sway) gently’
     su’ru + su’ru → su’ru-suru ‘(to climb) smoothly’
     ba’ta + ba’ta → ba’ta-bata ‘(to fall) noisily, one after another’

(33) mura’ + mura’ → mura’-mura ‘village, village; villages’
     ka’zu + ka’zu → ka’zu-kazu ‘number, number; in a great number, numerous’

(34) yo’ru + hiru’ → yo’ru-hiru ‘night and day’
     hiru’ + yo’ru → hiru’-yoru ‘day and night’
     a’sa + ban → a’sa-ban ‘morning and evening’

Accent deletion of the second member does not seem to occur, however, if the bimoraic base is reduplicated after being combined with a mimetic ending such as /ri/ and the moraic nasal /N/.9

(35) yura’ri + yura’ri → yura’ri yura’ri ‘(to sway) in slow motion’
     suru’ri + suru’ri → suru’ri suru’ri ‘(to dodge) swiftly’
     bata’N + bata’N → bata’N bata’N ‘thumpety thump’

The contrast between (32) and (35) indicates that four-mora mimetics constitute a prosodic word (PrWd), or one accentual unit, whereas six-mora mimetics form two PrWds. This provides further support to the claim that the maximal length of a PrWd is four moras.

4.5. Accent of numeral sequences

Similar accentual evidence can be found with the pronunciation of numeral sequences. SJ morphemes for numbers are underlyingly monomoraic or bimoraic, but they are invariably pronounced with a bimoraic length when enumerated in a string of numbers (Itô 1990). Thus, monomoraic morphemes, /ni/ ‘two’ and /go/ ‘five’, are pronounced with a long vowel: [ni:] and [go:]. What is interesting here is that numeral sequences are divided into prosodic words each consisting of two morphemes, or four moras. This shows up very clearly in citing telephone numbers, as exemplified in (36). { } denotes the domain of PrWd, while H and L stand for high and low tones, which are assigned to every mora here for the sake of description.

(36) a. 03-6825-7194
        {reesa’n}  {rokuha’ti}  {niigo’o}  {nanai’ti}  {kyuuyo’n}
         LHHL       LHHL         LHHL       LHHL        LHHL
     b. 721-2875
        {nanani’i}  {iti’}  {niiha’ti}  {nanago’o}
         LHHL        LH      LHHL        LHHL

In (36b), the string of three numbers, 721, is realized in two PrWds, with the first two numbers forming a four-mora unit, and the last number constituting a separate PrWd. This clearly demonstrates that the optimal length of PrWds is maximally four moras. Interestingly, the same maximality constraint operates in other dialects, too. (37) shows how the string in (36b) is pronounced in Kinki (Kyoto/Osaka) dialects (Fukui 1990). In fact, Tokyo and Kinki dialects differ only in the tonal pattern of four-mora PrWds: four-mora strings are pronounced with the tonal pattern of LHHL in Tokyo, and with the pattern of LLHL in Kinki.10 Two-mora PrWds are pronounced with the original (or lexical) accentual pattern of the relevant morpheme in both dialects.

(37) 721-2875
     {nanani’i}  {i’ti}  {niiha’ti}  {nanago’o}
      LLHL        HL      LLHL        LLHL

The facts in (36) and (37) clearly show that the maximal size of a PrWd is four moras in number enumerations. Compound nouns can form a longer PrWd, as exemplified in (19), but this is due to a morphological requirement demanding correspondence between morphological and prosodic words (or edges). The facts discussed in this and the preceding sections reveal an emergence of the unmarked, or an optimal phonological shape of PrWds in Japanese.
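
The grouping of digits into prosodic words in (36) and (37) can be sketched as a simple left-to-right pairing within each hyphen-delimited group. The readings in the table are the bimoraic enumeration forms that appear in (36); the function name and the dictionary layout are hypothetical. The tonal pattern is assigned only to four-mora PrWds, since two-mora PrWds keep a lexical accent that is not modelled here.

```python
READINGS = {  # bimoraic enumeration forms as they appear in (36)
    "0": "ree", "1": "iti", "2": "nii", "3": "san", "4": "yon",
    "5": "goo", "6": "roku", "7": "nana", "8": "hati", "9": "kyuu",
}
FOUR_MORA_TONES = {"tokyo": "LHHL", "kinki": "LLHL"}


def prosodic_words(number, dialect="tokyo"):
    """Split a hyphen-delimited digit string into (reading, tones) pairs,
    one pair per prosodic word."""
    result = []
    for group in number.split("-"):
        for i in range(0, len(group), 2):
            pair = group[i:i + 2]
            reading = "".join(READINGS[d] for d in pair)
            # A leftover single digit forms its own two-mora PrWd; its
            # lexical tone pattern is left unspecified (None).
            tones = FOUR_MORA_TONES[dialect] if len(pair) == 2 else None
            result.append((reading, tones))
    return result


print(prosodic_words("721-2875"))
print(prosodic_words("721-2875", dialect="kinki"))
```

The pairing never crosses a hyphen, so a group with an odd number of digits ends in a two-mora PrWd, as with the final digit of 721 in (36b).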

4.6. Morphological evidence

Finally, morphological evidence reinforces our claim that the optimal form of a PrWd in Japanese is up to four moras long and, hence, that the mora condition on the voicing of /hon/ in (24) is of a rather general nature in Japanese phonology. Let us consider truncation first. One of the most basic characteristics of loanword truncation in Japanese is that long words are converted into four-mora or shorter forms: e.g. /irasutoreesyon/ → /irasuto/ ‘illustration’ (Itô 1990). Truncation of compounds is subject to essentially the same condition to yield four-mora outputs in most cases: e.g. /poketto monsutaa/ → /pokemon/ ‘pocket monster, or Pokémon’. This process admits three-mora outputs in some contexts, but never permits five-mora or longer outputs. The same maximality constraint applies to other morphological processes such as the formation of zuzya-go (a jazz musicians’ secret language). Zuzya-go formation involves metathesis by which the final two moras in the input are combined with the initial two moras to yield four-mora outputs. Here again, three-mora outputs are allowed in some contexts, but five-mora or longer outputs are absolutely illicit (Itô et al. 1996; Kubozono 2002b). This input-output correspondence, too, reveals a tremendous difference between structures with four and structures with five moras. All in all, the fact that five-mora or longer outputs are never tolerated in these morphological processes supports the idea that the optimal word form in Japanese is up to four moras long. There are, of course, quite a few words, mostly loanwords, that are morphologically simplex but phonologically longer than four moras: e.g. /irasutore’esyon/ ‘illustration’, /anime’esyon/ ‘animation’. But there is some accentual evidence suggesting that five-mora or longer loanwords are processed as phonological compounds, i.e. that five-mora or longer words are split into two four-mora or shorter substrings to which accent is assigned by the compound accent rule (Sato 2002; Kubozono 2002a). This, too, lends support to the idea that PrWds in Japanese are optimally up to four moras long.

5. Concluding remarks

In this paper I have first considered Fukuda and Fukuda’s (1999) neurolinguistic data suggesting that rendaku voicing falls into two kinds: voicing in some words is lexicalized, while voicing in other words is due to a productive synchronic process of voicing. In the rest of the paper, I have discussed three constraints on rendaku voicing: an extended version of Lyman’s Law, the branching constraint, and the mora constraint. These three constraints define the domain in which the productive process of rendaku voicing occurs in contemporary Japanese. The extended version of Lyman’s Law and the mora constraint apply only to a specific type of compound noun, while the branching constraint applies in a wider context. Despite this difference, all these constraints represent quite general conditions on phonological and morphological processes in Japanese. In this sense, the constraints on rendaku voicing should be interpreted in a wider context. These constraints, if examined in more detail, might uncover more interesting aspects and principles of (Japanese) phonology.

Notes

1. What counts here is the consonant in the preceding mora, not in the preceding syllable. This is clearly shown by den and goo, which yield /den-da/ and /goo-da/, not /den-ta/ and /goo-ta/, respectively, even though they contain a voiced obstruent (/d/ or /g/).

2. Sugito was mainly concerned with the relationship between the /ta/-/da/ distribution and the accentual pattern of the whole personal name. She found out that three-mora names ending in /ta/ are usually accented on their initial mora in Tokyo Japanese (e.g. si’ba-ta, ku’bo-ta), while those containing /da/, e.g. /ima-da/ and /sima-da/, tend to be unaccented. This is an interesting fact that needs to be explained.

3. We can add the word /kuro-da/ to Sugito’s list of exceptions.

4. Node A c(onstituent)-commands node B if neither A nor B dominates the other and the first branching node which dominates A dominates B (Reinhart 1976: 32). In the right-branching structure [[A][[B][C]]], [A] c-commands [B], but [B] does not c-command [A] because [B] forms a constituent with [C] rather than with [A]. In the left-branching structure [[[A][B]][C]], on the other hand, both [B] and [C] c-command [A].

5. The morpheme hon ‘book’ should be clearly distinguished from the numeral classifier hon which is used to count the number of objects such as fingers and pencils (e.g. /go-hon no yubi/ ‘five-hon Gen finger=five fingers’). This numeral morpheme alternates between three phonemic forms, /hon/, /bon/ and /pon/, depending on the phonetic property of the immediately preceding sound (Tanaka and Kubozono 1999).

6. An apparent exception to the generalization illustrated in (21) is the word /bini-bon/ ‘vinyl book, a book enclosed in vinyl’. This particular instance will not count as an exception since /bini-bon/ does not come directly from /bini-hon/, but from /biniiru-bon/ via shortening: namely, /biniiru + hon/ → /biniiru-bon/ → /bini-bon/.

7. There are some compounds which contain the morpheme hon but have lost its original meaning ‘book’, e.g. /mi-hon/ ‘a sample for sale’, /syoo-hon/ ‘an extract’, /hyoo-hon/ ‘a sample’. Interestingly, these lexicalized compounds conform to the generalization in (21).

8. Contraction is generally blocked if the first morpheme ends in /k/. In this case, vowel epenthesis instead of contraction occurs except when the second morpheme also begins with /k/. Thus, /hak+ti/ and /hak+sai/ undergo vowel epenthesis and turn into /hakuti/ ‘imbecility’ and /hakusai/ ‘Chinese cabbage’, respectively, while /hak+kyuu/ turns into /hakkyuu/ ‘white ball’. The fact that the morpheme-final /k/ blocks contraction reveals an interesting asymmetry between /t+k/ and /k+t/, which is called ‘coronal asymmetry’ by Itô and Mester (1996: 30). Thus, the former but not the latter triggers contraction: e.g. /bet+kak/ → /bekkaku/ ‘different style’ vs. /hak+ti/ → /hakuti/ ‘imbecility’. A similar asymmetry is observed in the morphophonology of native verbs, where a stem-final /k/ triggers vowel epenthesis rather than contraction when it is followed by a /t/-initial ending like the past marker /ta/. Thus, /kak + ta/ ‘to write (past)’ turns into /kakita/ (and subsequently /kaita/), whereas /yor + ta/ ‘to approach (past)’ and /hasir + ta/ ‘to run (past)’ turn into /yotta/ and /hasitta/, respectively.

9. We occasionally observe reduplicative mimetic forms that are five moras long. These five-mora forms seem to be split into two PrWds: e.g. {yu’ra} {yura’ri} ‘(to sway) gently’.

10. This tonal pattern is different from the typical pattern of nouns. In Tokyo Japanese, nouns are usually accented, if accented at all, on the third mora from the end of the word: the word ‘Nagasaki’, for example, is accented on /ga/ as in /naga’saki/. However, the tonal pattern characteristic of numeral sequences is also found in four-mora acronyms consisting of two letters. Thus, the words for ‘PC’ (personal computer) and ‘OL’ (office lady) are pronounced with an accent on the penultimate mora: /piisi’i/, /ooe’ru/. Alphabetic acronyms are different from numeral sequences, though, in that three-letter and longer acronyms form one unified PrWd that is longer than four moras: e.g. ‘PTA’ /piitiie’e/, ‘IBM’ /aibiie’mu/, ‘YMCA’ /waiemusiie’e/.

Sequential voicing, postnasal voicing, and Lyman’s Law revisited

Keren Rice

Japanese exhibits several processes that involve voicing. In this article I examine three of these: rendaku, Lyman’s Law, and post-nasal voicing. Rendaku and post-nasal voicing are generally considered together under the rubric of sequential voicing (e.g., Martin 1952; Vance 1987, 1996). Martin (1952: 48) defines sequential voicing, or voicing alternations, as the ‘replacement of a voiceless consonant with the corresponding voiced consonant’ and Vance (1987: 133) gives a similar definition, saying that sequential voicing or rendaku ‘refers to the replacement of a morpheme-initial voiceless obstruent with a voiced obstruent.’ These definitions do not consider the environment for voicing, nor the nature of the element that triggers voicing, but simply view voicing as a substitution of a voiced obstruent for a voiceless one. In the theoretical literature on voicing alternations in Japanese, it has often been assumed that the voicing feature that is active in sequential voicing is of a single type – see, for instance, Itô and Mester (1986, 1995a, 1999a), Itô, Mester, and Padgett (1995, 1999), Calabrese (1995), Labrune (1999), and Clements (2001) for some recent treatments. Others have argued that two types of voicing features exist phonologically in Japanese – see Rice (1993); Avery and Idsardi (2002), and Kuroda (2002). In the first part of this article, working within a representational framework, I address this issue, arguing that two types of phonological voicing are needed in Japanese, what I will call laryngeal voicing (LV) and what has been identified as sonorant voicing, or SV (e.g., Avery 1996; Piggott 1992; Rice 1993; Rice and Avery 1991). I argue that LV is the feature involved in what I will call rendaku, to be further defined below, and SV the feature involved in postnasal voicing in a non-rendaku environment. My concern in this article is with representations more than grammar, and I leave open exactly how the grammar is to be formalized. 
While the identification of two types of voicing accounts for one set of problems in the phonology of Japanese, namely the different patterning of voiced obstruents and nasals with respect to rendaku and Lyman’s Law, the solution that I offer runs counter to the claim that the lexicon of Japanese is synchronically stratified, with a constraint against post-nasal voiceless obstruents holding of the Yamato, or native, stratum of Japanese and not in the Sino-Japanese part of the lexicon.1 In the second part of the article, I review the problems with stratification between the Yamato and Sino-Japanese vocabularies based on the proposed constraint that the Yamato stratum of the lexicon requires that post-nasal obstruents be voiced while the Sino-Japanese stratum does not have this restriction, extending the discussion in Rice (1997).

1. Some background

Itô and Mester (1986) pose a conundrum in Japanese. First consider the much-studied process of rendaku, or sequential voicing. Several examples are given in (1).2 I use the Romanization used in the source.

(1) Rendaku
    a. voicing of initial consonant when no consonants follow
       take + sao       take-zao      ‘bamboo pole’      (Vance 1987: 133)
       bamboo pole
       kan + sya        kan-zya       ‘patient’          (Labrune 1999: 123)
       illness person
    b. voicing of initial consonant when voiceless obstruent follows
       de + kuči        de-guči       ‘exit’             (Itô and Mester 1986: 52)
       leave mouth
    c. voicing of initial consonant when sonorant follows
       iši + tooroo     iši-dooroo    ‘stone lantern’    (Vance 1987: 133)
       stone lantern
       ike + hana       ike-bana      ‘ikebana’          (Itô and Mester 1986: 53)
       arrange flower
       hon + tana       hon-dana      ‘book shelf’       (Labrune 1999: 123)
       book shelf
    d. no voicing of initial consonant when voiced obstruent follows
       doku + tokage    ‘poisonous lizard, Gila monster’ (Vance 1987: 137)
       poison lizard
       hyootan + kago   ‘gourd basket’                   (Itô and Mester 1986: 69)
       gourd basket

As the examples in (1) illustrate, rendaku fails to apply just in the case that a voiced obstruent follows within the morpheme that is the target of voicing. This blocking of rendaku by a voiced obstruent is conditioned by a process known as Lyman’s Law; see, for instance, Martin (1952), McCawley (1966), Vance (1987), Itô and Mester (1986, 1995a, 2003), and Fukazawa and Kitahara (2001) for discussion. Lyman’s Law disallows more than one voiced obstruent within a morpheme. This constraint holds whether both obstruents are lexically contained within a single morpheme (see (2)) or one would be derived through rendaku (as in (1d)).

(2) Lyman’s Law: morpheme internal
    futa ‘lid’      buta ‘pig’
    fuda ‘sign’     *buda         (Itô and Mester 1995a: 819)
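
The interaction between rendaku and Lyman’s Law described above can be sketched as a simple check followed by a substitution. The sketch is deliberately minimal and rests on assumptions: forms are plain romanized strings, the voiced obstruents are taken to be b, d, g and z (palatal affricates are left out), and the voicing pairs are limited to those seen in (1). The function names are hypothetical, and no claim is made about how the grammar itself should be formalized.

```python
VOICED_OBSTRUENTS = set("bdgz")                     # assumed inventory for this sketch
RENDAKU = {"k": "g", "s": "z", "t": "d", "h": "b"}  # voiceless/voiced pairs seen in (1)


def violates_lymans_law(morpheme):
    """True if a single morpheme contains more than one voiced obstruent."""
    return sum(ch in VOICED_OBSTRUENTS for ch in morpheme) > 1


def apply_rendaku(n1, n2):
    """Voice the initial obstruent of n2 unless n2 already has a voiced obstruent."""
    if any(ch in VOICED_OBSTRUENTS for ch in n2):
        return f"{n1}-{n2}"             # blocked by Lyman's Law, as in (1d)
    first = n2[0]
    return f"{n1}-{RENDAKU.get(first, first)}{n2[1:]}"


print(apply_rendaku("take", "sao"))
print(apply_rendaku("doku", "tokage"))
print(violates_lymans_law("buda"))
```

Nasals and other sonorants are absent from the blocking set, so a form like tana still undergoes voicing after hon, in line with (1c).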

Itô and Mester (1986) treat Lyman’s Law as an OCP effect against two voiced segments within a particular domain. If the feature [voice] is contrastive for obstruents and redundant for sonorants, the exclusion of sonorants as blockers is easily explained. The term sequential voicing is also used to describe voicing that is found in a post-nasal environment; see, for instance, Martin (1952), Itô and Mester (1986), and Vance (1987, 1996). I refer to this as post-nasal voicing. An example is given in (3), showing alternations in form of the past tense morpheme.

(3) post-nasal voicing: morphologically derived environments
    mi-ta ‘look at, past’     (Vance 1987: 176)
    šin-da ‘die, past’        (Vance 1987: 177)

Itô and Mester (1995a, 2003) and Itô, Mester and Padgett (1995) also propose that post-nasal voicing is at play within morphemes. They argue that within the Yamato, or native, vocabulary of Japanese, voicing is redundant on post-nasal obstruents, as in (4). (4)

post-nasal voicing: within a morpheme
tombo    ‘dragonfly’   *tompo    (Itô and Mester 1995a: 819)
unzari   ‘disgusted’   *unsari   (Itô and Mester 1995a: 819)

The facts presented in this section leave us with the conundrum mentioned at the start – the voicing illustrated in (1) is blocked by Lyman’s Law and requires that voicing be absent from nasals (1c), but the process shown in


Keren Rice

(4), post-nasal voicing, requires the presence of voicing on nasals to trigger voicing assimilation. Itô and Mester (1986) lay out a variety of solutions, and suggest that rule ordering might offer the best solution, while Itô, Mester, and Padgett (1995) use antagonistic constraints to account for the facts; Clements (2001) offers an alternative analysis using a single voicing feature, as does Calabrese (1995). In this article, I take up my earlier proposal (Rice 1993, 1997; see also Avery and Idsardi 2002; Ohno this volume), that the voicing that underlies rendaku and the voicing that underlies post-nasal voicing are different in nature, with one being laryngeal voice and the other sonorant voice. I refer to this as the dual mechanism hypothesis, and to the alternative as the single mechanism hypothesis. I begin by briefly summarizing the account of rendaku and then examine post-nasal voicing.

2. Rendaku

It will be useful to begin by defining some terms. I use the term ‘sequential voicing’ to mean the overall effects discussed in section 1, namely the substitution of a voiced obstruent for a voiceless one without regard for details. I use the term ‘rendaku’ or ‘rendaku element’ in a very narrow sense, to refer to a voicing feature that functions as a compound formative (see Itô and Mester 1986, for instance); it is this feature that causes voicing in compounds. Finally, I use the term post-nasal voicing to refer to the voicing caused by a nasal in the absence of the rendaku element, as in (4).

With respect to rendaku, then, I assume an analysis like that proposed by Itô and Mester (1986): the voicing feature, i.e., the rendaku element, is part of a segment that occurs in some compounds (to be elaborated below) and provides voicing to the initial obstruent of the second member of the compound. Following the standard analysis, the association of the rendaku element is blocked by the occurrence of a voiced obstruent later in the morpheme. I refer to the feature involved as LV; representationally, I assume that it is dominated by a root node; see also Labrune (1999). Other than calling the feature LV rather than [voice], this analysis follows Itô and Mester (1986); see Labrune (1999) and Clements (2001) for other alternative analyses.
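The standard analysis just described can be sketched procedurally. The following is an illustration only, not the autosegmental formalization itself: it assumes a simplified one-letter romanization, a hypothetical table of voiceless initials and their voiced alternants, and it treats every compound passed to it as containing the rendaku element.

```python
# Voiced obstruents in a simplified one-letter romanization (an assumption of this sketch).
VOICED_OBSTRUENTS = set("bdgzj")

# Voiceless initials and their rendaku alternants; h alternates with b (ike + hana -> ike-bana).
RENDAKU_ALTERNANTS = {"k": "g", "s": "z", "t": "d", "h": "b"}


def apply_rendaku(first: str, second: str) -> str:
    """Join a compound, voicing the initial of `second` unless Lyman's Law blocks it."""
    initial, rest = second[0], second[1:]
    # The rendaku element cannot associate if a voiced obstruent follows in the morpheme.
    blocked = any(seg in VOICED_OBSTRUENTS for seg in rest)
    if initial in RENDAKU_ALTERNANTS and not blocked:
        second = RENDAKU_ALTERNANTS[initial] + rest
    return f"{first}-{second}"


for first, second in [("take", "sao"), ("ike", "hana"), ("hon", "tana"),
                      ("doku", "tokage"), ("hyootan", "kago")]:
    print(f"{first} + {second} -> {apply_rendaku(first, second)}")
```

The first three compounds from (1) surface with a voiced initial, while the two forms from (1d) stay voiceless because g follows later in the second member. The nasals in hon and hana play no role in the blocking check.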

3. Post-nasal voicing

I examine post-nasal voicing in two environments, between morphemes (section 3.1) and within a morpheme (section 3.2).

3.1. Post-nasal voicing between morphemes

Post-nasal voicing between morphemes is illustrated in (3). What is the source of post-nasal voicing? In Rice (1993) I argue that post-nasal voicing in a morphologically derived environment in Japanese is a consequence of a second type of voicing, sonorant voicing. Under the assumption that SV is inherent in nasals, in morphologically derived forms, post-nasal voicing results from the sharing of this feature by the nasal at the end of one morpheme and an adjacent obstruent at the beginning of the following morpheme; see Rice (1993). In Rice (1993) I made this proposal in order to account for the fact that nasals are transparent with respect to rendaku voicing but share their voicing with a following obstruent, arguing that two types of voicing are required to account for this chameleon-like patterning of nasals.

If this analysis is correct, what would be expected if underlying obstruent voicing (LV) and post-nasal voicing (SV) were to interact? Assuming these two types of voicing, one would expect that Lyman’s Law would not block voicing when the targeted morpheme contains an LV segment and sequential voicing is a consequence of post-nasal voicing, or the feature SV. Under the single mechanism hypothesis, on the other hand, with only a single type of voicing, both underlying voiced obstruents and voiced obstruents derived from post-nasal voicing involve the same type of voicing, and one would expect that Lyman’s Law would block post-nasal voicing if a voiced obstruent is already present in the morpheme.³

Recall that sequential voicing (in the general sense) affects the first consonant of the second element of a compound. In examples of sequential voicing, noun compounds where the first item ends in a vowel are usually used, and examples of post-nasal voicing are discussed separately (e.g., Martin 1952; Vance 1987).
Itô and Mester (1986) supply one example of Lyman’s Law at work when the first noun of a noun-noun compound ends in a nasal, (1d). I now examine compounds of other types than noun-noun to see what is found there. Vance (1987: 144) and Vance (this volume), in a discussion of compounds of other types (1987: 142–146), observe that sequential voicing does not usually apply in compounds where both members are verbs, and semantic and phonological conditions hold such that it sometimes applies in direct object-verb compounds and sometimes not; see also Labrune (1999), among others. Vance further recognizes a source for rendaku voicing, proposing that it comes from a reduced form of the genitive particle no (Vance 1982, 1987: 136; also Labrune 1999) that occurred between nouns; thus verb compounds would be excluded from being affected by sequential voicing due to the absence of this particle.

Based on Vance’s discussion, and on the examples used to illustrate rendaku, it appears that the rendaku element is not found in all compound types. Without going into detail, the rendaku element is not found in most compounds involving verbs as a second member. Assuming then that the rendaku morpheme is present in some noun compounds but that compounds with a verb as the second member (I will call these verb compounds) do not contain this element, it might be possible to sort out whether the rendaku element and post-nasal voicing have the same feature through an examination of verb compounds.

First consider cases where the rendaku element is present and the second element of the compound contains a voiced obstruent. In the rendaku environment, rendaku LV is not licensed as it would lead to a violation of Lyman’s Law, as discussed above. The example in (5), repeated from (1d), illustrates this. (5)

hyootan + kago      ‘gourd basket’      (Itô and Mester 1986: 69)

In this case, the initial obstruent of the second member of the noun compound (k) fails to voice due to Lyman’s Law; the final nasal of the first member of the compound cannot trigger voicing of the /k/ because it is not adjacent, as illustrated in (6). (6)

n        rendaku     k   …   g
|        |           |       |
Root     Root        Root    Root
|        |                   |
SV       LV                  LV

When the rendaku element is present, then, it takes precedence over a nasal in terms of being a trigger for voicing on the initial consonant of the second element of the compound. In (6), Lyman’s Law blocks the association of the rendaku LV to /k/, producing the unvoiced form. Nasal-final and vowel-final first elements of a compound pattern together, as it is the rendaku LV and not the nasal SV that has the opportunity to be implemented here.


Consider now forms with verbs as the second member. The prefixed forms in (7) are verbs which contain an initial element with a final nasal and a second element with an initial obstruent. The first morpheme is identified by Itô and Mester (1999a: 68) and Vance (this volume) as fum- ‘to step on’ (and hence these forms are treated as compounds) and by Vance (1987: 137) as fuN-, an unproductive emphatic prefix. (7)

Post-nasal voicing in verb compounds
tsukeru ‘attach’   fun-dzukeru   *fun-tsukeru   ‘trample on’   (Itô and Mester 1999a: 68)
kiru ‘cut’         fuN-giru      *fuN-kiru      ‘give up; take decisive action’   (Itô and Mester 1999a: 68; Vance this volume)

In these forms, the post-nasal obstruent voices. This can be accounted for by either the single voicing mechanism or the dual voicing mechanism hypothesis; under the single mechanism hypothesis, voicing on the initial consonant of the verb would be triggered not only by the rendaku element but also by a nasal (the same is true of forms with the verb suffix in (3)); under the dual mechanism hypothesis, rendaku in (1) is triggered by the LV of the rendaku element and post-nasal voicing in (3) and (7) by the SV of the final nasal of the first morpheme. In the environment where the rendaku element is missing, nasal-final and vowel-final first elements do not pattern together: the final nasal of the first element causes post-nasal voicing, but if these same verb stems follow a vowel-final morpheme, no voicing occurs; see (11) below.

Some additional examples of verb compounds are given in (8) and (9). These items have in common that their second element is suru, a verb meaning ‘do’ (Martin 1952: 49). When suru follows an element ending in a nasal, its initial is generally voiced (8); when it follows an element ending in a vowel, its initial is generally voiceless (9); there are lexical exceptions to both statements.⁴
(8)

Voicing of the initial consonant of suru ‘do’ following a nasal
a. karon-zuru   ‘to esteem, treat lightly’ (Martin 1952: 51; formal register, Bill Poser, p.c. June 2002)
   cf. karu(o)-si   adjective form (Kazutoshi Ohno, p.c., August 2002)
b. omon-zuru    ‘to esteem, treat highly’ (Martin 1952: 51; formal register, Bill Poser, p.c., June 2002)
c. kin-zuru     ‘forbid’ (Vance 1986: 139, formal register)
   kin          ‘prohibition’ [kin-jiru (normal register)]
d. uton-zuru    ‘is indifferent, neglects’ (Martin 1952: 51)
   utom-        ‘distant, estranged’
e. sakin-zuru   ‘advances’ (Martin 1952: 51)
   saki         ‘ahead’ + n- intensive

(9) No voicing of the initial consonant of suru following a vowel
a. kae-suru          ‘to make change, to exchange’ (Parker 1939: 115)
b. ai-suru           ‘to love’ (Manami Hirayama, p.c.)
c. maru.arai-suru    ‘to do all the washing ([all.washing]-do])’ (Kazutoshi Ohno, p.c. August 2002)
d. sanpo-suru        ‘to take a walk’ (Poser 2002: 3)
e. tatiuti-suru      ‘to cross swords’ (Poser 2002: 3)
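The general tendency shown in (8) and (9) can be stated as a small sketch. It assumes that the final segment of the first element alone decides the outcome and that n, m and N are the relevant nasals; the function name is hypothetical. The lexical exceptions noted in the text would have to be listed separately and are not modelled.

```python
# Segments treated as nasals in this sketch (N is the moraic nasal of the transcription).
NASALS = set("nmN")


def attach_suru(first: str) -> str:
    """Attach suru 'do': voiced -zuru after a nasal, voiceless -suru otherwise."""
    return first + ("-zuru" if first[-1] in NASALS else "-suru")


for first in ["kin", "omon", "ai", "kae", "sanpo"]:
    print(attach_suru(first))
```

This captures only the tendency: a nasal-final first element yields -zuru and a vowel-final one yields -suru, with no appeal to a rendaku element.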

Martin (1952: 49) comments on compounds with suru, remarking that the form usually begins with the voiceless obstruent; it is voiced following some long vowels. Vance (1987: 140) identifies these long vowels as coming from vowel-nasal sequences. Martin (1952: 50) further notes that ‘the majority of S[ino] morphs ending in n which occur in this sort of compound are attached to the alternant -zu.ru rather than -su.ru.’ Vance (1987: 140) echoes this observation, pointing out that there are counterexamples to the tendency to voice following a nasal, but it seems to reflect a very old pattern. The overall tendency with suru, then, is that its initial consonant voices after a nasal, but not after a vowel. This can be accounted for if these compounds do not involve the rendaku element, but simply show post-nasal voicing. Either hypothesis could account for these forms. Other compounds show similar patterning. The examples in (10) should be compared with those in (7). (10a, b) illustrate initial voicing of the verb kiru ‘cut’ as the second element of a compound after a nasal (10a), but not after a vowel (10b); (10c) shows the verb tsukeru ‘attach, add’ after a vowel. Some of these forms contain the morpheme fum shown in (7), here in its continuative form fum-i (Vance, this volume). In (10a), in the second example the second part of the compound is a deverbal noun (Vance 1987: 145); this is also the case in the third example in (10c).

(10) a. voicing after a nasal
        fuN-giru      ‘give up’      (Itô and Mester 1999a: 68)
        mijin-giri    ‘mincing’      (Vance 1987: 146)
        mijiN ‘bit’ + kiru ‘cut’
     b. no voicing after a vowel
        fumi-kiru     ‘to step out of bounds’ (cf. fuN-giru)   (Parker 1939: 39)
        kami-kiru     ‘to bite off’   (Manami Hirayama, personal communication, July 2002)
        garasu-kiri   ‘glass cutter’  (Vance 1987: 146)
        garasu ‘glass’ + kiru ‘cut’
     c. no voicing after a vowel (Manami Hirayama, p.c. July 2002)
        fumi-tsukeru  ‘to trample (something) under foot’   cf. fun-dzukeru in (7)

In the forms in (10), the initial consonant of the second element is voiced when it follows a nasal, but not when it follows a vowel. (Note that Vance 1987 discusses the pair ‘mincing’ and ‘glass cutter’ as examples where accent and the grammatical relationship between the two pieces are the major determinants in whether post-nasal voicing occurs. Clearly more work is necessary to sort out the various phonological, syntactic, and semantic factors involved. Vance (1987: 40–41) points out that while a preceding nasal favoured the development of sequential voicing in Sino-Japanese, the correlation is not perfect; thus, counterexamples exist in both directions: both nasal-voiceless obstruent and vowel-voiced obstruent sequences also exist. See also Ohno (2002) for discussion. As mentioned in note 2, there are lexical exceptions where the patterns discussed here simply do not hold, and lexical listing is required.)

Now consider another situation, namely when the second morpheme in a verb compound begins with a voiceless obstruent, and a voiced obstruent follows later in the word, i.e., the environment in which Lyman’s Law might be expected to operate. Under this condition, the initial consonant of the second morpheme still voices, as in (11).

(11) a. voicing following a nasal
        šibaru ‘tie’    fun-jibaru ‘tie up, immobilize’   (Vance 1987: 137; Itô and Mester 1999a: 68)
     b. no voicing following a vowel
        kui-šibaru ‘clench one’s teeth’    ku(w)u ‘to eat’   (Manami Hirayama, p.c. July 2002)


In this case, post-nasal voicing occurs despite the presence of the voiced obstruent later in the word, unlike in (1), where rendaku is blocked in similar circumstances. (Note that (11) is reported to be the only example of its type found in Japanese.)

At this point, I have argued the following. First, the rendaku morpheme, a compound formative, occurs in compounds such as those in (1), but is not found in the verb compounds in examples like (7) through (11), nor in affixation environments as in (3). When the rendaku element appears, it surfaces so long as its realization does not lead to a violation of Lyman’s Law (1d). Because this morpheme does not occur in all morphological concatenations, we can look to verb compounds and affixation structures to see what the effect of the preceding segment is on the initial consonant of the second morpheme in the absence of the rendaku element. In this environment (7, 8, 10), post-nasal voicing is found, and this voicing is not blocked by Lyman’s Law (11a). These facts are difficult to account for under the single voicing mechanism hypothesis: why would realization of the voicing from the rendaku element be blocked by the presence of a voiced obstruent in the second morpheme (1d), but voicing from a nasal be allowed (11a)? The dual mechanism hypothesis renders such forms explicable: the voicing triggered by the nasal and the voicing of the morpheme-internal obstruent have different sources, and there is no violation of Lyman’s Law since there is but a single laryngeally voiced obstruent present.

To summarize, I have proposed that in studying sequential voicing in Japanese, one must sort out two things that are often conflated. First, what I call rendaku is restricted to the voicing triggered by a compound formative. This compound formative has the feature LV, and the surface implementation of LV is blocked by the presence of a voiced (LV) obstruent later in the target morpheme. This compound formative is found in noun-noun compounds (with exceptions, both lexical and principled), but does not generally occur in compounds headed by a verb, nor in affixation structures. In the non-rendaku environment, nasals generally trigger voicing of the initial consonant of the second morpheme. Voicing triggered by the rendaku element interacts with Lyman’s Law, but post-nasal voicing does not, creating apparent violations of Lyman’s Law. My conclusion is that post-nasal voicing is triggered by SV, while rendaku voicing is LV. See also Pater (1999: 332–334), Rice (1993), Steriade (1995: 185), and Ohno this volume, among others, for discussion.
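The difference between the two hypotheses can be illustrated with a small sketch. It is a simplification under stated assumptions: a one-letter romanization (š for the palatal fricative of šibaru ‘tie’), a hypothetical table of voiced alternants, and a feature label ("LV" or "SV") returned alongside each derived form. None of the function names come from the literature.

```python
from typing import Optional, Tuple

NASALS = set("mnN")
LV_OBSTRUENTS = set("bdgzj")  # lexically voiced obstruents, assumed to carry LV
VOICED = {"k": "g", "s": "z", "t": "d", "h": "b", "š": "j"}


def lv_count(segments: str) -> int:
    """Number of lexically voiced (LV) obstruents in a string of segments."""
    return sum(1 for seg in segments if seg in LV_OBSTRUENTS)


def rendaku(first: str, second: str) -> Tuple[str, Optional[str]]:
    """Rendaku environment: the LV formative associates unless an LV obstruent follows."""
    if second[0] in VOICED and lv_count(second[1:]) == 0:
        return f"{first}-{VOICED[second[0]]}{second[1:]}", "LV"
    return f"{first}-{second}", None


def post_nasal_voicing(first: str, second: str) -> Tuple[str, Optional[str]]:
    """Non-rendaku environment: a final nasal shares its SV with an adjacent obstruent."""
    if first[-1] in NASALS and second[0] in VOICED:
        return f"{first}-{VOICED[second[0]]}{second[1:]}", "SV"
    return f"{first}-{second}", None


def violates_lymans_law(second: str, derived: Optional[str], hypothesis: str) -> bool:
    """Under 'dual' only LV counts; under 'single' SV-derived voicing counts as well."""
    count = lv_count(second)
    if derived == "LV" or (derived == "SV" and hypothesis == "single"):
        count += 1
    return count > 1


form, feature = post_nasal_voicing("fun", "šibaru")
print(form, violates_lymans_law("šibaru", feature, "dual"),
      violates_lymans_law("šibaru", feature, "single"))
print(rendaku("hyootan", "kago"))
```

On these assumptions the attested fun-jibaru counts as a Lyman’s Law violation only if the derived voicing is equated with LV, which is the difficulty for the single mechanism hypothesis. In hyootan + kago the rendaku LV is simply not realized.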


3.2. Post-nasal voicing within morphemes

So far we have seen post-nasal voicing between morphemes. Itô and Mester (1986, 1995a, 1999a, 2003) and Itô, Mester, and Padgett (1995) also argue that post-nasal voicing is found within morphemes, and they state this as a constraint, *NT. Examples are given in (4). This claim is falsified when one considers the Japanese lexicon as a whole, as Itô and Mester (1995a, 1999a) recognize: while in many cases a voiced obstruent follows a nasal (4), there are also nasal-voiceless obstruent sequences, as in (12).⁵

(12) N-T (Sino-Japanese vocabulary)
     sampo    ‘walk’       (Itô and Mester 1995a: 819)
     hantai   ‘opposite’   (Itô and Mester 1995a: 819)
     keNka    ‘quarrel’    (Itô and Mester 1999a: 71)
     teNka    ‘empire’     (Itô and Mester 1999a: 71)

Such words can be found in the rendaku environment, where rendaku occurs.

(13) fuufu + keNka          fuufu-geNka   ‘domestic quarrel’       (Vance 1987: 114)
     husband & wife  quarrel
     onna + teNka           onna-deNka    ‘petticoat government’   (Itô and Mester 1999a: 70)
     woman  empire

Itô, Mester, and Padgett (1995) set aside this class of words containing NT sequences, and argue that post-nasal voicing, in the form of a constraint *NT, is a constraint only within the Yamato vocabulary; with Sino-Japanese (12) and other vocabulary, NT sequences are allowed. Morphemes such as those in (12) are Sino-Japanese, so the constraint *NT does not hold.

4. Sequential voicing and within-morpheme post-nasal voicing

The claim that tautomorphemic post-nasal voicing is predictable is problematic for both the single mechanism and dual mechanism hypotheses. If post-nasal voicing is due to SV voicing, as under the dual mechanism hypothesis, one would expect that a post-nasal voiced obstruent would be transparent with respect to sequential voicing in the presence of the rendaku element, as in (1b). If, on the other hand, post-nasal voicing is due to redundant LV voicing, as under the single mechanism hypothesis, that voicing should not be present at the time that rendaku voicing takes place. See Itô and Mester (1986) and Itô, Mester, and Padgett (1995) for extensive discussion. Both hypotheses thus predict that post-nasal voicing should not block voicing triggered by the rendaku element. However, post-nasal tautomorphemic voiced obstruents are not transparent with respect to Lyman’s Law, as one might expect, but instead are blockers of rendaku, as in (14).

(14) širooto + kaNgae    širooto-kaNgae     ‘layman’s idea’    (Itô and Mester 1995a: 576)
     layman  idea        *širooto-gaNgae
     aka + tombo         aka-tombo          ‘red dragonfly’    (Kawasaki 1996: 4)
     red  dragonfly      *aka-dombo

Within-morpheme post-nasal voicing is visible with respect to Lyman’s Law, serving to block rendaku, unlike derived environment post-nasal voicing (11), where Lyman’s Law does not block voicing of a morpheme-initial obstruent.⁶ Let me summarize what we have seen so far.

(15) – Lyman’s Law blocks rendaku in noun compounds (1d)
     – Lyman’s Law does not block post-nasal voicing in a morphologically derived environment in verb compounds (11) or in affixation structures (3)
     – Lyman’s Law blocks rendaku if the target morpheme contains an ND sequence (14)

The dual mechanism account given so far, that rendaku involves LV and post-nasal voicing SV, handles the between-morpheme facts. Morpheme-internal ND sequences are problematic, however, under the assumption that the voicing on these post-nasal obstruents is predictable and that the voicing is SV, and I now turn to these forms.
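The problem posed by morpheme-internal ND sequences can be made concrete with a sketch that parameterizes the analysis of those sequences. The flag `nd_is_lv` is a hypothetical device: when it is false, a post-nasal voiced obstruent inside a morpheme is treated as carrying predictable SV voicing and is invisible to Lyman’s Law; when it is true, the obstruent counts as LV. The romanization and the table of alternants are likewise assumptions of the sketch.

```python
NASALS = set("mnN")
VOICED_OBSTRUENTS = set("bdgzj")
VOICED = {"k": "g", "s": "z", "t": "d", "h": "b"}


def later_lv(morpheme: str, nd_is_lv: bool) -> bool:
    """True if a non-initial voiced obstruent counts as LV under the chosen analysis."""
    for i in range(1, len(morpheme)):
        if morpheme[i] in VOICED_OBSTRUENTS:
            post_nasal = morpheme[i - 1] in NASALS
            if not post_nasal or nd_is_lv:
                return True
    return False


def rendaku(first: str, second: str, nd_is_lv: bool) -> str:
    """Voice the initial of `second` unless a later LV obstruent blocks the rendaku element."""
    if second[0] in VOICED and not later_lv(second, nd_is_lv):
        return f"{first}-{VOICED[second[0]]}{second[1:]}"
    return f"{first}-{second}"


print(rendaku("aka", "tombo", nd_is_lv=False))  # predictable-SV analysis of the b in tombo
print(rendaku("aka", "tombo", nd_is_lv=True))   # contrastive-LV analysis of the b in tombo
print(rendaku("onna", "teNka", nd_is_lv=True))  # NT sequence: nothing blocks rendaku
```

With `nd_is_lv=False` the sketch derives the unattested *aka-dombo, while `nd_is_lv=True` yields the attested aka-tombo and still lets rendaku apply to teNka. The sketch only restates the three observations in (15); it does not choose among the accounts that treat post-nasal obstruents as underlyingly voiced.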

5. On the representation of tautomorphemic NC clusters

There are three recent accounts of tautomorphemic NC clusters that treat post-nasal voiced obstruents as underlyingly voiced rather than voiceless, Avery and Idsardi (2002), Kuroda (2002), and Rice (1993). In Rice (1997), I argue that post-nasal voicing is not redundant, abandoning Itô and Mester’s *NT constraint. Instead, I propose that Japanese does not provide appropriate cues to stratify the lexicon into Yamato and Sino-Japanese vocabulary based on this constraint, and that voicing, namely LV, is contrastive in post-nasal position. If LV is distinctive in this position, then the surface facts are as expected: LV triggers Lyman’s Law, blocking rendaku, and SV is transparent with respect to Lyman’s Law.⁷

Avery and Idsardi (2002) pursue the line of thinking that voicing is contrastive after a nasal. They further examine another constraint proposed by Itô and Mester. Itô and Mester (1995a: 821–822) identify two constraints that are relevant to the representation of nasal-obstruent clusters. First is the now familiar *NT, and second is a constraint against voiced geminates, *DD. They link these constraints with stratification as in (16).

(16) Yamato           *NT   *DD
     Sino-Japanese          *DD

Avery and Idsardi (2002) pick up on the constraint *DD and propose that the underlying clusters allowed in the Yamato vocabulary are the following:

(17) TT   DD

TT clusters are realized as voiceless geminates, and DD clusters are realized as prenasalized stops in standard Japanese. This account explains the lack of contrast between DD and ND clusters and at the same time allows a uniform account of the interaction between ND clusters and rendaku: voicing is distinctive rather than redundant on this D, and thus serves to trigger Lyman’s Law, blocking rendaku. (See also Hamada (1952) and Ohno (this volume), among others, for discussion of the historical development of Japanese. Ohno, like Avery and Idsardi, argues that what I have called post-nasal voicing is historically pre-voiced obstruent nasalization.) By this account then, the Yamato and Sino-Japanese vocabulary allow the following underlying clusters.⁸

(18) Yamato           TT   DD
     Sino-Japanese    TT   DD   NT

As Avery and Idsardi point out, the difference between the lexical strata concerns the distribution of N before a consonant.


I do not try to decide between the two accounts here; critically in both cases post-nasal obstruent voicing is non-contrastive between morphemes but contrastive morpheme internally. Instead I turn next to the issue of stratification between the Yamato and the Sino-Japanese vocabulary with respect to the constraint *NT. These two accounts converge on the point that voicing is distinctive in tautomorphemic clusters. Either treatment accounts for the range of patterns found in the language, as summarized in (19).

(19) – observation: Lyman’s Law blocks rendaku when a singleton voiced obstruent follows (1d)
       account: Lyman’s Law [LV from the rendaku morpheme cannot be realized because of following LV]
     – observation: Lyman’s Law does not block rendaku when a nasal follows (1c)
       account: nasals have SV [LV from the rendaku morpheme can be realized because no LV segment follows]
     – observation: Lyman’s Law does not block post-nasal voicing between morphemes [derived environment post-nasal voicing in a non-rendaku environment] (11)
       account: Derived environment post-nasal voicing is marked by SV, and thus Lyman’s Law is not violated [realization of SV from nasal is not blocked by later LV]
     – observation: Lyman’s Law blocks rendaku when tautomorphemic ND follows (14)
       account: Voicing is distinctive in tautomorphemic post-nasal obstruents [LV from the rendaku morpheme cannot be realized because of following LV]

6. On stratification in the Japanese lexicon

While stratification and post-nasal voicing are logically independent of one another, it is nevertheless worth pursuing whether the stratification analysis, that *NT holds of Yamato but not of Sino-Japanese vocabulary, is reasonable to maintain. In this section I examine why one might choose to abandon stratification with respect to the properties discussed here. That the Japanese lexicon is stratified is a general assumption in the literature. Martin (1952) divides the lexicon into three groups, Native, Sino-Japanese and Onomatopoeia, and Foreign, as does McCawley (1968).


Martin (1952: 9), in an interesting discussion about his purpose, states that ‘this is the first attempt to make a systematic study of Japanese morphophonemics on a synchronic level.’ He argues that a study of compounds shows ‘a definite cleavage of morphs into two classes, here called class S (for Sino-Japanese, the historical original of the class) and class Y (for Yamato, or native Japanese, the presumed original of most members of the class). There are numerous hybrid compounds, to be sure; but on the basis of selectivity within immediate constituents which contain only two morphs, for the overwhelming majority of cases, each morph and morph group may be placed clearly in one of the two classes’ (24). In a comparison between Native and Sino-Japanese morphemes, McCawley (1968: 64) states about the Sino-Japanese morphemes that they are ‘borrowed from Chinese in medieval times and which function in Japanese chiefly as elements of compounds which usually have a somewhat learned flavor; their role in Japanese is much like that of the Latin and Greek morphemes found in the learned vocabulary of English. Since Sino-Japanese morphemes are syntactically distinct from the other morphemes of Japanese in that they and only they are the bound morphemes from which two-element compounds such as they above are formed, the syntactic information in the dictionary entry of a Japanese morpheme must indicate (directly or indirectly) whether the morpheme is Sino-Japanese or not.’ McCawley shows that Sino-Japanese and native morphemes have a slightly different vowel inventory (Cyu and Cyo are excluded in the native vocabulary but not in the Sino-Japanese vocabulary); Sino-Japanese items obligatorily have no fewer than two nor more than four mora, as also discussed by Itô and Mester (1995a, 2003). 
While McCawley differentiates Sino-Japanese and Native vocabularies, he has a number of rules marked [-foreign], including Native and Sino-Japanese together, but only one marked [+native] (restrictions on diphthongs) and one marked [+Sino] (a deletion/epenthesis rule). Looking at the distribution of obstruent voicing following a nasal, the following divisions into strata then are proposed by Itô and Mester (1995a, 1999a).

                                         Yamato    Sino-Japanese
post-nasal voicing between morphemes     yes       yes
post-nasal voicing within morphemes      ND        NT/ND

I consider three issues: learnability, a comparison with the English Germanic/Latinate split (cited by McCawley 1968; Itô and Mester 1995a, 1996, 1999a), and the writing system.


6.1. Learnability issues

So far we have seen the following. In the surface lexicon of Japanese, NT and ND contrast tautomorphemically; heteromorphemically ND is generally found (see Martin 1952 and Vance 1987 for discussion of exceptions) except in the rendaku environment where Lyman’s Law blocks it (hyootan kago ‘gourd basket’ Itô and Mester 1986: 61–70).⁹ Tautomorphemic ND clusters block sequential voicing triggered by the rendaku element (14), while NT clusters allow sequential voicing triggered by this element (13). In this section, I consider these facts with respect to the hypothesis that the lexicon of Japanese is stratified on the basis of the constraint *NT which holds of the Yamato vocabulary.

In Rice (1997), I question the stratification hypothesis with respect to post-nasal voicing, raising the question of how a child would come to stratify the lexicon based on exposure. I ask why, when a child hears ND and NT both, s/he would choose to place these lexical items in different parts of the lexicon rather than abandon the generalization that Yamato post-nasal obstruents are voiced in one portion of the vocabulary. A parallel problem exists in English, one that we might term the font-fond problem: why is it not proposed for English that these terms occupy different strata rather than showing a post-nasal contrast? This then is the learnability issue: what would cause the learner to place tautomorphemic NT and ND clusters in different strata?¹⁰

6.2. Comparison with the Germanic/Latinate split in English

McCawley (1968) and Itô and Mester (1995a, 1999a, 2003), in arguing for stratification, cite the well-established tradition in Japanese of distinguishing native, Sino-Japanese, other foreign, and mimetic vocabulary. They compare Japanese with English, referring to the Germanic-Latinate split in English. Let us look more carefully at the comparison between English and Japanese with respect to the criteria used to establish strata.

First consider English. The Latinate/Germanic vocabularies in English are differentiated in several phonological ways. First consider Latinate vocabulary such as divine, sane, and obscene, the set of words that participate in Trisyllabic Laxing. Notice that there is nothing inherent in these lexical items themselves that tells us that they deserve a special marking in the lexicon, rather it is their patterning under affixation. So, for instance, sane takes the nominalizing suffix -ity, and its vowel laxes in the presence of this suffix; vain is parallel in its patterning. The very similar-sounding adjective plain, on the other hand, takes the suffix -ness, and the vowel of this stem does not lax. It is thus the interaction of these stems with affixes that allows us to divide them into two classes; knowledge of the adjective alone does not allow them to be divided into two groups. Similar are Latinate verbs such as permit and resign. These verbs are grouped together into a class based on phonological properties such as unexpected stress assignment (e.g. permít vs. édit), consonantal shifts under suffixation (e.g., permit/permission, remit/remission), and the presence of /s/-voicing intervocalically (consign vs. resign); see, for instance, Kiparsky 1985. It is the different patterning with respect to the grammar that allows one to distinguish classes of lexical items.

Now consider Japanese. Stem-internal voiced obstruents, whether singletons or post-nasal, pattern in an identical way with respect to rendaku: they block its application. There is thus no phonological patterning that allows one to distinguish between two classes of ND clusters, those that contrast with NT and those that do not. Turning to post-nasal voicing in the non-rendaku environment, morpheme-initial obstruents in this environment are different. Alternations show that these are lexically voiceless (e.g., the suffixes begin with T in most environments and with D only following a nasal; the stems subject to post-nasal voicing begin with a voiceless obstruent when they are in a non-nasal environment), and they are voiced by post-nasal voicing. But this voicing is not blocked by Lyman’s Law. Thus we can distinguish two types of surface ND clusters, the derived environment clusters where the voicing is predictable and the within-morpheme clusters where the voicing is contrastive.
Non-rendaku derived environment voiced obstruents also distinguish themselves by their failure to participate in Lyman’s Law, again showing that their voicing is of a different type.

6.3. On the status of alternations

Itô and Mester (1999a, 2003) and Itô, Mester, and Padgett (1999) remark that Rice (1997) incorrectly assumes that there are no alternations associated with the constraints that are involved in lexical stratification. For the purposes of the Yamato and Sino-Japanese strata, the only constraint that differentiates them is *NT. As argued in Rice (1993) and summarized in section 4, the derived environment effects are a result of SV and thus are not relevant to the question here.


Keren Rice

Again a comparison with English is in order. In English morphemes, NT and ND are contrastive, as in items such as font-fond, ant-and, brant-brand, pint-kind, grant-grand, tent-tend. However, in a derived environment (past tense), only ND occurs (e.g., fanned, canned, banned, pined). One would not claim that words in English with ND always have predictable voicing on the D; rather within morpheme post-nasal obstruent voicing is unpredictable while between morphemes, post-nasal obstruent voicing is predictable.

6.4. Writing system It is sometimes suggested that the writing system of Japanese aids in stratification. For instance, Itô and Mester (1999a: 63) point out that “this stratification corresponds in kind to the distinction in English between the Germanic versus the Latinate vocabulary, but is more accessible and conscious to the non-specialists because of its reflection in the writing system.” The orthography of Japanese is complex, using three different systems. As Kess and Miyamoto (1999: 14) discuss, “there is no strict one-to-one correspondence between type of vocabulary item and script type, although one usually sees Chinese borrowings in kanji characters, native Japanese content words in kanji or hiragana, native Japanese function words in hiragana, and the borrowings from other languages in katakana.” See also Itô and Mester (1999a), Kess and Miyamoto (1999), and Vance (1987), among others, for discussion and further references. Let us examine the assumption that the writing system of Japanese aids in stratification in more detail. The following discussion summarizes Vance (1987: 2–3). Of the three types of writing systems found in Japanese, two are relevant to the distinction between the Yamato and Sino-Japanese strata, namely kanji and hiragana. Kanji, the Chinese characters, are used for Sino-Japanese morphemes and for some Yamato morphemes as well. Hiragana is a syllabic system developed from a small set of simplified kanji. In modern Japanese orthography, grammatical endings are generally written in hiragana while noun, verb, and adjective stems are written in kanji. Native and Sino-Japanese morphemes for which kanji are no longer in general use are commonly written in hiragana. Children’s books ordinarily use hiragana for native and Sino-Japanese morphemes for which the readers are not likely to know the kanji (children learn slightly under ninety kanji in the first year of school; Poser, personal communication, June 2002). 
Recent borrowings, not discussed in this article, are generally written in

Sequential voicing, postnasal voicing, and Lyman’s Law revisited


katakana. Thus a typical text contains both kanji and kana; see Kess and Miyamoto (1999) for detailed discussion. Many words of Sino-Japanese origin are written with two kanji. This might suggest that two morphemes are actually involved, and that the use of two kanji provides orthographic evidence that *NT holds of the Yamato stratum but not of the Sino-Japanese stratum. Martin (1952) can be viewed as providing evidence for the position that the Sino-Japanese forms are morphologically complex: he points out that morphs in Japanese are limited to certain shapes, and that n.C sequences (where the dot represents a morpheme boundary) ‘are nearly always indicative of morph boundaries’ (17). Vance (1996: 23), on the other hand, suggests that ‘many Sino-Japanese words written with two kanji probably should not be analyzed as consisting of two morphemes.’ He further notes that ‘… kana spelling provides a clear indication in some cases that an etymological compound is no longer recognized as a compound’ (27). In discussion of kanji, Kess and Miyamoto (1999: 68) point out that compound kanji are often used for common vocabulary items in literary Japanese, and that many two-kanji compound words are stored and accessed as whole word units (68–69). Based on the studies of the writing system cited above, it appears that orthography is not necessarily a useful tool to demarcate the Yamato and Sino-Japanese vocabularies. First, both strata employ kanji and hiragana. Second, discussion in Kess and Miyamoto (1999) and Vance (1996) suggests that etymological compounds are not necessarily analyzable as compounds synchronically. While the writing system allows recent borrowings to be identified through the use of katakana, it is not necessarily helpful in sorting out the Yamato and Sino-Japanese strata.

7. Conclusion

In this article I have made three points. First, I have argued that not all compounds take the rendaku element, and that post-nasal voicing can be studied best outside of the rendaku environment. Second, I have added support to the position that two voicing mechanisms are found in Japanese, LV and SV. Post-nasal voicing is contrastive within a morpheme, marked by LV, and it is predictable between morphemes, marked by SV, in the non-rendaku environment. The contemporaneous inclusion of tautomorphemic post-nasal voiced stops in Lyman’s Law in the rendaku context and the failure of derived environment post-nasal voicing to be blocked by Lyman’s Law in the non-rendaku environment together provide evidence for this claim. Third, I have argued that NC sequences provide little evidence for stratifying the Japanese lexicon into Yamato and Sino-Japanese vocabulary. The kinds of alternations that one would hope to find to distinguish the two strata with respect to this criterion do not appear to be available. Stratification with respect to post-nasal voicing seems to be tangential as no appropriate alternations exist to trigger the placement of words in different strata. One certainly does not want to deny the possibility of stratification in grammar. Within-morpheme requirements, without the benefit of alternations, are not clear evidence for stratification, however, as there is no evidence available to the learner for making the morpheme other than what it appears to be.

Acknowledgements Thank you to Bill Poser for helpful discussion of the Japanese facts discussed in this article. I could not have completed this work without his assistance. Thank you also to Kazutoshi Ohno for detailed comments and discussion on an earlier draft, to an anonymous reviewer, and to Manami Hirayama for help with the data. Misunderstandings are my own.

Notes

1. Additional strata are argued for, mimetic and foreign. See Itô and Mester 1995a and Itô and Mester 1999a for recent work that deals explicitly with this classification, and Martin 1952 and McCawley 1968 for foundational work in English. See Vance 1987 for a discussion of some of the older literature.

2. There are many lexical exceptions to the processes discussed in this article; see, for instance, Martin 1952, Vance 1987, Labrune 1999, Ohno 2002, and Kubozono this volume for discussion. For instance, some words always undergo rendaku as the second element of a compound, some never undergo rendaku (e.g., tuti ‘soil, ground,’ himo ‘string’), and some are variable, undergoing rendaku in some but not all compounds (e.g., hune ‘boat’); see Ohno 2002 and others for discussion. Some exceptions can be accounted for by phonological, syntactic, and semantic factors; see Lyman 1894 and Ogura 1910 (cited in Martin 1952: 49) as well as Otsu 1980, Itô and Mester 1986, Vance 1987, Labrune 1999 and Kubozono this volume, among others, for different perspectives. In examining these processes, it is necessary to sort out what is predictable from what is listed; I concentrate here on what is predictable, and am not concerned with words that are generally considered to be lexical exceptions.

3. In testing this prediction, I do not consider an environment of the following type. A morpheme-initial obstruent is voiced by post-nasal voicing (as in (3)). This form is then put in the rendaku environment. What happens? Such cases perhaps do not exist. Even if they do, Lyman’s Law is usually believed to have as its domain a single morpheme. In addition, other structural factors can block rendaku (see Otsu 1980, Itô and Mester 1986), perhaps rendering any findings uninterpretable as support for one position or another.

4. This morpheme often occurs in an alternative form, jiru, which Martin 1952: 52 identifies as more colloquial.

5. Itô and Mester 1986: 69 treat these Sino-Japanese words as compounds (sam+po, *sam+bo ‘stroll’, han+tai *han+dai ‘opposition’), as do Martin 1952 and McCawley 1968. In Itô and Mester 1995a: 819, these words are written without a morpheme boundary, and are not treated as bimorphemic. Vance 1996 remarks that many Sino-Japanese items that are written with two kanji probably should not be analyzed as consisting of two morphemes, and provides evidence for this claim. See section 5.4 and Vance 1996 for discussion.

6. Itô, Mester, and Padgett 1995 present an Optimality Theory solution to this problem. See especially Pater 1999 for discussion of problems with the details of their account.

7. A reviewer raises the interesting question of whether the post-nasal voiced obstruents that are LV (morpheme-internal) and the post-nasal voiced obstruents that are SV (following a nasal in another morpheme) are phonetically identical. There are preliminary indications that there are some differences between them; see Avery and Idsardi 2002 for discussion.

8. Note that in recent borrowings, DD clusters are also found; see Itô and Mester 1995a, 1999a and Kuroda 2002.

9. I am assuming, following Vance 1996, that these Sino-Japanese items consist of a single morpheme. See the discussion in section 5.4 on writing.

10. Vance 1996: 26 notes “As Varden (1994) points out, typical Japanese children have already acquired many Sino-Japanese binoms before they learn to read, but their vocabularies are unlikely to provide much basis for further analyzing these words. In many cases, of course, the relationship between the meaning of a Sino-Japanese binom and the meanings of its constituent “morphemes” is opaque even to an educated adult.”

Sei-daku: diachronic developments in the writing system Kazutoshi Ohno

1. Introduction

One issue regarding voicing in Japanese concerns the sei-daku (lit. ‘clear-muddy’1) distinction, which correlates with the voicing opposition in contemporary Japanese. Various aspects of the sei-daku distinction are represented in the history of the writing system. This article presents the historical development of this distinction within Japanese orthography and comments on the nature of such distinctions. This article, therefore, chiefly presents the facts of sei-daku as represented in the writing system, and introduces prior proposals that attempt to account for the inconsistent representation of this phenomenon in the orthographic history. It provides neither new data nor new findings. The contents are as follows: Section 2 illustrates the three diachronic stages of the sei-daku distinction in writing (distinguished, not distinguished, and distinguished again); Section 3 introduces hypotheses that explain the transitions between the three stages; Section 4 presents one of the important remaining issues: sei-daku and nasality; finally, Section 5 concludes with some discussion points.

2. Sei-daku in writing systems

2.1. Issue 2

In the current usage of the kana syllabary (hiragana or katakana), sei-daku is distinguished by the addition of two dots at the top right corner of a given kana (see Appendix). These dots are called daku-ten (ten ‘dot’), and they change a sei-on (on ‘sound’) character into a daku-on character. That is, daku-on are not represented by independent kana characters, but rather are created by adding a diacritic to a sei-on character. This convention is due to the fact that hiragana and katakana materialized as systems without a sei-daku distinction. One kana could represent either sei-on or daku-on. The source of hiragana or katakana, man’yougana (see section 2.2 below for details), actually had daku-on characters, and the sei-daku distinction was quite well distinguished by different characters at some point in the past. In terms of the writing conventions, then, the diachronic transition of the sei-daku distinction can be roughly divided into three stages as given in (1) below (see sections 2.2 through 2.4 below for further details and clarification).

(1) Sei-daku in kana system

                    sei-daku distinction             kana
    Earliest Stage: yes, by different characters     man’yougana
    Middle Stage:   no                               hiragana, katakana
    Current Stage:  yes, by diacritic [daku-ten]     hiragana, katakana
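
The diacritic convention of the Current Stage has a close analogue in modern text encoding, which may help make it concrete. The sketch below is an illustration only and not part of the historical account: it relies on the Unicode combining voiced sound mark (U+3099) and on Python’s standard `unicodedata` module, and the helper names `add_dakuten` and `strip_dakuten` are hypothetical.

```python
import unicodedata

# U+3099 COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK (the daku-ten as a diacritic).
DAKUTEN = "\u3099"


def add_dakuten(kana: str) -> str:
    """Attach the daku-ten to a sei-on kana and compose the result (NFC)."""
    return unicodedata.normalize("NFC", kana + DAKUTEN)


def strip_dakuten(kana: str) -> str:
    """Decompose a daku-on kana (NFD) and drop the daku-ten, leaving the sei-on base."""
    return unicodedata.normalize("NFD", kana).replace(DAKUTEN, "")


print(add_dakuten("か"), add_dakuten("タ"))
print(strip_dakuten("ざ"))
```

Stripping the mark from every kana in a text gives a rough picture of Middle Stage orthography, in which one kana stands for both the sei-on and the daku-on reading.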

The two periods in which sei-daku was distinguished within the kana systems are interrupted by a period in which sei-daku was not distinguished in the writing system. This is a fact of the history of kana usage. Explanations for this fact will differ greatly depending on whether or not we regard it as a reflection of the actual spoken language. We will address this issue in section 3. In the remaining part of section 2, we will discuss the diachronic development of the kana systems in Japanese in more detail.

2.2. Earliest Stage: man’yougana

Chinese characters in Japan, or kanji, were (and still are) read in two ways: the Chinese way (on reading), and the Japanese way (kun reading).3 For example, the Chinese character for ‘four’ could be read either as si (on reading) or as yo (kun reading).4 These readings were used to transcribe Japanese pronunciation. Such Chinese characters, which present sound information rather than logographic information, are called man’yougana ‘(lit.) kana used in Man’youshuu’ because their use is most diversified in Man’youshuu (see below).5 The earliest written works in Japanese can be traced back to the eighth century, or the Nara Era [710–784], represented by writings such as Kojiki (712), Nihonshoki (720), and Man’youshuu (759?).6 Some parts of these official documents or collections are written in man’yougana.7 For example, waka (Japanese traditional songs) were usually written in man’yougana.


The following is an example of man’yougana use in these works. In order to represent the native vocabulary kamo ‘(admiration marker)’, it may be written with the kanji for ‘duck’ whose kun reading is kamo, or with two kanji for ka-mo. In the latter case, a kanji is chosen from multiple candidates whose on or kun reading is ka, and another is chosen for mo. The choice of kanji is totally dependent on the author and varies among authors and even within an author’s work. Consequently, the same pronunciation could be represented by different kanji,8 while the same kanji could be read differently such as the kanji for ‘four’ (si or yo) mentioned above.9 Studies of man’yougana reveal that there had been separate characters used specifically for sei-on and specifically for daku-on in this period.10,11 Modern studies by Kasuga (1941), Ōno (1947–8, 1953), Nishimiya (1960), Tsuru (1960), among others, confirm that sei-daku in those ancient works cited above was generally distinguished by different man’yougana.12 There is little room to doubt the existence of the sei-daku distinction in this body of literature.

2.3. Middle Stage: development of hiragana/katakana

2.3.1. Simplification of the kana system

The use of man’yougana made it possible to transcribe the Japanese language. As the use of man’yougana spread, simplification of the kana system proceeded in two respects. First, it was achieved by the creation of a one-to-one correspondence between each kana and sound, i.e. one kana represents one sound, and one sound is represented by one kana. It is nothing but redundant for a sound-based writing system (syllabary) to have multiple different characters for a single sound (syllable) or multiple ways of representing a sound,13 and inadequate to use the same kana for different sounds. Second, it was achieved by simplifying the characters themselves. Man’yougana, or full Chinese characters, were unnecessarily complex for the purpose of representing each Japanese sound.14 A cursive or partial (or mixture of cursive and partial) representation of man’yougana was thus employed.15 Hiragana and katakana were developed through such simplification processes. Interestingly, the two kana systems were established without the sei-daku distinction by characters, despite the fact that sei-daku had been distinguished by using different man’yougana previously. We will see how this occurred below, but before moving on, two things must be kept in mind with regard to the development of hiragana and katakana. One is that the development was gradual. The other is that the sei-daku distinction was not “associated” with the writing systems, i.e. it is not accurate to say “man’yougana had the sei-daku distinction, while hiragana and katakana didn’t”.

2.3.2. Simplified kana in chaos

In the early Heian Era, or during approximately the first 100 years of the Heian Era [794–1192], the kana system was rather chaotic in a sense because the simplification processes noted above were just beginning to be implemented: the system was not yet standardized. In this period, a single sound was still represented by multiple different kana. The number of kana for a given sound was rapidly reduced after Man’youshuu (759?), but the system was not yet a one-to-one system. Moreover, the degree of simplification of characters varied. Some simplified characters were as simple as hiragana or katakana currently used, or simpler, but some were as complex as full kanji, and yet others were somewhere in between.16 One man’yougana (kanji) could be seen in various degrees of simplified forms. The choice of which kana represented a given sound, as well as how it was written (cursive or partial) and how much the kana was simplified (from hiragana/katakana-like to kanji-like) was determined by the author and the type of work being written. Many variants of simplified characters were created. Some were repeatedly used (and would eventually develop into hiragana or katakana), while many were merely forgotten. It is important to note that simplified characters in this time period were not yet clearly separated into hiragana and katakana.17 In this chaotic period, some simplified characters for daku-on were actually used (Ōtsubo 1977: 257). This was a natural transition from the period in which sei-daku was distinguished by different man’yougana. However, those simplified kana for daku-on were eventually abandoned, and hiragana and katakana later developed without kana for daku-on.


2.3.3. Sei-daku confusion in man’yougana

The orthographic sei-daku distinction, which surely existed at some time, rapidly disappeared in this chaotic period. Generally speaking, sei-daku is not distinguished in the literature in and after the ninth century (Hamada 1971: 44; Nakata and Tsukishima 1980: 586, etc.). Tsuru (1977: 238) discusses the decline of the number of man’yougana used only for daku-on (daku-on sen’yougana).

(2) Man’yougana used only for daku-on18 (Tsuru 1977: 238)
    a. Shokunihongi (797)       12 [9]: ga, gö, za, za, zi, zi, zu, zö, di, di, dö, bï
    b. Nihonkouki (840)          6 [6]: ga, gï, gu, zu, ze, bï
    c. Shokunihonkouki (869)     3 [3]: ga, gï, bï

The sei-daku distinction is rather confused in Shokunihongi (797), which retains 12 man’yougana for daku-on (for 9 different sounds). Nihonkouki (840) retains 6 man’yougana for daku-on. The sei-daku distinction is rarely seen in Shokunihonkouki (869), which retains just 3 man’yougana for daku-on, and these are used only in a particular volume.19 What the statement “the sei-daku distinction is [rather] confused” means is that man’yougana (or its simplified variant) previously used only for sei-on is used where daku-on is expected, and man’yougana previously used only for daku-on is used where sei-on is expected. In other words, the same man’yougana character began to be used to represent both sei-on and daku-on. This “confusion”, or lack of distinction, can in fact already be seen in the newer volumes of Man’youshuu to a large degree, and in other earlier literature of the eighth century to varying degrees. It is worth noting that some man’yougana widely used for both sei-on and daku-on in Nihonkouki and Shokunihonkouki are the matrices of hiragana or katakana (Tsuru 1977: 238, who lists 12 such man’yougana).
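
The counts in (2) can be checked mechanically. The following sketch simply re-enters the syllable lists from (2) and computes, for each work, the number of daku-on-only man’yougana and the number of distinct sounds they cover; the variable and function names are hypothetical, and a repeated syllable stands for a second character with the same sound.

```python
# Syllable lists copied from (2); one entry per man'yougana used only for daku-on.
DAKU_ONLY = {
    "Shokunihongi (797)": ["ga", "gö", "za", "za", "zi", "zi",
                           "zu", "zö", "di", "di", "dö", "bï"],
    "Nihonkouki (840)": ["ga", "gï", "gu", "zu", "ze", "bï"],
    "Shokunihonkouki (869)": ["ga", "gï", "bï"],
}


def summarize(table):
    """Return (number of characters, number of distinct sounds) per work."""
    return {work: (len(sounds), len(set(sounds))) for work, sounds in table.items()}


for work, (characters, sounds) in summarize(DAKU_ONLY).items():
    print(f"{work}: {characters} [{sounds}]")
```

The output reproduces the figures 12 [9], 6 [6], and 3 [3] given in (2).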

2.3.4. Transition to hiragana/katakana

Eventually, hiragana and katakana developed into two separate writing systems.20 Roughly speaking, cursive characters rapidly developed into hiragana after the chaotic period, i.e. after around 900, while partial characters evolved into katakana in or a little before the second half of the Insei Period [1068–1221] (Ōtsubo 1977: 257–264) – both without a character for daku-on. To summarize, it is clear that the systems of man’yougana and hiragana/katakana represent a continuum, illustrating that the man’yougana system gradually shifted to hiragana/katakana. The loss of the sei-daku distinction in writing can already be seen in the man’yougana system (see 2.3.3), and yet some daku-on characters were preserved in simplified forms, if only temporarily (see 2.3.2). It is more natural to assume that the tendency not to keep the orthographic sei-daku distinction became stronger and stronger – regardless of the kana system – between the Earliest Stage and the Middle Stage. The transition from man’yougana to hiragana/katakana happened to overlap with this tendency. It is thus perhaps no surprise that hiragana/katakana developed without the sei-daku distinction.

2.4. Current Stage: diacritic for daku-on

In the current use of hiragana and katakana, sei-daku is presented and recognized by the absence or appearance of a two-dot diacritic, or daku-ten. However, daku-on were not consistently represented in the writing system until quite recently, though the history of diacritics for daku-on is long. The sei-daku distinction in literature began declining rapidly in the ninth century (cf. 2.3.3 above). Yet there was still a need for the sei-daku distinction, e.g. when describing the precise pronunciation of Chinese characters. Chinese dictionaries, commentaries on Chinese literature, commentaries on Buddhist scriptures brought from China, and so forth, all demanded a sei-daku distinction. Hence, some texts were written in man’yougana allowing for the sei-daku distinction, even when the simplified kana (which have no daku-on characters) were being widely used.21 The most popular means to represent sei-daku, however, was to mark the character with some diacritic symbol. Nakata and Tsukishima (1980: 586) give a brief history of the development of daku-ten, on which the following discussion draws. The use of a daku-on marker can be traced back to as early as the end of the ninth century, where it indicates that certain Chinese characters must be read as daku-on.22 Diacritical symbols for daku-on on katakana began to be used in the 11th century or the end of the previous century, but they were normally added to the left of kana. The diacritic for daku-on began to be placed in the top right corner in the 14th century.23 At first, the use of diacritical symbols was actually not limited to the representation of daku-on. Some symbols, especially the dots or circles on the left of a given kana, overlap in function with accent [tone] marking.24


Some were also used to represent nasal sounds. Placed to the (top) right of a kana, they could be distinguished from tone marks, i.e. function as daku-on markers (Komatsu 1981: 63–71). The use of daku-on markers was fairly established and gradually spread to fields other than those directly related to Chinese in the Muromachi Era [1338–1573], but they were not yet in popular use. In the first half of the Edo Era [1600–1867], the diacritic was unified to the two-dot form in the top right corner (i.e. the same as the daku-ten currently used) and the appearance of the diacritic increased greatly (Ono 1995: 80). The early Edo Era, therefore, is often considered to be the era in which stabilization of the sei-daku distinction in writing (by daku-ten) occurred. However, even in the Edo Era, daku-on were not consistently marked by daku-ten.25 Generally speaking, daku-ten was added only when the author thought it necessary. Hence, kana with daku-ten would be daku-on, but kana without daku-ten could be sei-on or daku-on. Less commonly than daku-ten, “fudaku-ten” (fu ‘not’) was sometimes added to represent sei-on in the Edo Era (Komatsu 1981: 71).26 It can safely be said that kana was still common to both sei-on and daku-on. The use of daku-ten was finally incorporated into the modern education system in the Meiji Era [1868–1912]. Even so, some documents were still written without daku-ten (Maruyama 1967: 1122).27 Official (governmental) documents, such as laws and the constitution, were written without daku-ten, and this style continued until the end of World War II (Kamei 1970: 44–45). The rigid sei-daku distinction by daku-ten, then, is much more recent than people normally think.
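
The ambiguity of kana written without daku-ten can be made concrete with a small sketch. It assumes a letter-level romanization in which each of k, s, t, h may stand for itself or for its voiced counterpart; the function name `possible_readings` is hypothetical, and the input hasituma is chosen because the hasituma/hasiduma pair is discussed in section 3.2.

```python
from itertools import product

# Assumption: an unmarked obstruent may be read as sei-on or as daku-on.
VOICED = {"k": "g", "s": "z", "t": "d", "h": "b"}


def possible_readings(unmarked: str) -> list[str]:
    """List every sei/daku reading compatible with a spelling without daku-ten."""
    options = [(ch, VOICED[ch]) if ch in VOICED else (ch,) for ch in unmarked]
    return ["".join(choice) for choice in product(*options)]


readings = possible_readings("hasituma")
print(len(readings))
print("hasiduma" in readings)
```

A reader of such a text has to pick the intended reading from context, which is consistent with the observation that daku-ten was added only when the author thought it necessary.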

3. Interpretations of the three stages

3.1. Two approaches 28

Although the transitions are gradual, the three stages given in (1) – sei-daku distinction by characters, no sei-daku distinction, and sei-daku distinction by diacritic – are surely observed. Two types of approaches are available to address this issue. The first approach hypothesizes that the stage transitions, or varying degrees of orthographic representation, are reflections of the actual language. That is, sei-daku forms were recognized as distinct at first, but indistinct later, and currently they are considered distinct once again. The question here is: why was the distinction once made, then no longer existed, and then seemingly reappeared? Under this assumption, therefore, it must be explained how and why such changes occurred, including changes of the sound values of daku-on and/or sei-on. The second approach hypothesizes that the stage transitions merely reflect the facts of writing. Under this hypothesis, it becomes reasonable to claim that the recognition of the sound values of sei-daku remained the same throughout the history of Japanese. The question here is: why were such different writing conventions, in terms of sei-daku, adopted? In the remainder of section 3, we will discuss two proposals along the lines of the second approach (sections 3.2 and 3.3), and one along the lines of the first approach (section 3.4). In these subsections, we focus on the transition from the Earliest Stage to the Middle Stage, since the explanation of this transition is the key for each proposal. Finally, we will discuss the transition from the Middle Stage to the Current Stage (section 3.5).

3.2. Second approach 1: sei-daku has been distinctive

The second approach introduced above hypothesizes that the sei-daku distinction, or the lack of this distinction, is merely a matter of writing practice. This approach can be further divided into two positions, depending on whether or not we assume that sei-daku has remained distinctive throughout the history of Japanese. In 3.2, we will discuss the first position within this second approach, i.e. sei-daku has been phonologically distinctive but the distinction was not reflected in writing (in the Middle Stage). This assumption must be accompanied by a satisfactory account of the question mentioned in 3.1 above: why were different writing conventions adopted? More precisely, the following question must be answered: Why is there a stage in which sei-daku was not distinguished in writing if it was distinctive in the [spoken] language?
A possible answer to the question is: “Because sei-daku was rarely contrasted for the purpose of interpretation (even if contrasted in pronunciation), the distinction was simply ignored in writing conventions”, as seen in Takagi et al. (annotated) (1960: 42–46), Anonymous (1963: 375–388), etc. As long as the sei-daku distinction seldom triggered semantic confusion, it did not have to be reflected in the writing system (e.g. hasituma and hasiduma would be the same word meaning ‘loving wife’; naturally context played a role as well). The additional explanation above would explain the Current Stage as well – because sei-daku began to trigger semantic confusion, the distinction


was employed within the writing convention. However, it does not explain why sei-daku was relatively well-distinguished in the earliest literature. Hence, further explanation is added as follows (cf. Takagi et al. (annotated) 1960: 43–44): “Man’yougana were used to represent precise pronunciation, while hiragana and katakana were created to represent sound conveniently (simply and quickly)”. That is, the appearance and the loss of the sei-daku distinction in writing resulted from the different functions of man’yougana and hiragana/katakana. Man’yougana would distinguish hasituma from hasiduma because they are pronounced differently, while hiragana and katakana would not because they are the same word for ‘loving wife’.29 For convenience, it would have been better to represent hasituma and hasiduma together, i.e. with a character that represents both sei-on and daku-on (tu/du). After all, the main claim of this position is that hiragana and katakana were established without the sei-daku distinction because the users felt it most convenient for their writing systems. This claim itself is quite reasonable, if we do not assume an association between the kana system and the sei-daku distinction. There remains an important question, which is not detrimental to the claim made above. Would people really give up the sei-daku distinction just for convenience in writing despite the fact that they were well aware of the distinction? If people hear and pronounce two sounds distinctly, will it seem natural to describe them in different ways?30 In order to answer this question, or avoid answering it, some have sought to explain the three stage transitions without assuming that sei-daku was distinctive.

3.3. Second approach 2: sei-daku was indistinctive

The second approach does not necessarily assume that sei-daku was distinctive throughout the history of Japanese. Some researchers take a radical position by assuming that there was no sei-daku distinction in the past at all, but many simply assume that the auditory distinction was generally extremely hard to make and was thus confused quite often in the writing system, even though the distinction existed in speech. On either view, sei-daku was effectively indistinctive in the past, and gradually became distinctive later (in the Current Stage). In order to justify this assumption, it must be explained why sei-daku was relatively well-distinguished in the earliest literature. Hamada (1960, 1971) proposes that the “knowledge” or “skills” of the authors or editors made the distinction possible. That is, exceptional writers

could distinguish sei-daku both by hearing and in writing. One of the reasons for this is that they are assumed to have been familiar with Chinese (characters, language, and literature)31, so that they could utilize man’yougana for distinct phonetic phenomena. As discussed in section 2.4 above, the rigid sei-daku distinction was generally required in reading Chinese. Hence, being familiar with Chinese would have resulted in being able to distinguish sei-daku. There are several pieces of evidence to support Hamada’s hypothesis. Let us discuss two of them. First, even in the eighth century, in which sei-daku was well-distinguished in official documents, sei-daku was not distinguished in private documents and/or by lower-class people (see also Kamei 1970, 1985: 228). For example, the two personal letters of Shousouin kana monjo (762?)32 do not show sei-daku distinctions at all in their man’yougana use. Second, Shinsen jikyou, a set of the oldest existing Chinese-Japanese dictionaries, is relatively rigid in the sei-daku distinction, though it was written in 892.33 This is easily understood if we assume that the distinction was the result of an educated editor writing for educational purposes.34 Because there were people who could distinguish sei-daku, it is natural to assume that sei-on and daku-on would have been pronounced differently. However, it is possible that ‘common people’ did not pay attention to the sei-daku distinction. There appears to be room to suspect that they may even have been unaware of such a distinction, similar to the nasal alternations in contemporary Japanese.35 Assuming so does not require any correction of the argument presented above; rather, it accounts for things more precisely. Let us note two ways in which this is so. First, it explains why hiragana and katakana were established without the sei-daku distinction within their systems. This would have been because people generally had trouble in distinguishing sei-daku.
Originally, man’yougana was used by the elite, who could distinguish sei-daku. As man’yougana were popularized, those who were not educated enough to distinguish sei-daku started using man’yougana or their simplified forms, confusing sei-daku. Hiragana and katakana were not established or issued by one particular person, institute, or authority, at some particular moment – but by various people, over time. Second, even the writers of the earliest literature, who could distinguish sei-daku, sporadically confuse sei-daku in writing. This might have been due to the fact that the pronunciation of sei-daku in that period was rather more indistinctive than is currently believed.


3.4. First approach (sei-daku revived)

The first approach hypothesizes that the various transitions seen in the writing conventions reflect phonological perceptions of the spoken language. That is, the sei-daku distinction existed but disappeared later, and then was revived in the spoken language. Endō (1989) is one of the few scholars who try to justify this approach.36 His argument does not seem to be well-supported, but is worth discussing because parts of it may lead to a fuller understanding of this phenomenon.

Endō (1989) begins his argument by questioning Hamada’s account of the “sei-daku distinction by the elite” (see section 3.3 above). Endō points out that kakekotoba37 ‘pun(s)’ which ignore the sei-daku distinction are regularly found in the Middle Stage,38 while such kakekotoba are rarely seen in the Earliest Stage. The authors of the kakekotoba verse are supposed to be similar in social class (e.g. aristocrats) in both stages, so the sei-daku (in)distinctiveness may not actually be due to educational backgrounds. He further points out that kakekotoba, or verse in general, are appreciated not visually but verbally. That is, the sei-daku (in)distinctions in kakekotoba are in reality reflections of the spoken language. Here the possibility arises that sei-daku was in fact similar in sound quality and rather indistinctive in the Middle Stage (and therefore there was no sei-daku distinction in kakekotoba, either), while sei-daku was distinctive in the Earliest Stage (and therefore sei-daku was not ignored in kakekotoba, either).

Endō (1989) hypothesizes that the sound quality of daku-on actually changed in (or a little before) the Middle Stage, which blurred the sei-daku distinction. Nasality plays an important role in his proposal. It is well known among scholars that daku-on were very likely to be accompanied by nasality in the past. This is indicated in literature written in languages other than Japanese,39 such as works in Portuguese,40 Chinese,41 and Korean42 (see Hamada 1952a). Assuming that daku-on were generally nasalized in the past,43,44 Endō proposes an ambitious hypothesis that the nasality “neutralized” the sei-daku distinction. That is, daku-on without nasality were distinct from sei-on, but daku-on with nasality were not.45

Endō (1989) says that the nasalization of daku-on can be traced back to around 800,46 based on his earlier work (Endō 1973), which argues that the sound quality of [ b ] changed to [ mb ] around that time.47 It is reported that [m]-column sounds (ma-gyou on, i.e. / ma, mi, mu, me, mo / ) changed to [b]-column sounds (ba-gyou on, i.e. / ba, bi, bu, be, bo / ) in many words in this period (Matsumoto 1965, etc.). Another piece of evidence, out of several he provides, is the appearance of nasality in the inflections of the b-final verbs in this period; e.g. tob+ta ‘fly+PAST’ > tonda ‘flew’, which is actually similar to the sound alternation of yom+ta ‘read+PAST’ > yonda ‘read (past tense)’. These are naturally understood if the sound quality of [ b ] was closer to [ m ], such as [ mb ].48 He further assumes a similar change for other daku-on as well. See Endō (1973) for other evidence and discussion.

To summarize, Endō (1989) hypothesizes that nasalization of daku-on, which appeared around 800, blurred the sei-daku distinction, and that this is reflected in the writing of the Middle Stage. Before that, daku-on were not accompanied by nasality and were distinguished from sei-on, which is reflected in the writing of the Earliest Stage. Thus, according to Endō, the revival of the sei-daku distinction must be closely related to the disappearance of nasality (which will be discussed in 3.5 below).

The hardest part in supporting this hypothesis is the unsatisfactory explanation for why the nasalized daku-on and sei-on are indistinctive, while daku-on without nasality are distinctive from sei-on. Another difficulty is generalizing the change seen in [ b ] to other daku-on. Endō (1989) discusses the change of [ b ] to [ mb ] (i.e. ba-column daku-on) extensively, but other daku-on only briefly. Yet another remaining issue is the time when nasality appeared with daku-on. The change of [ b ] to [ mb ] may have taken place around 800, the transitional span from the Earliest Stage to the Middle Stage, but it may not have been indicative of the change in daku-on in general. It is important to remember that the nasalization of daku-on is one of the hardest issues to deal with in the study of the history of Japanese (see also fn. 43 and section 4 below).

3.5. Sei-daku in contemporary Japanese

We have reviewed three possible arguments for the transition from the Earliest Stage to the Middle Stage. In this section, we briefly address the transition from the Middle Stage to the Current Stage, and then discuss the status of the sei-daku distinction in contemporary Japanese.

Each position provides its own motivation for the transition to the Current Stage. The first argument (discussed in section 3.2) argues that it became more convenient to distinguish sei-daku in writing (e.g. because its absence had started to cause semantic confusion). The second argument (section 3.3) argues that it is because the sei-daku distinction, which had been non-existent, became contrasted. The third argument (section 3.4) explains the phenomenon in a similar fashion to the second argument, but further claims that the contrast became clearer due to the disappearance of nasality from daku-on.

However, few scholars will disagree with the assumption that the transition took place in the 16th–17th century or so. Some refer to Christian literature (kirishitan shiryou, see fn. 40) written around 1600. The sei-daku differences seen in the vocabulary there are mostly the same as the sei-daku differences seen in the contemporary Japanese vocabulary (Takagi et al. 1960: 42–43), though it is not so hard to find exceptions. Scholars assume that the sei-daku distinction became fairly stabilized a little before such literature was written, i.e. sometime in the 16th century. Others refer to the fact that the use of daku-ten increased in the first half of the Edo Era [1603–1867] (see section 2.4 above). Such scholars assume that the sei-daku distinction was tentatively stabilized in the early Edo Era, i.e. in the 17th century.49

There is a view that the sei-daku distinction today is not as stable as normally believed,50 despite the fact that Japanese speakers can clearly distinguish sei-daku auditorily and that kana (hiragana, katakana) has a solid way of representing daku-on (by daku-ten). To support this view, we can point out that loan words in contemporary Japanese, which do not have more than a 150-year history, are sometimes pronounced with different voicing values. For example, forms such as betto ‘bed’, zyanbaa ‘jumper’, batominton ‘badminton’, amezisuto ‘amethyst’, etc. (all from English) are still used by many people.51 Even recently introduced words such as zyaguzii ‘jacuzzi’ show such confusion. The functional load of voicing in Japanese may be lower than normally thought. In other words, Japanese may still be in the process of establishing the sei-daku distinction or consciousness (as a voicing distinction).
As Komatsu (1971: 26–37) proposes, the recognition of sei-daku may have been like accent (= pitch patterns) in Japanese. In the major dialects of Japanese, accent is lexically assigned to each lexical item and distinguished from other patterns, but its functional load is very low (cf. Vance 1987: 107), i.e. if someone uses an “incorrect” accent, the listener will feel “strange” but can understand it easily.52 Similarly, if someone spoke in different sei-daku, the listeners could align it with their own sei-daku distinction. This can be one explanation of why sei-daku was not reflected in the writing system in the past, also.53 The sei-daku “difference” may have been very small diachronically, and possibly synchronically as well.

4. Remaining issues

We have illustrated the sei-daku distinctions in the history of Japanese orthography, and seen possible explanations for differences in the various stages. While we investigate the data found in literature, we must try to reconstruct the sound values of sei-daku or assume the consciousness of sei-daku by Japanese speakers of the past. There remain many unresolved issues regarding this exploration. In this section, however, only the most important remaining issue is addressed: the sound values of sei-daku in relation to nasality.

In the major dialects of contemporary Japanese, sei-daku is opposed in voicing. Most sei-daku pairs differ not only in voicing but also in place and/or manner of articulation (see appendix), but the statement that sei-on are all voiceless, while daku-on are all voiced, still holds. The sei-daku opposition, therefore, is usually assumed to be the voicing opposition, implicitly or explicitly. Basically this article has taken this position as well.

However, when the sound values of sei-daku are considered diachronically, nasality must also be taken into account. That is, at least three sounds differing in manner must be considered, e.g. [ t ], [ d ], and [ nd ], for sei-daku. Since sei-daku is a binary distinction, how to group the three into two is an important issue.

Considering that [ ŋ ] is regarded as a variant of [ g ] in contemporary Japanese54 and that it is noted that “[ b ] is occasionally accompanied with nasality” by Rodriguez in the beginning of the 17th century (see fn. 43), and so forth, voiced obstruents55 (e.g. [ d ]) are perceptually grouped together with (pre)nasalized obstruents (e.g. [ nd ]) and distinguished from voiceless obstruents (e.g. [ t ]). This distinction is the sei-daku distinction. This is probably the most popular view.

Endō (1989) hypothesizes that voiceless obstruents (e.g. [ t ]) can be grouped together with nasalized obstruents (e.g. [ nd ]) due to their perceptual closeness, but distinguished from voiced obstruents (e.g. [ d ]), though motivation is not convincingly given. The sei-daku distinction assumed by him, however, is parallel to the view just given above, i.e. in terms of voicing. This is obvious from his statement “the sound values of daku-on changed (from [ d ] to [ nd ])”, etc.

There is another view that we have not discussed yet. Some support the idea of grouping oral obstruents (e.g. [ t ] and [ d ]) together and distinguishing them from nasalized obstruents (e.g. [ nd ]) (M. Takayama 1992a,b, among others56). This division is based on nasality (oral vs. nasal), rather than voicing (voiceless vs. voiced). They propose that the sei-daku distinction had been based on this distinction, i.e. non-nasal obstruents are sei-on and [partially] nasal obstruents are daku-on. Voicing of the non-nasal obstruents (i.e. sei-on) had been allophonic, e.g. voiceless word initially and voiced word internally. That is, the sei-daku distinction in the past is similar to that of the dialects currently spoken in Tohoku (north-eastern Japan) or part of southern Kyushu.57

We did not, and will not, explore this view, mainly because this is not a hypothesis to explain the diachronic transitions of the sei-daku distinction in writing. It is unclear how this view accounts convincingly for the transitions in written representation. Also, this view, so far, does not explain how, why, and when the sei-daku distinction in the past (by nasality) changed to the distinction now (by voicing). According to their assumption, voiced obstruents (e.g. [ d ]) in the past were sei-on; but now they are categorized as daku-on. We would like to have a persuasive explanation of this fact.

This, of course, does not mean that this view is not worth exploring. As noted in section 3.4, such as in fn. 44, more and more consideration is required to conclude something about the diachronic background of the appearance of nasality with obstruents. Until we reach a solid consensus, it is best to keep our eyes open to various possibilities.58 It is actually worth investigating various proposals, such as Takayama (1992b), who extensively discusses sei-daku in relation to nasality, to consider the diachronic development or change of the sound values of sei-daku. It must also be noted that the real value of Endō’s proposals (section 3.4) will become clearer as nasalized obstruents are studied in greater detail.
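The competing ways of reducing the three sounds [ t ], [ d ], [ nd ] to a binary sei-daku distinction can be laid out schematically. The following is only an illustrative sketch in Python, not part of any analysis cited here: the feature labels `voiced` and `nasal` and all function names are hypothetical, and [ nd ] is simply assumed to count as both voiced and nasal.

```python
# Three obstruent types at one place of articulation, described by two
# assumed binary properties (labels are illustrative, not from the sources).
SEGMENTS = {
    "t":  {"voiced": False, "nasal": False},
    "d":  {"voiced": True,  "nasal": False},
    "nd": {"voiced": True,  "nasal": True},
}

def group_by_voicing(seg):
    """Voicing-based view: voiceless = sei-on; voiced, oral or prenasalized = daku-on."""
    return "daku" if SEGMENTS[seg]["voiced"] else "sei"

def group_by_nasality(seg):
    """Nasality-based view: oral = sei-on; (pre)nasalized = daku-on."""
    return "daku" if SEGMENTS[seg]["nasal"] else "sei"

def perceptually_close_to_sei(seg):
    """Perceptual-closeness hypothesis: only plain voiced obstruents
    are heard as clearly distinct from voiceless ones."""
    f = SEGMENTS[seg]
    return not (f["voiced"] and not f["nasal"])

def partition(classify):
    """Split the three segments into sei and daku under one classification."""
    groups = {"sei": [], "daku": []}
    for seg in SEGMENTS:
        groups[classify(seg)].append(seg)
    return groups

print(partition(group_by_voicing))
print(partition(group_by_nasality))
print([s for s in SEGMENTS if perceptually_close_to_sei(s)])
```

Under these assumptions, the two partitions disagree only on [ d ], which is exactly the segment whose change of category the nasality-based view still has to explain.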

5. Summary

In the first half of this article (section 2), it was illustrated that in writing sei-daku was distinguished at first (Earliest Stage), then was not distinguished (Middle Stage), and is now distinguished again (Current Stage). The transitions of the kana systems, including daku-ten usage, were also illustrated. In the second half (section 3), three possible explanations for the sei-daku representations were introduced. One approach assumed that sei-daku in literature was a reflection of the actual [spoken] language, and the other approach assumed that sei-daku in literature was merely a fact within the writing system. The latter assumed that the sei-daku distinction existed phonologically in a different manner from that in writing, and allowed for a position either that sei-daku was actually distinctive or that sei-daku was indistinctive. Finally (section 4), it was briefly noted that the correlation between sei-daku and nasality could be a key for the further development of sei-daku study in terms of the history of Japanese.

Appendix

Sei-daku ‘(lit.) clear-muddy’

sei                                                     daku
hiragana  katakana  phonemic  phonetic  transcription   hiragana  katakana  phonemic  phonetic  transcription

か (加)   カ (加)   ka        (ka)      [ka]            が        ガ        ga        (ga)      [ga]~[ŋa]
き (幾)   キ (幾)   ki        (ki)      [kʲi]           ぎ        ギ        gi        (gi)      [gʲi]~[ŋi]
く (久)   ク (久)   ku        (ku)      [kɯ]            ぐ        グ        gu        (gu)      [gɯ]~[ŋɯ]
け (計)   ケ (介)   ke        (ke)      [ke]            げ        ゲ        ge        (ge)      [ge]~[ŋe]
こ (己)   コ (己)   ko        (ko)      [ko]            ご        ゴ        go        (go)      [go]~[ŋo]

さ (左)   サ (散)   sa        (sa)      [sa]            ざ        ザ        za        (za)      [dza]~[za]
し (之)   シ (之)   si        (shi)     [ɕi]            じ        ジ        zi        (ji)      [dʑi]~[ʑi]
す (寸)   ス (須)   su        (su)      [sɯ]            ず        ズ        zu        (zu)      [dzɯ]~[zɯ]
せ (世)   セ (世)   se        (se)      [se]            ぜ        ゼ        ze        (ze)      [dze]~[ze]
そ (曽)   ソ (曽)   so        (so)      [so]            ぞ        ゾ        zo        (zo)      [dzo]~[zo]

た (太)   タ (多)   ta        (ta)      [ta]            だ        ダ        da        (da)      [da]
ち (知)   チ (千)   ti        (chi)     [tɕi]           ぢ        ヂ        di        (ji)      [dʑi]~[ʑi]
つ (川)   ツ (川)   tu        (tsu)     [tsɯ]           づ        ヅ        du        (zu)      [dzɯ]~[zɯ]
て (天)   テ (天)   te        (te)      [te]            で        デ        de        (de)      [de]
と (止)   ト (止)   to        (to)      [to]            ど        ド        do        (do)      [do]

は (波)   ハ (八)   ha        (ha)      [ha]            ば        バ        ba        (ba)      [ba]
ひ (比)   ヒ (比)   hi        (hi)      [çi]            び        ビ        bi        (bi)      [bi]
ふ (不)   フ (不)   hu        (fu)      [ɸɯ]            ぶ        ブ        bu        (bu)      [bɯ]
へ (部)   ヘ (部)   he        (he)      [he]            べ        ベ        be        (be)      [be]
ほ (保)   ホ (保)   ho        (ho)      [ho]            ぼ        ボ        bo        (bo)      [bo]

From the left column: hiragana (cursive/rounded syllabary), man’yougana from which the hiragana on the left was developed [only for sei-on], katakana (partial/angular syllabary), man’yougana from which the katakana on the left was developed [only for sei-on], romanization faithful to kana syllabary (i.e. phonemic romanization, employed in this article to describe the data), romanization faithful to pronunciation (i.e. phonetic romanization, typically seen in the spelling of Japanese proper names), and [broad] transcription of actual pronunciation. [ŋ] is observed only word internally, if it appears (dialectal). See section 2 for discussion on man’yougana.
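As a side illustration of how regular the daku-ten convention is in contemporary kana, the sei-daku pairs in the table can be generated mechanically. The sketch below is not drawn from the article; it assumes the Unicode encoding of kana, in which each daku-on kana is canonically equivalent to its sei-on counterpart followed by the combining voiced sound mark (U+3099). The function name `add_dakuten` is hypothetical.

```python
import unicodedata

# U+3099 COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK (the daku-ten).
COMBINING_DAKUTEN = "\u3099"

def add_dakuten(kana: str) -> str:
    """Attach a daku-ten to one sei-on kana and compose the pair (NFC)."""
    return unicodedata.normalize("NFC", kana + COMBINING_DAKUTEN)

# The four sei-on rows of the appendix table, in hiragana.
SEI_ROWS = ["かきくけこ", "さしすせそ", "たちつてと", "はひふへほ"]

for row in SEI_ROWS:
    daku_row = "".join(add_dakuten(k) for k in row)
    print(row, "->", daku_row)
```

The same function applies to katakana, since the combining mark is shared by both syllabaries. This reflects only the modern encoding and says nothing about how sei-daku was marked in earlier stages.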

Acknowledgements

The completion of this paper was supported by the MOE (Ministry of Education) project of the Center for Linguistics and Applied Linguistics of Guangdong University of Foreign Studies, Guangzhou, China. The revised version of this article was written while I was working at the Institute of Cognitive Science, Hunan University, Changsha, China, after considerable restructuring from its original version, entitled “Sei-daku: More than a voicing difference – toward a better understanding of the rendaku phenomenon –”, written in June 2002, while I was studying at the University of Arizona, USA. Reviews from two anonymous scholars were very helpful, especially the detailed comments and suggestions from the second reviewer. Takayama Tomoaki provided me not only with helpful suggestions but also with materials that I could not obtain. This article could not have been completed without continuous help from the editors of this volume. Many thanks go to those who have commented on various drafts of this article. All remaining errors are my own.

Notes

(Japanese, Chinese and Korean names are given in the order last-first)

1. (lit.) = (literal translation)

2. Discussion in Section 2.1 is largely dependent on Hamada (1971: 44–45).

3. To explain a little further, the on reading is based on Chinese pronunciation, while the kun reading is actually native vocabulary (e.g. a word) assigned to kanji. In contemporary Japanese, most, though not all, of the frequently used kanji have the two readings. Moreover, there may be multiple on readings and/or multiple kun readings for one kanji. It is not surprising that one kanji has several readings.

4. cf. ‘Four’ in contemporary Mandarin Chinese is sì in Pinyin representation.

5. There was another way of reading man’yougana called gisho ‘fun reading’, which relies on association or imagination by the reader. For instance, two kanji [bee]-[sound] represented the sound “bu” (onomatopoetic), two kanji [ten]-[six] (i.e. ‘sixteen’) represented the sound sequence of “sisi” because 4×4 (called “si-si” ‘four-four’ in Japanese) makes 16, and so forth.

6. Kojiki (‘Record of Ancient Matters’) is a history book written by Ō no Yasumaro (who recorded what Hieda no Are said) by Imperial request. The preface to this work is written in Chinese (seikaku kanbun ‘regular Chinese’), while the text is written in highly Japanized Chinese (hentai kanbun ‘irregular Chinese’). Nihonshoki (‘Chronicle of Japan’) is another history book written by Toneri Shinnō (Shinnō ‘Imperial Prince’), etc. and is the first official document compiled by Imperial command. The text of the book is basically written in Chinese. Man’youshuu (‘Collection of Myriad Leaves’) is an anthology of Japanese traditional songs (waka) written or edited by various anonymous authors in 759, or perhaps a little later than that (around 770; the complete editorial work may have been done even later than this).

7. Using Chinese characters to show Japanese pronunciation is in fact already seen in inscriptions (kinseki-bun) a few centuries earlier. However, those lexical items are limited to proper nouns, for which it is often hard to reconstruct the original pronunciations.

8. At least 35 different man’yougana are used for the native sound of si in Nihonshoki (Tsuru 1977: 242).

9. Thus, reading man’yougana was already difficult in the next Era (Heian Era [794–1192]).

10. According to Tsukishima (1972: 384), this had already been recognized by Keichū (Waji shouin, 1691). Motoori Norinaga (Kojiki-den 1: Karina no koto, 1767) studied this issue, and his work was well expanded by his student Ishizuka Tatsumaro (Kogen seidaku kou, published in 1801).

11. Although data are limited, it is generally agreed that man’yougana in inscriptions (see fn. 7 above) in or around the Suiko Era [592–628] also had the sei-daku distinction by using different characters (cf. Tsuru 1977).

12. Kasuga (1941) reexamined man’yougana in Kojiki which had been considered dubious regarding the sei-daku distinction at that time, and concluded that they were distinguished well by different characters, with only a few exceptions. Ōno (1947–48, 1953) argued that man’yougana in Nihonshoki, which made Motoori Norinaga wonder why there were many exceptions, also generally had the sei-daku distinction by characters. Nishimiya (1960) and Tsuru (1960) argued that the sei-daku distinction was well represented by not only on-gana (man’yougana read in on reading) but also kun-gana (man’yougana read in kun reading).

13. An example of “multiple ways of representing a sound” is provided in section 2.2 above (kamo ‘(admiration marker)’ could be represented by one kanji or two ka-mo). Another way is gisho use of man’yougana (see fn. 5 above).

14. This difficulty was remarkable especially when taking supplemental notes on the Chinese literature. For reading help, difficult or special pronunciations, morphemes that Chinese lacked (e.g. particles such as case markers, inflectional endings, etc.), annotations, and so forth, were added in the margin.


15. In fact, some simplified kana had already been sporadically used earlier. For example, a few of them are seen in Shousouin komonjo: Minokoku [Mino no kuni] Kamogun Hanifuri (Hanyuuri) koseki-chou ‘Register book of Hanyuuri, Kamo County, Mino State (currently part of Gifu Prefecture)’, which is the earliest existing family register book, written in 702. The claim in the text is that kana simplification progressed actively in this period.

16. These variants of simplified characters were sometimes mixed with man’yougana (i.e. full kanji) in the literature.

17. Any simplified form of man’yougana is, thus, often grouped together and called ryakutaigana ‘simplified kana’ in contrast to magana ‘real kana’ which refers to non-simplified man’yougana (i.e. full kanji).

18. The umlaut shows that the sound belongs to the otsu type, not the actual sound quality such as lip rounding.

19. Tsuru (1977) chose this literature for discussion because it represents official documents (history books compiled by Imperial command), which are descended from Nihonshoki. They are written in the Chinese style, and man’yougana are included.

20. They were separately developed because of their preference in the field. Cursive characters were preferred in writing the native literature, while partial characters were preferred in reading (annotating, commenting, etc.) Chinese literature.

21. For example, Japanese pronunciations of Chinese characters are written in man’yougana in Konkoumyou saishououkyou ongi, a commentary on Buddhist scriptures, copied in 1079.

22. This was actually simplified (partially represented) kanji for “daku” noted in red ink (seen in Kongouchou yugarengebu shinnenju giki, 889).

23. Komatsu (1981: 70) notes that the primitive use of daku-ten on the top right corner is seen in the literature in the mid-thirteenth century ([Kanchiin] Ruijuu myougishou, written [copied] in 1241).

24. Originally they were tone marks, which later started to represent sei-daku.

25. Maruyama (1967: 1122) notes that in the Edo Era [1603–1867] people tended to think that sei-on were sophisticated and daku-on vulgar. For example, scholars of Japanese often wrote sei-on but not daku-on. This might be one of the reasons for the inconsistency.

26. This is a single small circle on the top right corner of kana, i.e. the same diacritic currently used for handaku-on ([ p ]-initial syllables).

27. e.g. Nihonshoki Tsuushaku (Commentary on Nihonshoki) by Iida Takesato published in 1902.

28. Discussions in sections 3.1 and 3.2 are largely dependent on Hamada (1971).

29. Of course, this is not identical to the statement “man’yougana is phonetic (sei-daku is distinct) and hiragana and katakana are phonemic (sei-daku is indistinct)”, which contradicts the basic assumption of this position “sei-daku has been distinctive”.

30. But see the last paragraph in section 3.5 below for a possible answer to this question.

31. Mori (1991) points out that some volumes of Nihonshoki were actually written by Chinese scholars (at least two different Chinese scholars). Ide (1989: 241) says, citing M. Inoue (1932: 223–225), that some man’yougana usage in Man’youshuu requires knowledge of the Chinese literature.

32. There are two of them: kou-monjo and otsu-monjo. They are (independently) written “at least before 762”, according to Komatsu (1981: 57). They are generally regarded as written by a person from “the low class (among intellectuals)”. M. Tanaka (1995: 196) says that they are written in rough style with sloppy characters, so the author is supposedly not highly educated.

33. See also the second paragraph of section 2.4.

34. Wamyou ruiju-shou, which was edited a little later in 934 (by Minamoto no Shitagō), has no sei-daku distinction, though it is also a set of Chinese-Japanese dictionaries. Hamada (1971: 44) notes that this is also due to the differences between the attitudes of the authors/editors toward the sei-daku distinction in writing, rather than temporal factors.

35. Syllabic/Moraic nasal in Japanese undergoes place assimilation. It becomes [ m ] before a bilabial sound, [ n ] before an alveolar sound, [ ŋ ] before a velar sound, and [ ɴ ] elsewhere (word-final, etc.). However, they are all recognized as the same sound and written in the same hiragana or katakana by native speakers of Japanese (Komatsu 1981: 58–59). Sometimes [ ɴ ] (i.e. unassimilated) is observed in the environments of assimilation (e.g. before a bilabial sound). Native Japanese speakers will not recognize it as special or different.

36. Endō (1989) is a collection of his papers published in 1971–1988.

37. Kakekotoba is a rhetorical device mainly used in verse, by which two (or more) readings are available from one expression. That is, homonymic expressions are exploited there. e.g. matu > ‘wait(ing)’, ‘pine (tree)’

38. For example, in the traditional song (waka) in Kokinwakashuu (edited in around 913): wasurenanto omohukokorono tukukarani arisiyorigeni maduzokohisiki (14: 718), two readings are available from “madu” – madu ‘first (of all)’ and matu ‘wait(ing)’.

39. Such nasality is not indicated in Japanese literature at all.

40. Around 1600, several books and dictionaries related to Japan or Japanese were written by missionaries of the Society of Jesus. They are called Kirishitan shiryou ‘Christian literature’, represented by Arte da Lingoa de Iapam [Nihon daibunten] (1604–1610) by João Rodriguez. These are most reliable for discussion of sound values of sei-daku because they are relatively new and most of them are written in alphabets which distinguish voicing by using different letters. What Rodriguez actually says is that the preceding vowel is nasalized, but Hamada (1952a: 21, note 9 on p. 31) says that the nasality must be accompanied with daku-on because: (i) the nasality is also expected word-initially; (ii) the pronunciation of daku-on shares the property of nasals in youkyoku (singing Noh), heikyoku (singing Heike monogatari ‘Tale of Heike’), citing Iwabuchi (1934).

41. Helinyuli [Kakuringyokuro] edited by Luo Dajing [Ra Taikei] (13c, 1252?), Ribenjiyu [Nihonkigo] edited by Xue Jun [Setsu Shun] (1523), etc. In these works, daku-on are usually preceded by a coda nasal. e.g. f n-zh for hude ‘(writing) brush’. Pronunciations are given in contemporary Mandarin Chinese here for convenience.

42. Iropa [Iroha] (author unknown) (1492), Cheophaesineo [Shoukaishingo] by Gang Useong [Kō Gūsei] (1676), etc. They spell daku-on in a similar fashion to the Chinese literature (see fn. 41 above), i.e. they put a nasal coda before daku-on. One might think that it is showing the voicing of the obstruent rather than the nasalization of daku-on, mentioning that Hangul (Korean syllabary) itself has no way to show voicing. In fact, Hangul had a letter for / Z / and it was used for the Japanese / z / sound in the works cited above; nonetheless, it is preceded by a nasal coda (see also Hamada 1952b). In Nosondang Ilbonhaengnok [Roushoudou Nihonkouroku] by Song Huigyeong [Kikei Sō] (1420?), Haedongjegukgi [Kaitoushokokuki] edited by Sin Sukju [Shin Shukushū] (1471), etc., Japanese place names are written with Chinese characters, using the same representation for daku-on as seen in the Chinese literature mentioned in fn. 41 above. Special thanks go to Choi Kyung-Ae who helped me to transliterate the authors and titles using the new Korean romanization system.

43. Precisely speaking, Rodriguez writes in Arte da Lingoa de Iapam [Nihon daibunten] (see fn. 40 above) that nasality is always observed before D, DZ, G (/ d, g /, where / d / includes [ dz ] before / u /) and occasionally before B (/ b /). However, considering examples in Korean literature (see fn. 42 above) and kè- n-zh for kaze ‘wind’ (Helinyuli), huáng-bng for obou ‘monk’ (Helinyuli), yn-b-jì for ibiki ‘snore’ (Ribenjiyu), etc. in Chinese literature, it will be reasonable to assume that daku-on were generally accompanied by nasality around the 14th century.

44. It will be fair to note that it was kindly pointed out to me by an anonymous reviewer that this is not a well-established or uncontroversial view. Rodriguez does not say daku-on are consistently accompanied with nasality (see fn. 43 above). Some argue that the coda nasal before daku-on may not represent nasality but emphasize voicing of the following obstruent (e.g. Fukushima 1959). However, it is also a fact that there is no strong evidence to refute the assumption (daku-on were generally accompanied with nasality); and therefore, not so many scholars seem to reject it. This issue, whether daku-on were consistently nasalized or not, is still debated.

45. Endō (1989) does not provide any convincing explanation or evidence for what he calls “neutralized”. However, his hypothesis itself is actually very suggestive, especially when we compare the onomatopoeias that contrast in sei-on, daku-on, and nasals. It is well known that the sei-daku contrast in Japanese onomatopoeia results in contrastive meanings, such as positive and negative, clear and dirty, light and heavy, etc., respectively. For example, one’s eyes are kirakira when enjoying something, while they are giragira when in hunger. Let us consider the following examples: surusuru/zuruzuru vs. nurunuru and torotoro/dorodoro vs. noronoro. (All examples here are mine.) Surusuru refers to smooth action such as sliding down a rope, while zuruzuru refers to the friction caused by moving a heavy object; and nurunuru refers to the oily and/or slippery surface, which is much closer to surusuru (lubricious) and opposite of zuruzuru (frictional). Moreover, turuturu also refers to a slippery surface. Moving on to the other set, torotoro refers to slow action or transition, while dorodoro refers to the state or movement which is jelly-like or pulpy; and noronoro is to describe that some movement is slow, which is closer to the meaning of torotoro. These are interesting, but it is unclear how far such impressions of modern speakers can serve as evidence for the intuitions of people hundreds of years ago. See Komatsu (1981: 107–110) for an interesting discussion of the [nasal]–[daku-on] ([ n ]–[ d ]) contrast that functions similarly to the sei-daku contrast in meaning as described above. (His example is nora vs. dora, both of which have a meaning of ‘stray’.)

46. Hamada (1952a: 27f) also assumes the change (nasalization over daku-on) in the same period, but for different reasons. He assumes that the change was triggered by the influence from Chinese that had pre-nasalized obstruents around this period. However, many others take a prudent attitude to this assumption. Considering that education, or knowledge of Chinese, in that period was strictly limited to a small set of people, it is difficult to accept an assumption that the knowledge (and preference) of nasalized obstruents by such a limited group changed the sound quality of the entire Japanese sound system. The existence of it is confirmed by foreign literature (see fn. 40–43). However, not much is actually known about the emergence of the nasalization of daku-on (see also section 4 below).

47. Then the nasality weakened to something like [ mb ] a few centuries later to be realized as described by Rodriguez (see fn. 40 above).

48. Mabuchi (1971) also indicates the sound change of [ b ] in this period.

49. In the previous paragraph, it is expressed that the sei-daku distinction was “rather” or “tentatively” stabilized. This is because daku-on had been inconsistently specified in the Japanese literature throughout the Edo Era, as mentioned in section 2.4 above.

50. Many do not deny this view. Hamada (1960: 7–8) notes this, and so does M. Takayama (1992b: 45).

51. Many fix their pronunciations after they study the spellings of those words. That is, for Japanese speakers, the voicing distinction is still hard just through listening.

Sei-daku: diachronic developments in the writing system


52. In other words, it is “social rather than linguistic” (Vance 1987: 107).
53. It is also suggestive that most Japanese speakers cannot easily name an accent pattern, e.g. HLLL, LHL, etc. (L = Low and H = High), even if they can distinguish them. The sei-daku distinction in the past may be similar to this: even if speakers could distinguish the sounds in speech, they could not write them down correctly.
54. [ g ] is supposed to have been [ ŋg ] in the past, and [ ŋ ] would have developed from [ ŋg ] because [ ŋ ] was not a phoneme. It is sometimes taught that / g / is prescriptively [ ŋ ] word internally, but this is actually dialectal. The velar nasal is not clearly observed in the dialects of the western half of Japan. Even in some dialects of the eastern half of Japan, the younger generations do not produce the nasal consistently (e.g. Tokyo dialect).
55. Whenever “nasal” is unspecified, e.g. “voiced obstruents”, they are “oral” or “non-nasal” sounds.
56. Takayama (1992a, b) says that he followed Hayata (1977a, b), although Hayata, in fact, just mentioned it without providing any evidence in his articles.
57. Yamane Tanaka (this volume) discusses voicing and nasality of obstruents in Tohoku dialects.
58. This is one of the reasons this article focuses on introducing different views, rather than proposing or concluding something new.

The representation of laryngeal-source contrasts in Japanese

Kuniya Nasukawa

1. Introduction

With a focus on generative restrictiveness and the significance of cross-language variation in source contrasts, this paper identifies those phonological primes responsible for creating laryngeal-source contrasts in Japanese. The arguments will be based on Element Theory (Kaye, Lowenstamm and Vergnaud 1990; Harris 1994; Harris and Lindsey 1995, 2000). Unlike theories based on SPE-type distinctive features, this theory of melodic representation recognizes primes which are monovalent and which can therefore be interpreted separately without needing to be combined with other primes. The theory admits only two autonomous melodic categories for cross-linguistic source contrasts: one contributes aspiration and the other prevoicing. This paper claims that Japanese exploits only the prevoicing element in the representation of phonation-type contrasts, this position being supported by evidence from assimilatory processes, early language acquisition and aphasia. The argument leads to the further claim that vowel devoicing is not a process triggered by the laryngeal element for aspiration, but by a manner element called ‘noise’. Furthermore, in accordance with the general trend towards reducing the size of the element inventory, the paper will discuss the validity of a recent proposal to merge the prevoicing and nasal elements.

2. Phonetic characteristics of the laryngeal-source distinction

It is widely acknowledged that Japanese exhibits a two-way source contrast between ‘voiced’ and ‘voiceless’. The term ‘voiced consonants’ refers to the set of sounds represented by the phonetic symbols b, d, g, z while ‘voiceless’ corresponds to p, t, k, s. These labels are variously exploited in the literature, and usually suffice as a means of identifying the contrasts in phonemic and pedagogic terms.

However, it has been acknowledged that the use of these terms is insufficient for describing the varied phonetic manifestations of the source contrasts across different languages. In articulatory terms, for example, the so-called ‘voiced’ obstruent plosives b, d, g in most Slavic (such as Polish and Russian) and Romance languages (such as Spanish and French) are typically produced in word-initial position with glottal pulsing during articulatory closure: in precise phonetic terms, they are described as voiced unaspirated. The ‘voiceless’ plosives p, t, k in those languages are articulated without glottal pulsing and are described as voiceless unaspirated. On the other hand, the ‘voiced’ plosive series of some Germanic languages such as English and Swedish, for example, is typically produced in word-initial contexts without glottal pulsing, and is identified as voiceless unaspirated, which is thus identical to the ‘voiceless’ series of most Slavic and Romance languages. The ‘voiceless’ series in those Germanic languages is also articulated without glottal pulsing, but with aspiration: it is described as voiceless aspirated. The cross-linguistic differences in the phonetic realization of the two-way contrast are often captured by voice onset time (VOT), which is the word-initial interval between the release of the stop closure and the onset of vocal-fold vibration (Lisker and Abramson 1964; Abramson and Lisker 1970). Members of the voiced unaspirated (truly voiced) series, as found in Spanish and French, are characterized by a relatively long lead time between the onset of glottal pulsing and stop release. On the other hand, in the voiceless aspirated (fortis) series found in languages such as English and Swedish, there is a relatively long time lag between closure release and the onset of glottal pulsing.
In the voiceless unaspirated (lenis or neutral) series, which is common to all languages of the world, there exists either a relatively short or zero time lag between closure release and the onset of glottal pulsing. It is possible to identify the perceptual source distinction using experimental phonetic methods. Spectrographic analysis reveals how the VOT value in an initial CV context is reflected in the cutback of the onset of the first formant (F1) relative to the higher formants. According to experimental tests involving the discrimination of contrasting initial CV contexts (Lisker and Abramson 1970), for English speakers the perceptual boundary of the b-p distinction is observed in the VOT value from +20 to +30 msec, whereas the b-p boundary for Spanish speakers lies in the VOT value +10 to +20 msec. To avoid confusion arising from the use of the cover terms ‘voiced’ and ‘voiceless’, the experimental phonetics literature often employs the terms voiced unaspirated, voiceless aspirated and voiceless unaspirated to refer to

long voicing lead (negative VOT), long voicing lag (positive VOT) and short or zero voicing lag (zero VOT) in the description of the interval between the stop release and the onset of voicing. With respect to these three VOT categories, languages are classified into at least four groups (there exists a fifth type which will be discussed in §5) as follows: (1)

VOT systems of two-way contrast

Type   Contrast    Short lag   Long lead   Long lag
I      One-way     ✓           –           –
II     Two-way     ✓           ✓           –
III    Two-way     ✓           –           ✓
IV     Three-way   ✓           ✓           ✓

Types II and III are systems exhibiting a two-way source contrast: Type II – to which Spanish and French belong – displays short-lag and long-lead plosives; Type III – to which English and Swedish belong – shows short-lag and long-lag plosives. There are two other types of source contrasts: Type I – which is found in Finnish – exhibits only the short-lag series; on the other hand, Type IV – which is the system observed in Thai and Burmese – employs all three source contrasts. It should be noted that the short-lag series is always present in every language system. Japanese, like Spanish and French, is considered as belonging to Type II, since it exhibits a contrast between short voicing lag and long voicing lead in word-initial plosives: in initial consonants, the language does not show the aspiration which is a property of Type III languages. Also, experimental tests for the discrimination of source contrasts in initial CV contexts in Japanese (Shimizu 1977) reveal that the perceptual boundary of the b-p distinction lies in the VOT value from +15 to +20 msec, which is almost identical to the result for Spanish.
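As an informal illustration only, the VOT typology described above can be written down as sets of VOT categories. The following Python sketch is not part of the original analysis: the names VOT_SYSTEMS, EXAMPLES, contrast_size, has_aspiration and short_lag_is_universal are hypothetical, and the language-to-type assignments simply restate those given in the text.

```python
# Hypothetical encoding of the four VOT system types described in the text.
# Each type is modelled as the set of VOT categories its plosives exploit.
VOT_SYSTEMS = {
    "I": {"short lag"},
    "II": {"short lag", "long lead"},
    "III": {"short lag", "long lag"},
    "IV": {"short lag", "long lead", "long lag"},
}

# Language assignments as stated in the text (illustrative, not exhaustive).
EXAMPLES = {
    "Finnish": "I",
    "Spanish": "II",
    "French": "II",
    "Japanese": "II",
    "English": "III",
    "Swedish": "III",
    "Thai": "IV",
    "Burmese": "IV",
}


def contrast_size(vot_type):
    """Number of contrasting source categories in a system type."""
    return len(VOT_SYSTEMS[vot_type])


def has_aspiration(language):
    """True if the language's system includes the long-lag (aspirated) series."""
    return "long lag" in VOT_SYSTEMS[EXAMPLES[language]]


def short_lag_is_universal():
    """Check that the short-lag series is present in every system type."""
    return all("short lag" in system for system in VOT_SYSTEMS.values())
```

Under this encoding, has_aspiration("Japanese") is false and has_aspiration("English") is true, which restates the Type II versus Type III difference discussed above.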

3. The representation of laryngeal-source distinctions

Current phonological theories are agreed that speech sounds are decomposable into smaller categories. The existence of these categories is widely supported by the notion of natural classes; furthermore, they are considered to be part of the linguistic aspects of the human genetic endowment. These categories have been investigated within a number of different theoretical frameworks, and have been variously labeled as distinctive features (SPE, et passim), components and gestures (in the framework of Dependency Phonology: Anderson & Ewen 1987, van der Hulst 1989), particles (Schane 1984, 1995) and elements (Kaye, Lowenstamm and Vergnaud 1985; Harris 1990, 1994; Harris and Lindsey 1995, 2000). In order to represent cross-linguistic source distinctions, various types of phonological primes have been proposed within this range of theories. Distinctive feature theories, for example, use the bivalent features [±voice] and [±tense] (see Jakobson, Fant & Halle 1952, Chomsky and Halle 1968, and others). In addition to these, we find references to [±heightened subglottal pressure] and [±glottal constriction] in Chomsky and Halle (1968), [±spread glottis], [±constricted glottis], [±stiff vocal cords] and [±slack vocal cords] in Halle and Stevens (1971), and [±fortis] in Kohler (1984). In frameworks employing monovalent distinctive features, the single-valued prime [voice] can be found (see Itô, Mester & Padgett (1995), Lombardi (1995) and others). Also, [stiff vocal cords] is employed in Halle & Stevens (1991) while [spread glottis] is used in Jessen & Ringen (2001). In Dependency Phonology (Anderson & Ewen 1987), which employs only monovalent primes, the addition of the |V| component in the phonatory sub-gesture represents voicing while its absence indicates voicelessness. In the variant of Dependency Phonology known as Radical CV phonology (van der Hulst 1995), C under the phonatory sub-gesture is for constricted glottis, Cv for spread glottis, V for obstruent voicing and the absence of these components for voicelessness in obstruents. Particle Phonology investigates mainly vocalic systems and does not provide any significant details about the melodic representation of source contrasts in consonants. Like Dependency Phonology, Element Theory can be traced back to Anderson & Jones (1974).
However, more recent developments in Element Theory have necessitated a change of name, and the label Government/Licensing-based Phonology (Kaye, Lowenstamm & Vergnaud 1985, 1990) has become the accepted term. In this framework the two elements [L] and [H] are employed to express laryngeal-source contrasts: [L] for long voicing lead (truly voiced); [H] for long voicing lag (voiceless aspirated); the absence of these elements stands for short or zero voicing lag (neutral). Most studies investigating the phonological phenomena of Japanese have traditionally employed the bivalent feature [±voice] to represent the two-way source contrast (Itô & Mester 1986, et passim): [+voice] and [–voice] for long voicing lead and short voicing lag respectively. In recent theories

which claim that source contrasts involve a privative prime (Rice 1992, Itô & Mester 1993; Itô, Mester & Padgett 1995), the monovalent feature [voice] is typically employed: the existence of [voice] refers to long voicing lead and its absence to short voicing lag. The first Element Theory analysis of source contrasts in Japanese was given in Shohei Yoshida (1990, 1996), where both elements [L] and [H] are employed. Without taking into consideration any of the cross-linguistic differences regarding source contrasts, he utilises [L] for ‘voiced’ obstruents and [H] for ‘voiceless’ cognates. An alternative approach such as Nasukawa (1995), however, succeeds in incorporating the cross-linguistic facts concerning source distinctions into the melodic analysis of Japanese. Specifically, it is claimed that [L] is the only element required for the source contrast in Japanese: its existence is interpreted as long voicing lead (‘voiced’ series of obstruents) and its absence as short voicing lag (‘voiceless’ series of obstruents). Furthermore, by merging [L] and the murmur/nasal element [N], Nasukawa (1998) proposes that long voicing lead phonetically manifests itself when [N] is the head of a given melodic expression. In the context of Element Theory, this paper will consider (i) how the phonological primes involved in laryngeal-source contrasts contribute to the internal organisation of individual sounds, and (ii) how the representation of those primes succeeds in incorporating implicational universals as well as the phonological properties associated with source contrasts.

4. Oppositions, interpretation and definition in melodic representation

Let me first identify the theoretical characteristics which define Element Theory. In comparison with the other sub-segmental theories, elements are defined by at least the following three tenets:

(2) An element is
    a. monovalent in terms of phonological oppositions,
    b. the minimal unit of interpretation by the sensorimotor system, and
    c. an information-bearing pattern that humans perceive in speech signals.

Firstly, (2a) is concerned with the way melodic information is expressed. In element-based theory, the binary nature of phonological contrasts is represented through the presence (activeness) or absence (inactiveness) of a given

monovalent prime. For instance, as in some melodic theories employing gestures or particles, ‘voice’ contrasts are encoded by the presence versus the absence of the ‘voice’ prime in a given expression. Some feature-based theories such as Itô, Mester & Padgett (1995) and Lombardi (1995) take the same theoretical stance. In contrast, the competing notion of bivalent oppositions is exploited by orthodox distinctive-feature theories (SPE, et passim), where an opposition is derived by assigning plus and minus values to a given prime. For instance, the voice contrast is captured by the voice prime, to which a plus or minus value is assigned. Under this view, bivalency produces at least three possibilities: [+voice] is active in processes; [–voice] is active; and both [+voice] and [–voice] are simultaneously active. As the literature indicates, however, the processes involving laryngeal-source contrasts in Japanese do not employ these three options in equal measure: [+voice] is traditionally said to trigger most dynamic processes such as postnasal voicing and compounding; [–voice] rarely triggers processes except for high vowel devoicing between voiceless consonants; and no simultaneous participation of both features is attested. Furthermore, in the context of a rule-based multi-stratal model, the bivalent format substantially over-generates the number of unattested processes. In a model exploiting the notion of monovalency, on the other hand, the ‘voice’ contrast is captured by the presence or absence of the ‘voice’ prime. In this case, only the prime which is present in a given context can be active for processes such as voicing assimilation. Its absence means a failure to participate in any processes. This adequately describes the asymmetric processes involving the ‘voice’ prime, yet it does not generate processes which are unattested.
Secondly, (2b) states that an element can be interpreted separately without needing to be combined with other elements. Indeed, this property is common to all theories which adopt the notion of monovalency. This suggests that information from phonological representations is accessible by the sensorimotor systems. As Harris & Lindsey (1995) discuss, this approach succeeds in eliminating redundancy rules of the kind which fill in predictable feature values, and instead pursues a mono-stratal approach to phonology. In orthodox feature theories, on the other hand, a single prime cannot be interpreted without being harnessed to the signatures of other primes. This implies that the minimal units of phonetic interpretation are segments, not features. The element-based approach (Harris & Lindsey 2000) is also characterised by (2c), which states that elements are not defined by properties such

as tongue height or formant height: rather, they are sound images which comprise information-bearing patterns that humans perceive in speech sounds. They should be detectable through the traditional method of determining the manner in which sounds are organized into systems and natural classes. This view is based upon the assumption that speech sounds are represented cognitively as auditory images – primary media which are neutral between speaker and hearer: speakers transmit and monitor such information and listeners receive it. This position is rarely touched upon in the literature of other representational approaches such as orthodox feature theory, where features are chiefly defined in terms of articulation or raw acoustics (Chomsky & Halle 1968; Clements & Hertz 1991) or otherwise in terms of coexisting articulatory and acoustic specifications (Flemming 1995). As a challenge to non-element-based theories, Harris & Lindsey claim that articulation and raw acoustics are not information-bearing categories: articulation is a delivery system for linguistic information and raw acoustics is a mere outcome delivered by articulation.

5. Laryngeal-source elements

Element Theory employs two autonomous melodic primes: the low source element labelled [L] and the high source element labelled [H], which are phonetically interpreted in obstruents as long voicing lead (true voicing, prevoicing) and long voicing lag (aspiration) respectively. The auditory images of these elements are, as the names imply, low source and high source, the acoustic patterns of which appear in spectrograms as lowered fundamental frequency (F0 down) and raised fundamental frequency (F0 up) respectively. The nearest corresponding features might be taken to be [slack vocal cords] and [stiff vocal cords] respectively (Halle & Stevens 1971, 1991), although the equivalence is found only in terms of glottal execution.
Whereas [slack vocal cords] and [stiff vocal cords] are defined in terms of articulation, [L] and [H] may be treated as phonologically defined auditory images. In addition, [slack vocal cords] and [stiff vocal cords] are intended only for representing laryngeal activity (that is, phonation-type distinctions in consonants and also tonal distinctions in syllable nuclei) (Bao 1990, Halle & Stevens 1991, cf. Yip 1980), whereas [L], as we will see in §7, is also relevant to the representation of nasality (Nasukawa 1995, 1998, 2005a; Ploch 1999). Harris (1998) claims that the availability of the two monovalent elements [L] and [H] implies the possibility of the following four combinations:

(3) Element specification of source contrasts

Source elements   Phonetic manifestation
Non-specified     Short or zero voicing lag (neutral)
[L]               Long voicing lead (truly voiced)
[H]               Long voicing lag (voiceless aspirated)
[L, H]            Long voicing lead and lag, murmur (breathy)

Besides the specification of either [L] or [H] alone, there are two further possibilities: both elements are unspecified in obstruents, and both elements are simultaneously specified in a given melodic expression. The former option, where no source category is specified, manifests itself phonetically as short or zero voicing lag in obstruents. The latter option, where both source categories are specified together, is interpreted as long voicing lead and lag (breathy voice), which is associated with the voiced aspirated plosives found in, for instance, Hindi and Gujarati. These combinatorial specifications are selected by parameter on a language by language basis. The typology of source-element specifications is illustrated below: (4)

Typological specification of [L] and [H]

Type   E.g.      Non-specified   [L]   [H]   [L, H]
I      Finnish   ✓               –     –     –
II     Spanish   ✓               ✓     –     –
III    English   ✓               –     ✓     –
IV     Thai      ✓               ✓     ✓     –
V      Hindi     ✓               ✓     ✓     ✓

In presenting the VOT typology in (4), Harris (1994, 1998) claims that its arrangement straightforwardly captures implicational universals and allows us to identify those segmental classes that are active in processes involving laryngeal source. With respect to implicational universals, the relative markedness of source contrasts is explained in terms of compositional complexity. The unmarked setting is represented by the absence of source elements and corresponds to the short or zero voicing lag found in all languages. This nonspecification of source elements is regarded as the baseline on to which source elements are superimposed: so the existence of any source elements

implies the existence of short or zero voicing lag – the manifestation of non-specification of source elements. In addition, the most complex (Type V) systems allow the combination of [L] and [H], which results in a four-way contrast. In this case, the existence of the [L]-[H] combination implies those parameter settings which permit the existence of sole [L], sole [H] and also the absence of both [L] and [H]. Furthermore, both the active and the inert members of a segmental class are straightforwardly captured by the presence or absence, respectively, of a particular source element. For instance, true-voicing assimilation (spreading of [L]) is only observed in languages which employ [L], such as Polish and Serbo-Croatian; and similarly, voiceless assimilation (to voiceless aspirated obstruents) (spreading of [H]) is only observed in languages such as English which employ [H]. Members of the neutral series of obstruents may undergo processes such as neutralisation and lenition in certain contexts, but they never participate in processes as triggers: for example, final devoicing in Polish, Northern German and Catalan is explained in terms of the suppression of [L]; and in the phonology of English-speaking children the favoured neutral state in obstruents is accounted for in terms of the suppression of [H]. In Element Theory, the source elements [L] and [H] play a dual role: while they contribute the phonation-type contrasts in non-nuclear positions, as just discussed, in syllable nuclei [L] and [H] are interpreted as low tone and high tone respectively (Harris 1998). This dual role stems from the close correlation in tone languages between voicing in non-nuclear positions and register differences in nuclear positions. In diachronic terms, tone on nuclei has developed partially or fully from phonation-type contrasts in neighbouring non-nuclear sites (Yip 1980; Bao 1990; Halle & Stevens 1991).
It should here be noted that [L] and [H] represent tonal contrasts but not voice activity in nuclei. This reflects the fact that voice in nuclei is characterised by spontaneous voicing, which the theory considers to be an innocent by-product of its manner characteristics (Harris 1994: 135–136). (Sonorant devoicing such as high vowel devoicing in Japanese is a different issue. As we will see in §6.3, it has nothing to do with source elements.) In contrast, the unmarked (lexically unspecified) laryngeal state for obstruents is voiceless unaspirated. The state of true voicing or aspiration is not spontaneous since the presence of a source element [L] or [H] is superimposed upon the unmarked state.
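The element specifications and the implicational reasoning of this section can be made concrete in a small Python sketch. It is offered only as an informal illustration under stated assumptions: source specifications are modelled as sets of the labels "L" and "H", the implicational universals are read as "every specification in a system implies all of its simpler sub-specifications", and the names MANIFESTATION, TYPOLOGY, respects_implications and can_trigger are hypothetical rather than part of Element Theory itself.

```python
from itertools import combinations

# Source-element specifications and their phonetic manifestation, as in (3).
MANIFESTATION = {
    frozenset(): "short or zero voicing lag (neutral)",
    frozenset({"L"}): "long voicing lead (truly voiced)",
    frozenset({"H"}): "long voicing lag (voiceless aspirated)",
    frozenset({"L", "H"}): "long voicing lead and lag (breathy)",
}

# Parameter settings per language type, following the text's description of (4).
TYPOLOGY = {
    "I": [frozenset()],
    "II": [frozenset(), frozenset({"L"})],
    "III": [frozenset(), frozenset({"H"})],
    "IV": [frozenset(), frozenset({"L"}), frozenset({"H"})],
    "V": [frozenset(), frozenset({"L"}), frozenset({"H"}), frozenset({"L", "H"})],
}


def respects_implications(system):
    """Assumed reading of the implicational universals: a system containing
    a specification must also contain every proper subset of it."""
    for spec in system:
        for size in range(len(spec)):
            for subset in combinations(spec, size):
                if frozenset(subset) not in system:
                    return False
    return True


def can_trigger(spec, element):
    """Only an element that is present can be active in a process."""
    return element in spec
```

On this reading all five types in TYPOLOGY pass respects_implications, whereas a hypothetical system containing only [L] without the neutral baseline would not. Likewise, can_trigger(frozenset(), "L") is false, mirroring the claim that neutral obstruents never act as triggers.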

6. Japanese as a Type II language

6.1. Phonetic evidence

There are several pieces of evidence to support the claim that Japanese belongs to the group of Type II languages in (4).1 One piece of phonetic evidence, as §2 has already mentioned, comes from the observation that, as in Spanish and other languages belonging to the Type II category, aspiration (which is a characteristic of two-way source contrasts in Type III languages) cannot be detected in typical contexts for VOT measurement in Japanese. The ‘voiceless’ series of two-way source contrasts in Japanese comprises voiceless unaspirated consonants. This is supported by the results of tests for source discrimination in initial CV contexts. Shimizu (1977) claims that the perceptual boundaries of the b-p, d-t and g-k distinctions lie in the VOT value from +15 to +20 msec, from +20 to +30 msec, and from +20 to +30 msec, respectively. These characteristics of VOT perception are almost identical to those found in a Type II language like Spanish. In contrast, as Shimizu argues, Type III languages (showing another two-way source contrast) such as English exhibit rather different values for the perceptual boundaries in the same b-p, d-t and g-k distinctions: from +20 to +30 msec, from +30 to +40 msec, and from +30 to +40 msec, respectively.

6.2. Active processes of true voicing

Phonological evidence for true voicing in Japanese obstruents comes mainly from two areas: assimilatory and concatenating processes (which will be discussed here), and early language acquisition and aphasia (which will be discussed in §6.4). First, as confirmed by the literature in general (see also the other papers in this volume), Japanese displays active processes involving true voicing: postnasal voicing and sequential voicing known as Rendaku. Postnasal voicing is generally described as categorical voicing assimilation occurring in nasal-obstruent clusters (e.g. kaŋgae ‘thought’ within a single morpheme and kande ‘chew’ from kam + te across a morpheme boundary). This phenomenon is also attested in some Type II languages such as Campa (Arawak), Quichua and Zoque. (Some Type IV languages such as Thai, which also employ [L] as the category for true voicing, also exhibit this sort of pattern.)

The true voicing process of Rendaku is also regarded as categorical. Under Rendaku, an initial ‘voiceless’ (voiceless unaspirated) consonant of the second member of a non-coordinate compound is realised as its ‘voiced’ (truly voiced) counterpart (e.g. onna + kokoro → onnagokoro ‘woman’s heart’)2. This process has its origins in the diachronic change which derived voicing from the genitival particle no (or ni) by eliding the vowel and then absorbing the nasal into the following ‘voiceless’ obstruent (Unger 1977; Vance 1983, 1987). In light of this fact, Itô and Mester (1986), for example, assume that the source of voicing is a compounding conjunctive morpheme, which consists of no melodic content other than a voice feature appearing at the left edge of the second member of a compound. In terms of elements, the sole independent source category [L] participates in both of the phenomena outlined above. In postnasal voicing, [L] in the nasal of a nasal-obstruent cluster spreads to the following obstruent. In the case of Rendaku, the compounding conjunctive morpheme consisting of [L] docks on to the initial obstruent position of the second member of a compound unless the process violates Lyman’s Law.
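The docking of [L] under Rendaku can be sketched procedurally. The Python fragment below is a deliberately simplified illustration and not the analysis proposed here: it works on plain romanised strings, it assumes the common formulation of Lyman’s Law (Rendaku is blocked when the second member already contains a voiced obstruent), and the names VOICED_COUNTERPART, violates_lymans_law and rendaku are hypothetical. The pairing of h with b is likewise an assumption of this sketch about the romanisation, and lexical exceptions to Rendaku are ignored.

```python
# Assumed romanised pairs: neutral obstruent -> truly voiced ([L]-bearing) counterpart.
VOICED_COUNTERPART = {"k": "g", "s": "z", "t": "d", "h": "b"}
VOICED_OBSTRUENTS = set("bdgz")


def violates_lymans_law(second_member):
    """Assumed formulation: the second member already holds a voiced obstruent."""
    return any(segment in VOICED_OBSTRUENTS for segment in second_member)


def rendaku(first_member, second_member):
    """Dock [L] on the initial obstruent of the second member of a compound,
    unless doing so would violate Lyman's Law."""
    if not second_member:
        return first_member
    initial = second_member[0]
    if initial in VOICED_COUNTERPART and not violates_lymans_law(second_member):
        return first_member + VOICED_COUNTERPART[initial] + second_member[1:]
    return first_member + second_member
```

With the example from the text, rendaku("onna", "kokoro") gives onnagokoro. A second member that already contains a voiced obstruent, such as kaze, is left unchanged, which is the blocking effect attributed to Lyman’s Law.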

6.3. Devoicing

Turning to the so-called ‘voiceless’ members of the two-way laryngeal contrasts in Japanese, these involve no source specification and are therefore predicted to be phonologically inert. That is, the members of this neutral series of obstruents can undergo laryngeal-source assimilation as in postnasal voicing but cannot trigger the process. At first sight, however, phenomena such as devoicing do seem to contradict this prediction. Both this volume and also the wider literature report that Japanese exhibits the notable exception of vowel devoicing, which most frequently occurs when a high vowel is flanked by ‘voiceless’ obstruents: e.g. aki̥ta ‘Akita’ (place name). According to one view, as explained by Tsujimura (1996: 27–28), the voiceless properties of both flanking consonants affect the intervening vowel: the value of [±voice] in the vowel changes from plus to minus. For this reason alone, it is often assumed that the ‘voiceless’ property in obstruents can be phonologically active in Japanese. Within the framework of Element Theory, however, unlike distinctive feature theories, the ‘voiceless’ (neutral) obstruents in Japanese, as well as all vowels and sonorant consonants, have no source specification which can act as a trigger for laryngeal assimilation. Instead, I assume that vowel devoicing in Type II languages is related to the active status of the noise element [h], which manifests itself acoustically as aperiodic energy. The closest corresponding feature might be [+continuant], although unlike [+continuant], [h] is present not only in fricatives and affricates but also in plosives (Harris and Lindsey 1995: 73–4). It is always specified in the internal organisation of obstruents (except the glottal stop ʔ). It will be recalled that the context which triggers high vowel devoicing makes crucial reference to obstruents: segments flanking high vowels must be not only ‘voiceless’ but also obstruents. In element terms, this condition is described by the statement that [h] (which is required in obstruents) in non-nuclear positions flanking a high vowel nucleus affects the nuclear position only if [h] is not combined with any laryngeal-source specification (that is, no [L] in Japanese – which indicates a neutral obstruent). Following the analyses of the interpolation of nasality in Cohn (1993) and Nasukawa (1995, 2005a), I assume that the extension of [h]’s phonetic signature over a flanked nucleus results from the phonetic interpolation of the two [h]s in the flanking obstruents. This can be basically attributed to the quality of Japanese high vowels, which are relatively centralised and are frequently involved in various processes. For example, they are often used as an epenthetic vowel because of their less salient profile (e.g. tas ‘to add’ + ta ‘past tense suffix’ → tasita in Yamato Japanese, and ski ‘ski’ → sɯkii in lexical borrowing), and often undergo assimilatory processes (e.g. eiga ‘film’ → eega and ɯma ‘horse’ → M ma). Spontaneous voicing of such phonetically ‘weak’ vowels tends to be overshadowed by its neighbouring [h]s, and the result is generally perceived as devoicing.
This kind of interpolation is not achieved when high vowels are flanked by obstruents with long-lead voicing – that is, when [h] co-exists with [L] in a single expression. This is due to an acoustic effect whereby the characteristics of aperiodic energy are partially suppressed by the co-existence of long-lead glottal pulsing.
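The condition on high vowel devoicing stated in this section can be expressed as a simple check over element specifications. The following Python sketch is illustrative only: it tracks just the elements [h] and [L], treats the romanised letters i and u as the high vowels, and returns the positions that are eligible for devoicing rather than predicting that devoicing must occur. The names ELEMENTS, is_neutral_obstruent and devoicing_sites are hypothetical.

```python
HIGH_VOWELS = {"i", "u"}

# Assumed, simplified element make-up: only [h] (noise) and [L] are tracked.
ELEMENTS = {
    "p": {"h"}, "t": {"h"}, "k": {"h"}, "s": {"h"},
    "b": {"h", "L"}, "d": {"h", "L"}, "g": {"h", "L"}, "z": {"h", "L"},
}


def is_neutral_obstruent(segment):
    """An obstruent carrying [h] that is not combined with [L]."""
    elements = ELEMENTS.get(segment, set())
    return "h" in elements and "L" not in elements


def devoicing_sites(word):
    """Indices of high vowels flanked on both sides by neutral obstruents,
    i.e. positions where the two [h]s can interpolate over the nucleus."""
    return [
        index
        for index in range(1, len(word) - 1)
        if word[index] in HIGH_VOWELS
        and is_neutral_obstruent(word[index - 1])
        and is_neutral_obstruent(word[index + 1])
    ]
```

For the place name akita the function returns the index of i, since k and t both carry [h] without [L]. If either flanking consonant carried [L], or were a sonorant with no [h], no site would be returned, in line with the account given above.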

6.4. Early language acquisition and aphasia

The zero-[L] source contrast straightforwardly captures the early stages of language acquisition in Japanese. In the speech of very young Japanese children (before the age of about 1 year 6 months), the laryngeal properties of plosives are, as Jakobson (1968: 14) claims, neutralized into the region of zero or short voicing lag (Nasukawa 2005b). Then at a later stage the long-lead and short-lag contrasts emerge. Assuming that unmarked melodic representations are acquired before marked ones, then we predict that acquisition is characterized by two separate stages: first, the non-specification of [L] (neutral laryngeal state), then later the specification versus non-specification of [L]. That is to say, the neutral state (baseline) of laryngeal-source specification is preferred in the earlier stages of plosive production. This unmarked status of short-lag plosives in the speech production of very young children is also backed up by acquisition studies involving other languages (see Harris 1998: 17–9, for a detailed discussion and references therein). Reports of aphasic deficit in Japanese can also be accounted for in similar terms. For instance, in Broca’s and global deficit the laryngeal-source contrasts are collapsed, and the production of stops tends to converge on the short-lag region (Itoh, Tatsumi and Sasanuma 1986). This convergence can be regarded as a loss of the categorical representation [L]. Similar patterns are attested across different languages: e.g. the loss of [H] in English (cf. Blumstein, Cooper, Statlender, Goodglass and Gottlieb 1980) and the loss of [L] and [H] in Thai (cf. Gandour and Dardarananda 1982, 1984).

7. Reductionism: merger of [L] and [N]

Within established versions of Element Theory (Harris & Lindsey 1995), the laryngeal-source contrast in Japanese is represented by assuming a nonspecified neutral baseline versus the specification of [L]. While maintaining this basic assumption, an alternative representation of long-lead voicing (henceforth ‘voicing’) is proposed by Nasukawa (1995, 1998, 1999). This move is intended to contribute to a general trend towards reducing the number of elements available to the phonology.3 Reflecting the strong correlation between voicing and nasality in melodic representations, Nasukawa proposes a merger of voicing and nasality under a single element [N] (murmur element), which has been used for representing nasality.4 Then, the notion of melodic headedness contributes to the dual phonetic interpretation of the single element:5 if [N] is headless, it is interpreted as nasality; on the other hand, if [N] is headed, it is interpreted as voicing.6

(5)  Element                  Phonetic manifestation
     [N] (non-headed [N])     nasality
     [N] (headed [N])         (long-lead) voicing

The first formal evidence that voicing and nasality are two instantiations of the same category is provided in Nasukawa (1995, 1998), which present an integrated approach to the paradoxical behaviour of voice and nasality: nasals appear to be specified for voice in postnasal voicing assimilation (e.g. ʃin + ta → ʃinda ‘died’), while they behave as if they have no voice in Lyman’s Law, which allows only a single voiced obstruent in a particular domain (e.g. kanade ‘a play, a dance’, *ganade). By adopting the representations in (5), postnasal voicing is treated as the extension of the [N] across both positions of an NC cluster, where only the element in the second position is promoted to a headed status.7 On the other hand, the transparency of nasal obstruents to Lyman’s Law follows from [N] failing to be headed. Furthermore, the dual interpretation of [N] is supported by some robust correlations between voicing and nasality. A typical instance of such a relation is postnasal voicing assimilation, found not only in Yamato Japanese, but also in many languages such as Quichua and Zoque, where an obstruent preceded by a nasal is obligatorily voiced. Another example of the relation between voice and nasal is found in processes involving alternations between voiced obstruents and their nasal reflexes – such as fully-nasalised and prenasalised voiced cognates. This kind of process is often observed in intervocalic contexts. For example, voiced obstruent prenasalisation is witnessed in Northern Tohoku Japanese, languages of the Reef Island-Santa Cruz family, those in the Pacific area and several Bantu languages; in the intervocalic context, conservative Tokyo Japanese exhibits voiced-velar-obstruent nasalisation.
Furthermore, in the verbal inflexion of Yamato Japanese, the stem-final b in a verbal stem such as tob ‘to fly’ is realised as a nasal that is homorganic with the initial obstruent of a suffix such as -te (gerundive).8 According to Nasukawa (2005a), the assignment of head status to voicing rather than to nasality can be justified as follows. First, the representations in (5) can encode an implicational universal between voicing and nasality. (6)

     Typology of (long-lead) voicing and nasal

     Languages              Voicing    Nasal
     e.g. Quileute          no         no
     Finnish, English       no         yes
     —                      yes        no
     Spanish, Thai          yes        yes


As illustrated in (6), we never encounter a system which displays truly voiced plosives without also having nasals. This observation allows us to express the implication that the existence of voicing implies the existence of nasal. This implicational universal is straightforwardly captured by the representations in (5). Second, the representations in (5) encode the optional status of voicing. Almost all languages exploit contrastive nasality, whereas voicing is parametrically controlled. The optional status of voicing is reflected in the non-integral nature of headedness: some systems permit [N] to be headed, while others disallow this as a structural possibility. Furthermore, the representations in (5) reflect differences in complexity between voicing and nasality. In the analysis of prenasalisation and velar nasalisation by Nasukawa (1999), nasality must be less complex structurally than voicing, since the latter property is often suppressed in intervocalic contexts and instead nasality is interpreted (e.g. in some dialects of Japanese, some Western Indonesian languages and several Bantu languages). According to Harris (1994, 1997), segmental structure is less complex in weak positions than in strong positions – a state of affairs predicted by the proposed representations in (5). Finally, the elimination of [L] from the melodic inventory for phonation-type contrasts complements a recent analysis of tone and intonation. Instead of [L] representing low pitch in nuclear positions, for example, the non-specification of [H] (which is interpreted as high tone) in nuclear positions phonetically manifests itself as low pitch in the analysis of Japanese pitch accentuation (Yuko Yoshida 1995). Also, in the analysis of intonation, as an alternative to the specification of [L], a prosodic boundary unassociated to [H] is interpreted as low pitch (Cabrera-Abreu 2000). According to this revised approach, the specification of source contrasts can be summarized as follows: (7)

     Source elements        Phonetic manifestation
     Non-specified          Short or zero voicing lag (neutral)
     [N] (headed [N])       Long voicing lead (truly voiced)
     [H]                    Long voicing lag (voiceless aspirated)
     [N, H]                 Long voicing lead and lag, murmur (breathy)

The two-way source contrast of Japanese (a type II language) is then represented by the contrast between a non-specified neutral baseline and headed [N].
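The interpretive scheme in (5) and (7) can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: encoding a melodic expression as a set of (element, headed) pairs, and the names `source_manifestation` and `is_nasal`, are hypothetical conveniences and not part of Element Theory or of Nasukawa's formalism.

```python
from typing import FrozenSet, Tuple

# Hypothetical encoding: an expression is a set of (element name, headed?) pairs.
Element = Tuple[str, bool]
Expression = FrozenSet[Element]


def source_manifestation(expr: Expression) -> str:
    """Return the laryngeal-source interpretation listed in (7)."""
    headed_n = ("N", True) in expr                    # headed [N] = voicing
    has_h = any(name == "H" for name, _ in expr)      # [H] = voicing lag
    if headed_n and has_h:
        return "long voicing lead and lag, murmur (breathy)"
    if headed_n:
        return "long voicing lead (truly voiced)"
    if has_h:
        return "long voicing lag (voiceless aspirated)"
    return "short or zero voicing lag (neutral)"


def is_nasal(expr: Expression) -> bool:
    """Non-headed [N] is interpreted as nasality, as in (5)."""
    return ("N", False) in expr


# The two-way Japanese contrast: bare baseline versus headed [N].
japanese_neutral: Expression = frozenset()
japanese_voiced: Expression = frozenset({("N", True)})
nasal: Expression = frozenset({("N", False)})
```

On this encoding a nasal carries [N] but no headed [N], so it has no voicing specification of the kind that Lyman's Law counts, which is in line with the transparency of nasals described above.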

8. Summary

The main purpose of this paper has been to present an analysis of laryngeal-source contrasts in Japanese within the scope of Element Theory. Exhibiting cross-linguistic variation in VOT values, the contrasts are derived by the non-specification of any source elements versus the specification of the low source element, which correspond phonetically to the laryngeal properties of neutral and long voicing lead, respectively. This is supported by a number of phonologically-active phenomena involving true voicing, the preference for the neutral laryngeal state in early language acquisition and the convergence of VOT values on the neutral region in cases of aphasia in Japanese. Finally, following a recent proposal to merge the low source and nasal elements, this paper has shown how long voicing lead is represented by a headed nasal element ([N]) in a given expression. To support this position, evidence has come from the correlations observed between voice and nasal, as well as from implicational universals. To conclude, the Japanese laryngeal-source contrast is represented by the specification of two laryngeal states: the ‘bare’ source baseline versus the same baseline with a headed nasal element superimposed on to it.

Notes

1. The first Element Theory analysis of source contrasts in Japanese was given in Shohei Yoshida (1991, 1996), where both elements [L] and [H] are employed. Without taking into consideration any of the cross-linguistic differences regarding source contrasts, he utilises [L] for long ‘voiced’ obstruents and [H] for ‘voiceless’ cognates.
2. An exception arises when a voiced obstruent is already specified in a given lexical form. In such cases Lyman’s Law requires the original ‘voiceless’ consonant to remain unchanged.
3. There have been several proposals to extend the element-reducing programme in various ways, for instance by merging aspiration with noise and coronality with openness (van der Hulst 1995; Marten 1996; Charette & Göksel 1998; Kula & Marten 1998). The conceptual advantages of this approach are clear. However, the empirical consequences have yet to be fully worked out.
4. Instead of [N], Ploch (1999) and others use [L] and eliminate [N] from the element inventory. However, I have opted for [N] rather than [L] to represent the correlation between voicing and nasality, since the ‘bare’ element without headship status contributes nasality.
5. In this treatment, the headedness of a given element is regarded as an intrinsic property which enhances the acoustic image of the element (Harris 1994; Harris & Lindsey 1995; Backley 1998; Nasukawa 1998, 1999).
6. Within a geometry-based version of Element Theory (Backley 1998; Backley & Takahashi 1998), Nasukawa (2005a) proposes that the contrast between voicing and nasality is represented using the same idea: if the element [N] licenses its [comp], then it is interpreted as (long-lead) voicing, while the same element without a licensed [comp] is interpreted as nasality.
7. See Nasukawa (2005a: §4.5) for a detailed discussion.
8. See also Ploch (1999) for further arguments to support the merger of long-lead voicing and nasal elements.

Rendaku in inflected words

Timothy J. Vance

1. Rendaku

The Japanese term rendaku, which Martin (1952: 48) translates as ‘sequential voicing’, refers to a morphophonemic phenomenon found in compounds and in prefix+base combinations. A morpheme that shows rendaku has one allomorph beginning with a voiceless obstruent and another allomorph beginning with a voiced obstruent. The rendaku allomorph (i.e., the allomorph beginning with a voiced obstruent) of such a morpheme appears only when it is a non-initial morph in a word. The examples in (1) illustrate the pairs of phonemes that can alternate. (1)

     ALTERNATING PHONEMES   VOICELESS ALTERNANT      VOICED ALTERNANT
     /f/~/b/                /fune/ ‘boat’            /kawa+bune/ ‘river boat’
     /h/~/b/                /hako/ ‘case’            /haši+bako/ ‘chopstick case’
     /t/~/d/                /tama/ ‘ball’            /me+dama/ ‘eyeball’
     /k/~/g/                /kami/ ‘paper’           /kabe+gami/ ‘wallpaper’
     /c/~/z/                /cuka/ ‘mound’           /ari+zuka/ ‘anthill’
     /s/~/z/                /sora/ ‘sky’             /hoši+zora/ ‘starry sky’
     /č/~/ǰ/                /či/ ‘blood’             /hana+ǰi/ ‘nosebleed’
     /š/~/ǰ/                /širuši/ ‘symbol’        /ya+ǰiruši/ ‘arrow symbol’

Because of well-known historical changes, some of the alternations in modern Japanese involve more than just a difference in voicing. Notice that /b/ alternates not with /p/ but with /f/ ([ɸ]), as in /fune/~/bune/, and with /h/ ([h] or [ç]), as in /hako/~/bako/.1 Notice also that /z/ ([dz] or [z]) alternates both with /c/ ([ts]), as in /cuka/~/zuka/, and with /s/, as in /sora/~/zora/, and that /ǰ/ ([dʑ]) alternates both with /č/ ([tɕ]), as in /či/~/ǰi/, and with /š/ ([ɕ]), as in /širuši/~/ǰiruši/.2
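The alternations in (1) amount to a mapping from each voiceless initial phoneme to its voiced partner, and a minimal sketch may help make that concrete. The Python below is an illustration only: the function names are hypothetical, morphs are written as plain phonemic strings, and the palatal phonemes are assumed to be written č, š and ǰ. It models the alternation itself, not the conditions under which rendaku actually applies.

```python
# Initial-phoneme alternations taken from (1); č, š, ǰ are an assumed
# transcription of the palatal phonemes.
RENDAKU = {
    "f": "b", "h": "b",
    "t": "d",
    "k": "g",
    "c": "z", "s": "z",
    "č": "ǰ", "š": "ǰ",
}


def rendaku_allomorph(morph: str) -> str:
    """Return the voiced-initial allomorph, or the morph unchanged if its
    first phoneme is not a voiceless obstruent listed in (1)."""
    if morph and morph[0] in RENDAKU:
        return RENDAKU[morph[0]] + morph[1:]
    return morph


def compound(first: str, second: str, voiced: bool = True) -> str:
    """Join two morphs with '+', optionally voicing the second one."""
    tail = rendaku_allomorph(second) if voiced else second
    return f"{first}+{tail}"


print(compound("kawa", "fune"))   # kawa+bune
print(compound("ari", "cuka"))    # ari+zuka
```

Because two voiceless phonemes can share one voiced partner (/f/ and /h/ with /b/, /c/ and /s/ with /z/), the mapping cannot be inverted without lexical knowledge, which is one way of seeing that the alternation is more than a difference in voicing.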


2. Historical development

The oldest substantial texts in Japanese date from the 8th century, and the language they represent presumably reflects a variety spoken by the aristocracy in the contemporary capital of Nara. There is general agreement that word-medial voiced obstruents were prenasalized in Old Japanese: [ᵑg ⁿdz ⁿd ᵐb]3 (Vance 1983: 335–337). As Unger (1977: 8–9) first pointed out, if we make the plausible assumption that such prenasalization was present in prehistoric Japanese as well, a satisfying explanation for the origin of sequential voicing is available. Hamada (1952: 23) cites the examples in (2) to illustrate the historical process of interest in some items that developed after the 8th century.4 (2)

     /sumi+sur-i/ → /suzuri/      ‘ink’+‘scraper’ → ‘inkstone’5
     /fumi+te/ → /fude/           ‘letter’+‘hand’ → ‘writing brush’
     /ika ni ka/ → /ikaga/        INTERROG+ADV+‘?’ → ‘how’

In each case, it looks as if a sequence of the form N (nasal consonant) + V (vowel) + O[-vce] (voiceless obstruent) was replaced by O[+vce], i.e., the voiced counterpart of the original obstruent. This replacement would have been a natural consequence of vowel syncope, given that word-medial voiced obstruents were prenasalized at the time. Vowel syncope alone would have yielded a phonotactically anomalous nasal+obstruent cluster in each case.6 It is a plausible inference that this process was involved in the origin of rendaku. As the example in (3) shows, this account requires us to posit an earlier syllable of the form NV between the two elements of a compound that showed rendaku in Old Japanese.7 (3)

     POJ/yama/ + POJ NV + POJ/ta/ → OJ/yama+da/
     ‘mountain’ + ?? + ‘paddy’ → ‘mountain paddy’

The obvious candidate for the mystery syllable is the genitive particle POJ/nö/, the ancestor of OJ/nö/ and modern /no/. Attested Old Japanese vocabulary items like those in (4) suggest why rendaku was irregular (as it continues to be in modern Japanese).8 (4)

     a. OJ/akî+nö+pa/   ‘autumn leaf’
     b. OJ/taka+pa/     ‘bamboo leaf’
     c. OJ/sasa+ba/     ‘bamboo-grass leaf’   (< POJ/sasa nö pa/)


As expected, lexicalized phrases that retained genitive OJ/nö/ (as in 4a), did not show rendaku.9 Noun+noun compounds could have originated either by simple juxtaposition, in which case rendaku did not occur (as in 4b), or by contraction of a phrase, in which case rendaku did occur (as in 4c).10
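The contraction hypothesised in this section can be sketched as a single string rewrite: a sequence of nasal + vowel + voiceless obstruent is replaced by the voiced counterpart of that obstruent. The Python below is a rough illustration under loud assumptions: the function name `contract`, the flat romanized input strings and the small segment inventory are all hypothetical, and the output symbol (e.g. `d`) stands in for what would have been a prenasalized voiced obstruent at the time.

```python
import re

# Voiced counterparts of the obstruents that occur in examples (2) to (4).
VOICED = {"p": "b", "t": "d", "k": "g", "s": "z"}

# Nasal consonant + vowel + voiceless obstruent (inventory is an assumption).
NVO = re.compile(r"[mn][aiueoöî]([ptks])")


def contract(form: str) -> str:
    """Replace the first N+V+O[-vce] sequence by the voiced obstruent.

    Forms with no such sequence are returned unchanged, which corresponds
    to compounds formed by simple juxtaposition.
    """
    return NVO.sub(lambda m: VOICED[m.group(1)], form, count=1)


for source in ("sumisuri", "fumite", "ikanika", "yamanöta", "sasanöpa", "takapa"):
    print(source, "->", contract(source))
```

The sketch treats syncope and voicing as one step for brevity. It also says nothing about when contraction took place: a lexicalized phrase such as the one in (4a) kept its genitive particle and would simply never be passed through the rewrite.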

3. Inflected words

Verb+verb compound verbs are abundant in Japanese, but they rarely show rendaku. An example is /kak-i+tor-u/ ‘write down’, which contains the roots of /kak-u/ ‘write’ and /tor-u/ ‘take’. The first component verb in such a compound is invariable; it must appear in its ‘continuative’ form.11 The second component verb bears whatever inflectional ending is required for the compound as a whole; the citation form of a verb is the nonpast indicative. The account in §2 of the origin of rendaku provides a natural explanation for the rarity of rendaku in compounds of this type (Vance 1983). There is no reason to suppose that the components of a verb+verb compound verb were ever connected by a genitive particle or any other NV syllable in earlier stages of Japanese.

As noted in the previous paragraph, the first element of a verb+verb compound verb appears in its continuative form. The continuative of any verb is an inflectional form, and as a word on its own it functions to connect its clause to a following clause. The example in (5) illustrates with /hanaš-i/ (romanized hanashi), the continuative of /hanas-u/ ‘speak’. (5)

     Tomodachi to hanashi, sore kara nemashita.
     friend with speak-CONT that from sleep-POLITE-PAST
     ‘(I) spoke with (my) friend, and after that (I) went to bed.’

According to the traditional Japanese analysis of verb morphology, almost all verbs fall into one of two regular conjugation classes.12 Assuming the widely adopted morphological segmentation of verb forms proposed by Bloch (1946), every verb in the first of these two classes has at least one stem allomorph that ends in a consonant.13 The verb meaning ‘speak’ in the first clause in (5) is an example of such a consonant-stem verb: the stem allomorph in the citation form /hanas-u/ ends in /s/, and the stem allomorph in the continuative /hanaš-i/ ends in /š/.14 The continuative of every consonant-stem verb has the inflectional ending /i/. Every verb in the other regular conjugation class has an invariant stem ending in a vowel (either /i/ or /e/). An example of such a vowel-stem verb is /tabe-ru/ ‘eat’, with the nonpast indicative marked by /ru/ rather than by the /u/ of consonant-stem verbs. The continuative of this verb is /tabe/, with no inflectional ending, since the continuative of every vowel-stem verb is identical to its stem.15

Many verbs have a corresponding deverbal noun that is segmentally identical to the continuative, although it may be accented on a different syllable (Martin 1952: 34). The examples in (6) illustrate, pressing English gerunds into service as translations of the continuative forms. (6)

/yasum-u/ ‘rest’16 /yasum-i/ ‘resting’ /yasum-i/ ‘vacation, break’

/kikoe-ru/ ‘be audible’ /kikoe/ ‘being audible’ /kikoe/ ‘sound’

Okumura (1955) claims that rendaku does not occur in compounds of inflected word plus inflected word.17 In fact, we do find examples of rendaku in such compounds, but as mentioned above, rendaku is rare in verb+verb compound verbs. Okumura’s illustrative examples actually suggest a more interesting generalization. Two of those examples appear in (7). (7)

     a. /wakač-i+kak-u/ ‘write with spaces between words’ (a verb)
     b. /wakač-i+gak-i/ ‘writing with spaces between words’ (a noun)

Both examples in (7) derive from the verbs /wakac-u/ ‘divide’ and /kak-u/ ‘write’, and the former, like all non-final verbal elements in compounds, appears in its continuative form /wakač-i/. The verb /wakač-i+kak-u/ (7a: V1+V2=V) is given in its citation form, with the second element bearing the nonpast affirmative ending /u/. The noun /wakač-i+gak-i/ (7b: V1+V2=N), on the other hand, does not inflect; the second element is fixed in form. Okumura’s precise claim thus appears to be that rendaku will not occur in a compound which consists of two inflected words and is itself an inflected word. At the same time, the second example suggests that we should expect rendaku in a compound that consists of two verb stems but is itself a noun.

The examples just considered involve verbs. The other major class of inflected words in Japanese is adjectives.18 Just as in the case of a verb, the citation form of an adjective is the nonpast indicative. The adjectival nonpast indicative suffix has the invariant form /i/. The continuative form of an adjective is always marked by the suffix /ku/ and is never identical to the stem. When a compound contains an adjective as its initial element, the adjective always appears as a bare stem. The examples in (8) illustrate.

(8)  /omo-i/           ‘heavy’
     /omo-ku/          ‘being heavy’
     /omo+kuruši-i/    ‘oppressive’ (cf. /kuruši-i/ ‘strained’)

     /haya-i/          ‘early’
     /haya-ku/         ‘being early’
     /haya+oki/        ‘early rising’ (cf. /oki-ru/ ‘get up’)

Some adjective stems can be used as nouns (Martin 1975: 399), as the examples in (9) show. (9)

     /maru-i/ ‘round’        /ča+iro-i/ ‘brown’ (literally ‘tea-colored’)
     /maru/ ‘circle’         /ča+iro/ ‘brown’ (literally ‘tea-color’)

4. Verb+verb compounds

A set of verb+verb compounds was collected to assess the notion that rendaku does not occur in a compound that consists of two inflected words and is itself an inflected word. The first step in the collection procedure was to make a list of all the non-compound verbs beginning with a voiceless obstruent that appear in Kazama (1979), a reverse dictionary that has a separate section for each part of speech. There is no point in considering verbs that do not begin with a voiceless obstruent, since rendaku cannot affect a vowel (as in /oboe-ru/ ‘remember’), a sonorant (as in /nom-u/ ‘drink’), or an obstruent that is already voiced (as in /de-ru/ ‘leave’). In order to limit the investigation to words in common use in modern Japanese, each verb on the list was checked in a medium-size Japanese-English dictionary (Hasegawa et al. 1986). Every verb on the list that does not appear as a headword in this dictionary was eliminated from further consideration. Also eliminated was every verb that contains a medial voiced obstruent (e.g., /sage-ru/ ‘lower’). A compound containing such a second element is subject to a well-known constraint on rendaku called Lyman’s Law (Vance 1987: 136–139): rendaku almost never affects an initial obstruent of an element that already contains a voiced obstruent. Consequently, it would not be appropriate to cite a verb+verb compound verb such as /hik-i+sage-ru/ ‘pull down’ as support for the claim that compounds of this form resist rendaku.

The next step in the data collection process was to find compounds of the form V1+V2=V or V1+V2=N in which V2 is one of the verbs remaining on the list described in the preceding paragraph. The original intent was to select every compound of the appropriate form that appears in either or both of two reverse dictionaries (Y. Kitahara 1990 and Iwanami Shoten Henshûbu 1992). However, this step in the process turned out to be tremendously time-consuming, and it was completed only for items in which V2 is a consonant-stem verb.19 Each compound was then checked in an unabridged dictionary produced by a major Japanese publisher (Matsumura 1988). To avoid missing any relevant items, even if, for a particular V1 and a particular V2, only one of V1+V2=V and V1+V2=N was found in the reverse dictionaries, both were checked in the unabridged dictionary. Those compounds that appear in the unabridged dictionary were retained for further consideration.20

The next step in the process was to eliminate compounds that probably should not be analyzed as containing two verbs in modern standard Japanese. For some of these excluded items, the reason is simply that the entry in the unabridged dictionary identifies them as obsolete or dialectal. In other cases, the unabridged dictionary does not list V1 as an independent verb in the modern language. In still other cases, a compound contains an etymological verb form that seems to have lost its verb-form status. Examples of this last type include /tatami+kae/ ‘replacing mats’, in which the etymological continuative /tatam-i/ seems to be functioning as a simple noun meaning ‘mat’ with no connection to the verb /tatam-u/ ‘fold’. Another example is /kumori+gači/ ‘tending toward overcast’, in which /gači/ is etymologically the rendaku allomorph of the continuative of /kac-u/ ‘win’. In modern Japanese, /gači/ is simply a derivational suffix that derives nouns from verbs and no longer has any connection to the verb meaning ‘win’.21 Next, coordinate compounds such as /yom-i+kak-i/ ‘reading and writing’ were eliminated, since it is well known that rendaku generally does not occur in coordinate compounds (Vance 1987: 144–145). It would not be appropriate to cite /yom-i+kak-i/ as evidence that verb+verb compound nouns do not show rendaku.

The last step in the process was to search the remaining examples for pairs in which a verb+verb compound verb (V1+V2=V) and a verb+verb compound noun (V1+V2=N) both involve the same two verbs in the same order. The final data set consists of every such pair (a total of 234 pairs) and the pronunciation(s) given in the unabridged dictionary for each paired word. The data set includes some pairs that exemplify the pattern described in §3, including (10). Notice that the verb does not show rendaku while the noun does.

(10) V1+V2=V [–rendaku]: /toor-i+kakar-u/ ‘pass by’
     V1+V2=N [+rendaku]: /toor-i+gakar-i/ ‘passing by’

But there are also pairs in which both the verb and the noun show rendaku and other pairs in which neither shows rendaku. The examples in (11) illustrate.

(11) a. V1+V2=V [+rendaku]: /kaer-i+zak-u/ ‘bloom again’
        V1+V2=N [+rendaku]: /kaer-i+zak-i/ ‘second blooming’
     b. V1+V2=V [–rendaku]: /mi+toos-u/ ‘foresee’
        V1+V2=N [–rendaku]: /mi+tooš-i/ ‘prospect’

In some pairs, the pronunciation of one or both members has the mora obstruent /Q/ following the continuative form of V1, as in (12a), or in place of the last syllable of the continuative form of V1, as in (12b).22

(12) a. V1+V2=V [–rendaku]: /hane+kaer-u/ ‘rebound’
        V1+V2=N [–rendaku]~[mora obstruent]: /hane+kaer-i/~/hane-Q+kaer-i/ ‘rebound’
     b. V1+V2=V [mora obstruent]: /yoQ+para-u/ ‘get drunk’
        V1+V2=N [mora obstruent]: /yoQ+para-i/ ‘drunken person’

The continuative form of /yo-u/ ‘get drunk’ (V1 in 12b) is /yo-i/. Since the mora obstruent pre-empts rendaku (Vance 1987: 148), if the unabridged dictionary gives a pronunciation with /Q/ as the only pronunciation of either member of a pair, that pair was excluded from the statistics reported below. In other pairs, the pronunciation of one or both members has the mora nasal /N/ in place of the last syllable of the continuative form of V1, as in (13).

(13) V1+V2=V [mora nasal]: /fuN+gir-u/ ‘take decisive action’
     V1+V2=N [mora nasal]: /fuN+gir-i/ ‘taking decisive action’

The continuative form of /fum-u/ ‘step’ (V1 in these examples) is /fum-i/. Since the mora nasal seems to induce rendaku in compounds of this kind, if the unabridged dictionary gives a pronunciation with /N/ as the only pronunciation of either member of a pair, that pair was excluded from the statistics reported below. The unabridged dictionary gives alternative pronunciations for several of the words in the data set. The three examples in (14) illustrate.

(14) a. V1+V2=V [–rendaku]~[+rendaku]: /ši+kum-u/~/ši+gum-u/ ‘set up’
     b. V1+V2=N [–rendaku]~[+rendaku]: /ne+kom-i/~/ne+gom-i/ ‘(time of) sound sleep’
     c. V1+V2=V [–rendaku]~[mora obstruent]: /saš-i+hik-u/~/saQ+pik-u/ ‘deduct’

In the statistics reported below, if a verb+verb compound verb has one pronunciation with rendaku and another pronunciation without rendaku, the pronunciation without rendaku was counted. For example, (14a) was counted simply as not showing rendaku. On the other hand, if a verb+verb compound noun has one pronunciation with rendaku and another pronunciation without rendaku, the pronunciation with rendaku was counted. Consequently, (14b) was counted simply as showing rendaku. Treating verbs and nouns differently in this way biases the statistics in favor of the putative pattern described in §3, i.e., that rendaku does not occur in verb+verb compound verbs but does occur in verb+verb compound nouns. Since the statistics reported below will be used to deny the existence of this pattern, the deliberate bias in counting will strengthen the argument. As for items having one pronunciation with /Q/ or /N/ and another pronunciation without, the pronunciation without a mora consonant was counted. For example, (14c) was counted simply as not showing rendaku.

A few pairs in the data set were particularly problematic, including the two examples shown in (15).

(15) a. V1+V2=V [–rendaku]: /mi+tor-u/ ‘comprehend by looking at’
        V1+V2=N [–rendaku]: /mi+tor-i/ ‘comprehending by looking at’
        V1+V2=N [+rendaku]: /mi+dor-i/ ‘looking over and selecting’
     b. V1+V2=V [+rendaku]~[–rendaku]~[extra mora obstruent]: /de+bar-u/~/de+har-u/~/deQ+par-u/ ‘protrude’
        V1+V2=N [+rendaku]~[–rendaku]~[extra mora obstruent]: /de+bar-i/~/de+har-i/~/deQ+par-i/ ‘protruding; protrusion’

Corresponding to the verb in (15a), the unabridged dictionary has two separate noun entries, one with rendaku and another without, and the definitions for these two entries are different. Although the noun without rendaku matches the verb semantically, in keeping with the deliberate bias explained just above, (15a) was counted as a verb not showing rendaku and a noun showing rendaku. The unabridged dictionary gives three pronunciations each for the verb and noun in (15b). Maintaining the bias again, (15b) was counted as a verb not showing rendaku and a noun showing rendaku. The data set contains a total of 234 verb/noun pairs, and the table in (16) shows how these pairs pattern in terms of rendaku.

(16)                           V+V=V
                        +rendaku   –rendaku
     V+V=N  +rendaku       10         22
            –rendaku        0        202
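The counting conventions described above can be restated as a short procedure. The Python below is a sketch under stated assumptions, not the author's actual tabulation: the labels `PLUS`, `MINUS` and `MORA`, the function names, and the representation of each word as a list of the pronunciation types given by the dictionary are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

PLUS, MINUS, MORA = "+rendaku", "-rendaku", "mora"  # MORA = variant with /Q/ or /N/


def counted_value(word_class: str, variants: Iterable[str]) -> Optional[str]:
    """Return the single value counted for a word, or None if it is excluded."""
    plain = {v for v in variants if v in (PLUS, MINUS)}
    if not plain:
        return None                      # only /Q/ or /N/ pronunciations listed
    if len(plain) == 1:
        return next(iter(plain))         # mora-consonant variants are ignored
    # Both values listed: the deliberate bias counts verbs as -rendaku
    # and nouns as +rendaku.
    return MINUS if word_class == "verb" else PLUS


def tabulate(pairs: Iterable[Tuple[list, list]]) -> Counter:
    """Count (noun value, verb value) cells over (verb variants, noun variants) pairs."""
    table: Counter = Counter()
    for verb_variants, noun_variants in pairs:
        verb = counted_value("verb", verb_variants)
        noun = counted_value("noun", noun_variants)
        if verb is None or noun is None:
            continue                     # the whole pair is excluded
        table[(noun, verb)] += 1
    return table


examples = [
    ([MINUS], [PLUS]),                              # like (10)
    ([PLUS], [PLUS]),                               # like (11a)
    ([MINUS], [MINUS, MORA]),                       # like (12a)
    ([MORA], [MORA]),                               # like (12b): excluded
    ([PLUS, MINUS, MORA], [PLUS, MINUS, MORA]),     # like (15b)
]
print(tabulate(examples))
```

Since every tie is resolved in favor of the verb lacking rendaku and the noun showing it, any pair with variable pronunciations is pushed toward the upper right cell of a table like (16), which is the direction of the bias the text describes.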

As the lower right cell in (16) shows, in the great majority of the pairs (202/234 = 86%), neither the verb nor the noun shows rendaku. In other words, pairs like /mi+toos-u/ ‘foresee’ and /mi+tooš-i/ ‘prospect’ (11b) are the norm. By comparison, despite the deliberately biased counting described above, only a small fraction of the pairs in the data set (22/234 = 9%) exhibit the behavior that Okumura (1955) suggests is typical. This means that pairs like /toor-i+kakar-u/ ‘pass by’ and /toor-i+gakar-i/ ‘passing by’ (10) are actually quite unusual.

Needless to say, a data set consisting of entries in an unabridged dictionary certainly will not match the relevant portion of a representative native speaker’s actual vocabulary. To get some idea of how serious this shortcoming might be, a well-educated native speaker went through the 234 verb/noun pairs in the data set, discarded those that were unfamiliar to her, and noted pronunciations (with or without rendaku) that differed from her own.23 Applying the same counting bias as above to this revised data set, the pairs in this speaker’s vocabulary pattern as in (17).

(17)                           V+V=V
                        +rendaku   –rendaku
     V+V=N  +rendaku        7         13
            –rendaku        0        188

The total number of pairs in this revised data set is 208, and their distribution in the four cells of the table in (17) differs very little from the distribution of the pairs in (16). Here again, in most of the pairs (188/208 = 90%), neither the verb nor the noun shows rendaku, and only a small fraction of the pairs (13/208 = 6%) show rendaku in the noun but not in the verb. In short, the revised data set suggests that simply relying on the dictionary entries does not lead us astray. Consequently, no attempt was made to go beyond dictionary entries for the counts reported below in §5.
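The proportions quoted for (16) and (17) can be checked directly from the cell counts. The short Python sketch below does only that arithmetic; the dictionary layout and the helper name `share` are illustrative choices, and the counts are copied from the two tables.

```python
# Cell counts from (16) and (17); keys are (noun value, verb value).
TABLE_16 = {("+", "+"): 10, ("+", "-"): 22, ("-", "+"): 0, ("-", "-"): 202}
TABLE_17 = {("+", "+"): 7, ("+", "-"): 13, ("-", "+"): 0, ("-", "-"): 188}


def share(table: dict, cell: tuple) -> int:
    """Percentage of all pairs that fall in one cell, rounded to a whole number."""
    return round(100 * table[cell] / sum(table.values()))


for label, table in (("(16)", TABLE_16), ("(17)", TABLE_17)):
    total = sum(table.values())
    neither = share(table, ("-", "-"))      # neither verb nor noun shows rendaku
    noun_only = share(table, ("+", "-"))    # noun shows rendaku, verb does not
    print(label, total, neither, noun_only)
# (16) 234 86 9
# (17) 208 90 6
```

In both tables the cell for a verb with rendaku and a noun without it is empty, so no pair in either data set runs directly against the direction of the putative pattern, even though most pairs fail to exemplify it.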

5. Compounds involving adjectives

As Kikuda (1971) notes, adjectival elements in compounds actually pattern very differently from verbal elements. Compounds of four additional types will be considered in this section: adjective+adjective compound adjectives (A+A=A), verb+adjective compound adjectives (V+A=A), adjective+verb compound verbs (A+V=V), and adjective+verb compound nouns (A+V=N). Although some adjective stems can be used as nouns, as noted in §3, compound nouns ending with an adjectival element (A+A=N or V+A=N) are very rare and will not be considered further.

A set of relevant compounds involving adjectival elements was collected by following a procedure parallel to the one described in §4 for verb+verb compounds. In this case, of course, the first step was to make a list of all the non-compound adjectives (rather than verbs) beginning with a voiceless obstruent that appear as headwords in the medium-size Japanese-English dictionary (Hasegawa et al. 1986). The result of the collection procedure was a data set consisting of compounds that appear in the unabridged dictionary (Matsumura 1988). The verbal elements in items of the form A+V=V or A+V=N were restricted to consonant-stem verbs, since these items were collected in tandem with verb+verb compounds. As explained above in §4, this part of the process was so time-consuming that it was completed only for examples ending in a consonant-stem verbal element. The number of adjective+adjective compound adjectives in the data set is small, but almost half (8/18 = 44%) show rendaku, as in (18).

(18) A+A=A [+rendaku]: /usu+gura-i/ ‘dim’
     Cf. /usu-i/ ‘thin’, /kura-i/ ‘dark’

Rendaku also appears in nearly all verb+adjective compound adjectives (17/20 =85%) and in all adjective+verb compound verbs (7/7 =100%), as in (19).

(19) V+A=A [+rendaku]: /utaga-i+buka-i/ ‘suspicious’
     Cf. /utaga-u/ ‘doubt’, /fuka-i/ ‘deep’
     A+V=V [+rendaku]: /naga+bik-u/ ‘be prolonged’
     Cf. /naga-i/ ‘long’, /hik-u/ ‘pull’

Most adjective+verb compound nouns in the data set also show rendaku (43/47 = 91%), as in (20). Just like the second element in a verb+verb compound noun, the verbal element in an adjective+verb compound noun appears in its continuative form.

(20) A+V=N [+rendaku]: /waka+gaer-i/ ‘rejuvenation’
     Cf. /waka-i/ ‘young’, /kaer-u/ ‘return’

The table in (21) summarizes the data collected for the six categories of two-element compounds in which both elements are verbal or adjectival.

(21)              V+V=V   V+V=N   A+V=V   A+V=N   A+A=A   V+A=A
     +rendaku        16     211       7      43       8      17
     –rendaku       716     258       0       4      10       3
     rendaku %       2%     45%    100%     91%     44%     85%

Unlike the table in (16), the table in (21) includes unpaired items in the two V+V categories. Of the 732 (16+716) V+V=V items tabulated in (21), only 234 are those tabulated in (16). The remaining 498 are V+V=V compounds for which no corresponding V+V=N compound is listed as a headword in the unabridged dictionary. For example, the verb /okur-i+kaes-u/ ‘send back’ (cf. /okur-u/ ‘send’ and /kaes-u/ ‘return’) is listed, but there is no entry for a corresponding noun (which would be either /okur-i+kae-i/ or /okur-i+gae-i/). Similarly, of the 469 (211+258) V+V=N items tabulated in (21), 235 are V+V=N compounds for which no corresponding V+V=V compound is listed as a headword in the unabridged dictionary. For example, the noun /oboe+gak-i/ ‘memo’ (cf. /oboe-ru/ ‘remember’ and /kak-u/ ‘write’) is listed, but there is no entry for a corresponding verb (which would be either /oboe+kak-u/ or /oboe+gak-u/). Including such unpaired items makes it clear that V+V=N compounds are much more likely to show rendaku than V+V=V compounds. Nonetheless, this difference is just a strong statistical tendency, not an inviolable principle. Furthermore, the fact that compounds containing adjectival elements are so

Timothy J. Vance

likely to show rendaku means that only verbal elements exhibit this tendency. It is not a generalization that applies to all inflected-word elements.

6. Conclusion

As shown in §4, only a small minority of paired V+V=N and V+V=V items show rendaku in the noun but not in the verb (as in /toor-i+gakar-i/ ‘passing by’ and /toor-i+kakar-u/ ‘pass by’). On the other hand, when unpaired items are taken into consideration (as in §5), it is clear that V+V=N compounds are much more likely to show rendaku than V+V=V compounds. Nonetheless, this difference is just a strong statistical tendency, not an inviolable principle. Furthermore, the fact that compounds containing adjectival elements are so likely to show rendaku (as demonstrated in §5) means that only verbal elements exhibit this tendency. It is not a generalization that applies to all inflected-word elements. Incidentally, if rendaku originated as described in §2 above, the behavior of adjectival elements is a mystery, since there is no reason to suppose that the two elements in a compound containing an adjectival element were linked by a syllable of the form NV at some time in the past.

Notes

1. Many linguists prefer to analyze [h], [ç], and [ɸ] as allophones of a single phoneme except in borrowings. I am assuming a uniform phonemic inventory for all vocabulary strata and a split that has resulted in a contrast between [ɸ] and [h]/[ç] (with [h] appearing before /e/, /a/, or /o/ and [ç] appearing before /i/ or /y/). Either way, the rendaku alternation is not simply a matter of voicing. The ancestor of modern /h/ and /f/ was pronounced [p], although there is some controversy about how long the [p] pronunciation persisted in the central dialects (Kiyose 1985). See also Ohno (this volume, §4.2).

2. In Vance (1987: 24), I said that the two allophones of /z/, [dz] and [z], are distributed as follows: [dz] word-initially or immediately following the mora nasal /N/ and [z] elsewhere. The actual distribution is certainly not this clean, but there is no contrast, and the two are unquestionably allophones of a single phoneme. The modern rendaku pairing of /z/ with /c/ and /s/ reflects the historical merger of a voiced affricate and a voiced fricative, and so does the pairing of /ǰ/ with /č/ and /š/.

Rendaku in inflected words

3. Old Japanese also had a corresponding series of phonemes realized as voiceless obstruents, two phonemes realized as nasals, and two phonemes realized as semivowels. The entire Old Japanese consonant inventory is typically transcribed phonemically as /p t s k b d z g m n y w/. For a recent attempt at phonetic reconstruction of the entire Old Japanese phonological system, see Miyake (2003).

4. The etymologies in (2) are reasonably secure and are given in Nihon Daijiten Kankôkai 1972–76, although Miller (1967: 213–14) is dubious about this etymology for /fude/. The earliest attestations for the shortened forms range from ca. 900 for /ikaga/ to ca. 1000 for /fude/. Although the earliest attestation for /sumi+suri/ is from the tenth century, the other two long forms are attested from the eighth century.

5. The hyphen in /sur-i/ separates what are commonly analyzed as a verb stem and an inflectional ending. See the discussion below in §3 for details.

6. The mora nasal /N/, which occurs syllable-finally in modern Japanese, is a later development (Hamada 1955; Vance 1987: 56–57).

7. All Old Japanese examples are marked with a superscript OJ. The transcription conventions follow Miller’s (1986: 198) slightly modified version of the system first adopted by Mathias (1973) and endorsed by Martin (1987: 50). The transcription reflects the fact that many modern standard syllables with one of the vowels /i e o/ correspond to two distinct eighth-century syllables. (For details, see Lange 1973 and Shibatani 1990: 125–139.) For each such eighth-century pair, it is standard practice to label one syllable type A (kô-rui) and the other type B (otsu-rui), following Hashimoto (1917: 173–186). Some researchers construe the phonological differences between the type-A and type-B syllables as vowel-quality distinctions; others construe them as distinctions between syllables with and without a glide: CV vs. CGV.
In any case, the transcription adopted here represents type-A syllables with a circumflex over the vowel /î ê ô/, type-B syllables with a diaeresis over the vowel /ï ë ö/, and syllables for which there was no A/B distinction with no diacritic /i e o/. A capitalized vowel /I E O/ indicates a syllable for which there was an A/B distinction but for which the category is unknown. The source for all Old Japanese forms is the Jôdaigo Jiten Henshû Iinkai 1967, the definitive dictionary of Old Japanese. Hypothetical pre-Old Japanese forms are marked with a superscript POJ.

8. On the fundamental irregularity of rendaku, see Vance 1987: 146–148, Ohno 2000, and several of the papers in this volume. It is curious, to say the least, that these irregularities have not been leveled out over the course of the last millennium, but they have not. This is not to say that the situation has been static. Many individual vocabulary items that used to have rendaku no longer do and vice versa. But these changes do not seem to have any discernible direction. To give just one set of examples, Hepburn’s 1867 dictionary lists the verb /ki+kae-ru/ ‘change clothes’ and the corresponding noun /ki+gae/ ‘changing clothes’, and it also lists the verb /nor-i+kae-ru/ ‘change horses’ and the corresponding noun /nor-i+gae/ ‘changing horses’. The descendants of these items for most modern Tokyo speakers are /ki+gae-ru/ (a gain for rendaku), /ki+gae/ (no change), /nor-i+kae-ru/ (no change), and /nor-i+kae/ (a loss for rendaku).

9. On the other hand, there are puzzling examples of rendaku in phrasal items of this form, including OJ/ama+nö+gapa/ ‘Milky Way’ (cf. OJ/kapa/ ‘river’), with genitive OJ/nö/, and OJ/ma+tu+gë/ ‘eyelash’ (cf. OJ/kë/ ‘hair’), with genitive OJ/tu/ (which has not survived into modern Japanese).

10. As explained in note 7, forms marked with a superscript POJ are hypothetical pre-Old Japanese. The genitive POJ/nö/ in POJ/sasa nö pa/ in (4c) is not attested; it is merely an inference. Interestingly, the form now in use in modern Japanese is /sasa no ha/, not /sasaba/. The other two items in (4) are also obsolete.

11. I have nothing illuminating to say about why certain composite items in the pre-Old Japanese vocabulary contained a genitive particle while others did not. I assume the situation was much the same as it is in modern Japanese. For example, I have no explanation to offer for why the notion ‘toe’ is expressed by the phrase /aši no yubi/ ‘foot’s digit’ whereas the notion ‘ankle’ is expressed by the compound /aši+kubi/ ‘foot-neck’.

12. The term ‘continuative’ is Kuno’s (1973: 195). The traditional term in Japanese grammar is ren’yôkei ‘adverbial form’, and Bloch (1946: 6) calls it the ‘infinitive’ form.

13. The two classes are called godan-katsuyô-dôshi ‘five-row inflection verbs’ and ichidan-katsuyô-dôshi ‘one-row inflection verbs’. For details, see Vance (1987: 178–184).

14. This rather clumsy characterization is necessary because of verbs such as /ka-u/ ‘buy’, which has a consonant-final allomorph only before /a/, as in the negative /kaw-ana-i/. The Old Japanese citation form of this verb was OJ/kap-u/, and the modern forms reflect a well-known sequence of historical changes. The standard account is that, in word-medial position, [p] > [w], and then [w] > Ø except before /a/.

15. I use Bloch’s (1946) morphological segmentations as a convenience, not as an endorsement of the analysis behind them. Many linguists prefer not to analyze [s] and [ɕ] as contrastive, treating [ɕ] before /i/ as a realization of /s/ and [ɕ] before any other vowel as a realization of /sy/. Parallelism with consonant-stem verbs would dictate a zero morph ‘marking’ the continuative of a vowel-stem verb (as in /tabe+Ø/), but I will not clutter the transcriptions in this paper with zero morphs. For an argument that the very notion of a zero morph is incoherent, see Matthews (1974: 117).

16. The distinctive part of the pitch-accent pattern on a Japanese word is a fall from high pitch to low pitch. I mark the location of a fall with a downward-pointing arrow (). Some words are unaccented, i.e., contain no pitch fall, and no arrow appears in the transcription of an unaccented word. Standard references on Japanese accent include McCawley (1977), Haraguchi (1977), and Pierrehumbert and Beckman (1988). Aside from these examples in (5), I have not bothered to mark accent in this paper, since accent does not figure in the discussion.

17. Sakurai (1966: 41) makes a similar claim about compounds of inflected word plus inflected word, but he qualifies it by saying that if the first element is used as a noun, sequential voicing can occur. However, since the first element must appear in its stem form, it is not clear how to determine whether it is being used as a noun (Vance 1987: 143).

18. The other class of inflected words in Japanese contains only a single member: the copula (Bloch 1946: 21–24). We will not consider it here, since even if it occurred in forms that could be construed as compounds, its citation form /da/ and most of its other forms begin with the voiced obstruent /d/, making rendaku inapplicable (or vacuous).

19. I have no reason to think that examples containing vowel-stem verbal elements would significantly change the overall picture that emerges. I could be wrong, of course.

20. A compound could appear either as a headword itself or as a subentry under its first element.

21. To be more precise, the continuative of a verb followed by /gai/ is either an adjectival noun (keiyôdôshi) or what Martin (1975: 179) calls a precopular noun. See Martin (1975: 418–419) for discussion and examples.

22. For details on the mora obstruent in compounds like (12b), see Vance (2002).

23. I am grateful to my research assistant, Mieko Kawai, for her painstaking work.

Ranking paradoxes in consonant voicing in Japanese

Haruka Fukazawa and Mafuyu Kitahara

1. Introduction

Quite often, phonological phenomena are found only in a certain vocabulary class and not in others within a single language. However, the basic tenet of OT, a single invariant ranking, seems incompatible with multiple vocabulary classes that show inconsistent phonological phenomena. Recent OT analyses have developed useful notions to approach this problem. First, multiple sub-lexica are defined when their phonological properties are distinct enough. For instance, Japanese has at least four phonological sub-lexica, such as Yamato, Sino-Japanese, Mimetics, and Foreign (Itô and Mester 1995, 1999; Fukazawa et al. 1998). Second, those sub-lexica are organized in a core-periphery structure (Itô and Mester 1995). Generally (and historically), the native vocabulary tends to form the core while non-native vocabularies tend to form the periphery. A constraint-based implementation of the core-periphery structure is to assume that the more native the sub-lexicon is, the more markedness constraints it obeys. So, for example, in the most native sub-lexicon Z, constraints *F, *G, and *H are all respected. In the least native sub-lexicon X, however, only the constraint *F is satisfied. Figure 1 shows these relations in a set of concentric ellipses.

[Figure 1: concentric ellipses labelled X, Y, and Z, with regions marked “*H respected”, “*G respected”, and “*F respected”.]

Figure 1. A schematic diagram of the core-periphery structure in a constraint-based system. X, Y, and Z are sub-lexica and *F, *G, *H are markedness constraints.

Faithfulness constraints must be ranked between any two markedness constraints for this core-periphery structure to work. For example, in the sub-lexicon X, *F is respected but *G and *H are violated. What allows the latter two to be violated is the relevant faithfulness constraint for the sub-lexicon X, which outranks them. By the same token, the sub-lexicon Y has its own version of the faithfulness constraint, under which *F and *G are respected but *H is violated.¹ The overall ranking for this example is shown in (1).

(1) *F >> Faith-X >> *G >> Faith-Y >> *H >> Faith-Z
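The way ranking (1) derives the core-periphery structure can be made concrete with a small sketch. It assumes only what the text states: a markedness constraint is respected in a sub-lexicon exactly when it dominates that sub-lexicon's faithfulness constraint. The names `RANKING` and `respected_markedness` are illustrative.

```python
# Ranking (1), highest-ranked constraint first.
RANKING = ["*F", "Faith-X", "*G", "Faith-Y", "*H", "Faith-Z"]


def respected_markedness(sublexicon: str, ranking=RANKING) -> list[str]:
    """Return the markedness constraints that dominate Faith-<sublexicon>.

    Markedness constraints are assumed to be those whose name starts
    with '*', as in the text.
    """
    cutoff = ranking.index(f"Faith-{sublexicon}")
    return [c for c in ranking[:cutoff] if c.startswith("*")]


for name in "XYZ":
    print(name, respected_markedness(name))
```

Under these assumptions, X respects only *F, Y respects *F and *G, and Z respects all three, which is the nesting shown in Figure 1.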

In this system, the ranking of markedness constraints is what determines the order of the posited sub-lexica. In other words, the ranking of markedness constraints must be consistent throughout the phonology of the language. Evidence of a ranking paradox in the markedness hierarchy would therefore undermine the sub-lexicon analysis. However, we have found three apparent ranking paradoxes in consonant voicing phenomena in Japanese. The markedness hierarchy for consonant voicing in Japanese established in Itô and Mester (2001) cannot account for the data first brought up in Tateishi (2001, 2002).

In this paper, we re-examine the constraint ranking regarding consonant voicing in Japanese. Through the analysis in the following sections, our goal is to make three theoretical claims. First, building on Fukazawa, Kitahara, and Ota (2002), we argue that no etymology-motivated sub-lexica exist in the phonological grammar of Japanese.² We will introduce a new system of sub-lexica which is based solely on grammatical/morphological information. The gist of our claim is that the Japanese phonological lexicon is classified not into “native”, “non-native”, etc., but into “marked” and “unmarked” groups with respect to a particular markedness constraint.

Second, these groups are defined by relativized faithfulness constraints. The trigger for relativizing the set of faithfulness constraints is again grammatical/morphological information. Historical/etymological categories cannot serve as evidence for relativizing the faithfulness constraints. We will propose that the relativization is triggered by the “stem/affix” distinction in the case of consonant voicing in Japanese.

Third, we address the question of whether markedness constraints can be relativized as well as faithfulness constraints. In Fukazawa and Kitahara (2001), we argued that only the faithfulness constraints can be relativized. This point is further reinforced by the analysis in the present paper.

The organization of this paper is as follows. In section 2, we show that the constraint ranking proposed by Itô and Mester (2001) leads to three ranking paradoxes when confronted with the data presented in Tateishi (2001). Section 3 then gives our solution to these cases. In section 4, we discuss theoretical implications of the present analysis in OT, such as sub-lexicalization of the lexicon without etymological knowledge, removing domains from markedness constraints, and relativization of faithfulness constraints.

2. Data and issue

2.1. Consonant voicing and Japanese phonological lexicon

As reviewed in the introduction, recent OT analyses of grammars with phonological sub-lexica assume that a core-periphery structure arises through the interaction of faithfulness constraints and markedness constraints. A partial constraint ranking relevant for consonant voicing in Japanese has been proposed as in (2).

(2) Ranking for consonant voicing (adapted from Itô and Mester 2001)

    IDENT[voice]-Foreign
    |
    *VoiObs2stem
    |
    IDENT[voice]-Sino-Japanese(SJ)
    |
    ExpressAffix
    |
    IDENT[voice]-Common-Sino-Japanese(CSJ)
    |
    *NT
    |
    IDENT[voice]-Yamato
    |
    *VoiObs

In this ranking, *VoiObs2stem, ExpressAffix, *NT, and *VoiObs are markedness constraints whose definitions are given in (3).

(3) Definition of relevant markedness constraints
    a. *VoiObs2stem: no double obstruent voicing in a stem (Lyman’s Law).
    b. EXPRESSAFFIX: affixes must be realized in the output.
    c. *NT: no voiceless obstruent after a nasal.
    d. *VoiObs: no voiced obstruents.
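Constraint (3a), Lyman's Law, is traditionally stated as a blocking condition on Rendaku: the initial obstruent of the second element of a compound is voiced unless that element already contains a voiced obstruent. The sketch below is a deliberately simplified illustration of that statement. It assumes a one-letter-per-segment romanization, a small set of voicing pairs (`RENDAKU_PAIRS`), and exceptionless application, whereas Rendaku is in fact lexically irregular; the function name `rendaku` and the example /kami+kaze/ are not taken from the text.

```python
# Simplified segment classes (an assumption of this sketch).
VOICED_OBSTRUENTS = set("bdgzj")
# Voiceless-to-voiced pairings used by Rendaku in this sketch.
RENDAKU_PAIRS = {"k": "g", "s": "z", "t": "d", "h": "b", "f": "b"}


def rendaku(first: str, second: str) -> str:
    """Compound two elements, voicing the second unless Lyman's Law blocks it."""
    # Lyman's Law: a voiced obstruent already present in the second
    # element blocks voicing of its initial consonant.
    if any(seg in VOICED_OBSTRUENTS for seg in second):
        return f"{first}-{second}"
    voiced = RENDAKU_PAIRS.get(second[0])
    if voiced is None:  # initial segment has no voiced counterpart
        return f"{first}-{second}"
    return f"{first}-{voiced}{second[1:]}"


print(rendaku("oyako", "keNka"))  # Rendaku applies
print(rendaku("kami", "kaze"))    # blocked by the /z/ in the second element
```

Note that a voiced obstruent in the first element, as in /mizu/, does not block voicing in this sketch, because only the second element is inspected.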

These markedness constraints are all respected in the Yamato sub-lexicon, but none of them need be respected in the Foreign sub-lexicon; this is enforced by the intervening faithfulness constraints. The ranking in (2) further shows that there are two sub-lexica between Yamato and Foreign: the Sino-Japanese (SJ) sub-lexicon and the Common-Sino-Japanese (CSJ) sub-lexicon. The only difference between them is whether Rendaku occurs or not. Itô and Mester (2001) posit EXPRESSAFFIX as the regulating markedness constraint for Rendaku since, in their treatment, Rendaku is the insertion of a [voice] feature as an affix in compounding (see Itô and Mester 1986, 1998).³ Since Rendaku is typically a characteristic of Yamato words, CSJ words can be seen as more nativized than SJ words. In a constraint-based view, this is represented by ranking EXPRESSAFFIX between the faithfulness constraint for CSJ and that for SJ. Let us look at just one example from Itô and Mester’s analysis in which all the markedness constraints in (3) are relevant.

(4) Tableau for [oyako geNka] ‘parent-child quarrel’ in Itô and Mester (2001)

    /oyako-keNka/     | *VoiObs2stem | EXPRESSAFFIX | ID[voice]CSJ | *NT | *VoiObs
    ☞ a. oyako-geNka  |              |              | *            | *   | *
       b. oyako-keNka |              | *!           |              | *   |
       c. oyako-keNga |              | *!           | *            |     | *
       d. oyako-geNga | *!           |              | **           |     | **

The word [oyako-geNka] is a Yamato-CSJ compound, so only the IDENT[voice]CSJ constraint monitors voicing changes in /keNka/. The violation of that constraint by Rendaku in candidate (a) is not fatal, since the competing candidates violate higher-ranked markedness constraints. Candidates without Rendaku (b and c) are penalized by EXPRESSAFFIX. Candidate (d) has both Rendaku and a voiced obstruent after [N], and is ruled out by the highest-ranked *VoiObs2stem. Note that the “stem” domain specified for this self-conjoined constraint is crucial in the analysis. The constraint only sees two voiced obstruents within a stem, as in [geNga]: another voiced obstruent in the first stem does not matter, as in, for instance, [mizu-geNka] ‘fight for water’.
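The evaluation in tableau (4) relies on strict domination: a candidate's violations are compared constraint by constraint from the top of the ranking, and lower-ranked constraints matter only among candidates that tie on all higher ones. A minimal sketch of this evaluation follows. The violation counts are an assumption of the sketch (one reading of tableau (4), derived from the definitions in (3)), and `CONSTRAINTS`, `TABLEAU_4`, and `optimal` are illustrative names.

```python
# Constraints in ranking order, highest first.
CONSTRAINTS = ["*VoiObs2stem", "ExpressAffix", "ID[voice]CSJ", "*NT", "*VoiObs"]

# Candidate -> number of violations per constraint, in ranking order.
# Assumed reading of tableau (4) for the input /oyako-keNka/.
TABLEAU_4 = {
    "oyako-geNka": (0, 0, 1, 1, 1),
    "oyako-keNka": (0, 1, 0, 1, 0),
    "oyako-keNga": (0, 1, 1, 0, 1),
    "oyako-geNga": (1, 0, 2, 0, 2),
}


def optimal(tableau: dict[str, tuple[int, ...]]) -> str:
    """Pick the winner under strict domination.

    Python compares tuples lexicographically, which mirrors strict
    domination when violations are listed in ranking order.
    """
    return min(tableau, key=tableau.get)


print(optimal(TABLEAU_4))
```

Candidate (a) wins despite violating three constraints, because every competitor violates one of the two highest-ranked constraints.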

2.2. Ranking Paradoxes

2.2.1. Paradox between markedness and faithfulness

Tateishi (2001) provides data of loanwords from English plural forms which pose problems for the ranking in (2). These words are team names of Major League Baseball and the National Hockey League in the US, which have become popular quite recently in Japan. The data in (6) are added to show that words in other areas are also relevant. Tateishi indicates that English plural forms are not just borrowed into Japanese as they are in English but altered so as to fit into Japanese phonology. In English, the voicing of the plural morpheme “-s” depends on that of the last segment of the stem. However, the voicing contrast in plural forms does not necessarily follow the pattern in English when they are taken into Japanese.

(5) Data from Tateishi (2001)
    a. [howaito sokkusu]  ‘White Socks’
    b. [iNdiaNzu]         ‘Indians’
    c. [saNzu]            ‘Suns’
    d. [kabusu]           ‘Cubs’
    e. [reddo uingusu]    ‘Red Wings’

(6) Additional Data
    a. [shuuzu]    ‘shoes’
    b. [jiiNzu]    ‘jeans’
    c. [bibusu]    ‘bibs’
    d. [daburusu]  ‘doubles’

Words in (5a–c) show that loanwords copy the voicing value of the plural morpheme in the original. However, those in (5d–e) and (6c–d) suggest that the situation is not that simple. The pronunciation of the plural morpheme of those words in English is always [z] since the stem ends in a voiced segment. However, the corresponding Japanese loanwords have [-su]. It is obvious that this is not a simple final devoicing phenomenon because of the existence of [-zu] forms in (5b) and (5c). The pattern seems to be that (i) plural “-s” is voiced after a nasal, as in (5b), (5c) and (6b); (ii) plural “-s” is voiceless when the stem contains at least one voiced obstruent, as in (5d), (5e), (6c) and (6d); (iii) otherwise, plural “-s” copies the voicing of the original pronunciation, as in (5a) and (6a). Thus, the data suggest that there is a phonological alternation of some sort.

Tateishi points out that the relevant markedness constraints for this phenomenon are *NT and *VoiObs2stem.⁴ In Japanese, especially in the native vocabulary, these two markedness constraints are considered to be high-ranked due to the phenomena of post-nasal voicing (PNV) and Lyman’s Law, respectively. There is no voicing contrast after nasals in Yamato since voiceless obstruents are not allowed in that environment. A morpheme does not include more than one voiced obstruent in Yamato (Lyman’s Law). These phenomena lead to the following constraint ranking.

(7) *NT, *VoiObs2stem >> IDENT[voice]-Yamato
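The descriptive pattern (i)-(iii) for the borrowed plural morpheme can be stated as an ordered decision procedure. The sketch below is only a restatement of that description, not an analysis: the function name `plural_suffix`, the flag `english_voiced` (whether the English plural is pronounced [z]), and the simplified segment classes are assumptions introduced for illustration.

```python
# Simplified romanized segment class (an assumption of this sketch).
VOICED_OBSTRUENTS = set("bdgzj")


def plural_suffix(stem: str, english_voiced: bool) -> str:
    """Return 'zu' or 'su' for a borrowed plural, following (i)-(iii)."""
    # (i) voiced after a nasal; this takes priority over (ii),
    #     as [iNdiaNzu] and [jiiNzu] show.
    if stem.endswith("N"):
        return "zu"
    # (ii) voiceless if the stem contains a voiced obstruent.
    if any(seg in VOICED_OBSTRUENTS for seg in stem):
        return "su"
    # (iii) otherwise copy the voicing of the English original.
    return "zu" if english_voiced else "su"


DATA = [("sokku", False), ("iNdiaN", True), ("saN", True), ("kabu", True),
        ("uingu", True), ("shuu", True), ("jiiN", True), ("bibu", True),
        ("daburu", True)]
for stem, voiced in DATA:
    print(stem + plural_suffix(stem, voiced))
```

The ordering of the first two clauses is the procedural counterpart of the ranking question raised below: the nasal condition must take precedence over the voiced-obstruent condition.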

In contrast to Yamato, a voicing contrast after a nasal is observed and a morpheme can contain more than one voiced obstruent in the Foreign sub-lexicon, which leads to the following constraint ranking.

(8) IDENT[voice]-Foreign >> *NT, *VoiObs2stem

It is evident that Tateishi’s data in (5) contradict the ranking for Foreign words in (8), although those words are undoubtedly Foreign. It is true that there are some “assimilated-foreign” words⁵ in the Japanese lexicon, such as [karuta] ‘card’ (borrowed from Portuguese ‘carta’ in the 16th century). Assimilated-foreign words are phonologically quite close to Yamato words. For example, [karuta] shows Rendaku in the compound [iroha garuta] ‘cards of the Japanese syllabary’ (see Takayama, this volume, for similar examples). However, we cannot say that the words in (5) are “well-assimilated” into Japanese, because they are not widely used and are not familiar to people other than sports fans. Looking closely, because “-s” must be voiced after a nasal as in (5b–c), *NT needs to be ranked higher than the faithfulness constraint for the Foreign sub-lexicon. Thus, the data suggest that we must either abandon the membership of the words in (5) in the Foreign sub-lexicon or admit the paradoxical ranking in (9).

(9) *NT >> IDENT[voice]-Foreign

Also, *VoiObs2stem must be ranked higher than the faithfulness constraint for Foreign to account for the data in (5d–e), as shown in (10).

(10) Tableau for “Cubs” /kabusu/

    /kabu-zu/      | *VoiObs2stem | IDENT[voice]-Foreign
       a. kabu-zu  | *!           |
    ☞ b. kabu-su  |              | *

These are the first two problematic cases for the current OT analyses of multiple sub-lexica in Japanese. The ranking paradoxes here occur between a markedness constraint and a faithfulness constraint. The next subsection introduces a more serious case where the ranking paradox arises between two markedness constraints.

2.2.2. Paradox within markedness

As we have seen in (2), Itô and Mester (2001) propose a ranking where *VoiObs2stem is ranked higher than *NT. However, we need a reversed ranking, *NT >> *VoiObs2stem, to account for the data in (5b) and (6b). In those words, the English plural morpheme “-s” is pronounced as [zu] after a nasal although there is a voiced obstruent in the stem. Tableau (11) shows that the reversed ranking is justified by the data. Candidate (a) has two voiced obstruents, violating the *VoiObs2stem constraint. However, it wins over candidate (b), which has a voiceless obstruent after a nasal. To get the correct output, the *NT constraint must be ranked higher than the *VoiObs2stem constraint.

(11) Tableau for “Indians” /iNdiaNzu/

    /iNdiaN-zu/      | *NT | *VoiObs2stem | IDENT[voice]-Foreign
    ☞ a. iNdiaN-zu  |     | *            |
       b. iNdiaN-su  | *!  |              |

2.2.3. Summary of ranking paradoxes

We have seen three paradoxical cases for the ranking in (2) from the data in Tateishi (2001). To account for Tateishi’s data, we need the rankings in (12).

(12) Partial rankings conforming to Tateishi’s data
     a. *NT >> IDENT[voice]-Foreign            for (5b) [iNdiaN-zu]
     b. *VoiObs2stem >> IDENT[voice]-Foreign   for (5d) [kabu-su]
     c. *NT >> *VoiObs2stem                    for (5b) [iNdiaN-zu]

On the other hand, Itô and Mester (2001) proposed the ranking in (2) to account for consonant voicing in Japanese. The relevant partial rankings are summarized in (13).

(13) Partial rankings proposed by Itô and Mester (2001)
     a. IDENT[voice]-Foreign >> *NT            for e.g. [furaNsu] ‘France’
     b. IDENT[voice]-Foreign >> *VoiObs2stem   for e.g. [gyagu] ‘gag’
     c. *VoiObs2stem >> *NT                    for (4) [oyako-geNka]

Itô and Mester’s rankings are thus in contradiction with the rankings required by the new data introduced by Tateishi (2001). In the following section, we will give a solution to these ranking paradoxes.
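A ranking paradox in this sense is simply a pair of constraints for which one body of data requires A >> B while another requires B >> A. As an illustrative sketch (the names `TATEISHI`, `ITO_MESTER`, and `paradoxes` are not from the text), the partial rankings in (12) and (13) can be compared mechanically:

```python
# Each pair (A, B) encodes the partial ranking A >> B.
TATEISHI = [                      # (12a-c)
    ("*NT", "IDENT[voice]-Foreign"),
    ("*VoiObs2stem", "IDENT[voice]-Foreign"),
    ("*NT", "*VoiObs2stem"),
]
ITO_MESTER = [                    # (13a-c)
    ("IDENT[voice]-Foreign", "*NT"),
    ("IDENT[voice]-Foreign", "*VoiObs2stem"),
    ("*VoiObs2stem", "*NT"),
]


def paradoxes(first, second):
    """Return the rankings in `first` whose reverse is required by `second`."""
    return [(hi, lo) for (hi, lo) in first if (lo, hi) in second]


for hi, lo in paradoxes(TATEISHI, ITO_MESTER):
    print(f"{hi} >> {lo}  conflicts with  {lo} >> {hi}")
```

All three rankings in (12) are reversed in (13): two conflicts involve a markedness and a faithfulness constraint, and one involves two markedness constraints.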

3. Solution

Two of the three paradoxes in the previous section essentially come from mixing up etymological knowledge with phonological knowledge. We want to rank IDENT[voice]-Foreign above *NT because we know etymologically that “Indians” is a foreign word in Japanese. Phonologically, however, nothing identifies “Indians” as a foreign word, because the obstruent after the nasal is voiced.

Fukazawa, Kitahara and Ota (2002) show the necessity of reconsidering etymology-based Japanese sub-lexica. The previous literature has claimed that sub-lexica are phonologically motivated and that the etymology-oriented labelling of sub-lexica is just a convention (Itô and Mester 1995; Fukazawa et al. 1998). Subscript numbers or letters are often used instead. However, merely replacing the labels with anonymous numbers or letters does not guarantee the independence of phonology from etymology. Fukazawa et al. (2002) proposed a concrete alternative in which phonological sub-lexica can be totally independent of etymological information. In lieu of etymology-based categorization, lexical items are classified into a marked or an unmarked group with respect to a particular markedness constraint. In other words, there are no items with [+Yamato] diacritics, nor faithfulness constraints labelled Yamato that are sensitive to such diacritics.

(14) Markedness-driven system (Fukazawa, Kitahara and Ota 2002)

    IDENT[voice]-X    Marked (both NT and ND possible)
    |                 [no alternations: PNV contrastive]
    *NT
    |
    IDENT[voice]-Y    Unmarked (only ND possible)
                      [alternations in all morphemes: PNV redundant]

In the schematic partial ranking in (14), the upper IDENT constraint designates the marked sub-lexicon, where there is no alternation in voicing after a nasal. That is, both voiced and voiceless obstruents are possible after a nasal for words in this sub-lexicon. When a morpheme shows a voicing alternation in post-nasal position, as verb roots and the past tense /ta/ do, it belongs to the unmarked sub-lexicon designated by the lower IDENT constraint. *NT is the determining constraint in this case.

But what can X and Y in (14) be? Our proposal in the present paper is that general morpho-phonological domains and categories, such as stem, affix, and word, can replace those letters. This is not a new trick of any sort but a quite standard approach to the relativization of faithfulness. Morpho-phonologically natural domains are basic to Correspondence Theory (McCarthy and Prince 1995), where base, reduplicant and the like are specified as the domains of faithfulness constraints.

In contrast, we argue against any relativization and domain specification of markedness constraints. This is the approach advocated in Fukazawa and Kitahara (2001), where we tried to eliminate the domain specification of the Obligatory Contour Principle (OCP) constraint. The *VoiObs2stem constraint is the equivalent of the OCP in the present analysis for the case of Rendaku. As we have seen in (3), the domain “stem” is a crucial part of its definition. However, there is no restriction on which domain can be specified for a self-conjoined markedness constraint.⁶ Introducing arbitrary domains for self-conjunction leads to the relativization of markedness if the same markedness constraint with different domains appears more than once in a single ranking. Therefore, we will use a plain *VoiObs2 without the “stem” domain in the present analysis.

With these considerations in mind, let us analyze the data in (5) and (6). We assume that “-zu”, transferred from the English plural morpheme “-s”, belongs to the unmarked sub-lexicon in Japanese because Japanese speakers are tacitly aware of the voicing alternation of that morpheme.⁷ We assume that only morphological alternation is the driving factor for the split of a faithfulness constraint. IDENT[voice]stem and IDENT[voice]affix are thus introduced, and they are ranked in that order. With *VoiObs2 ranked at the top, these split IDENT constraints produce the correct output [kabu-su], as shown in tableau (15).⁸ The ranking essentially says “avoid two voiced obstruents, but a voicing change in the stem is worse than one in the affix”, so that candidate (b) wins over candidate (c).

(15) Tableau for “Cubs” /kabusu/

    /kabu-zu/      | *VoiObs2 | IDENT[voice]stem | IDENT[voice]affix
       a. kabu-zu  | *!       |                  |
    ☞ b. kabu-su  |          |                  | *
       c. kapu-zu  |          | *!               |

As in the schematic ranking in (14), *NT is ranked below IDENT[voice]stem, which is evident from a case without an affix, such as /furaNsu/ ‘France’ in (16).

(16) Tableau for “France” /furaNsu/

    /furaNsu/      | IDENT[voice]stem | *NT
       a. furaNzu  | *!               |
    ☞ b. furaNsu  |                  | *

Thus, we have established a partial ranking: *VoiObs2 >> IDENT[voice]stem >> *NT >> IDENT[voice]affix. However, this ranking cannot account for [iNdiaN-zu] in (5b). The highest ranked *VoiObs2 kills the desired output (a), leaving candidate (b), without voicing change in the stem, as the selected output.

(17) Tableau for “Indians” /iNdiaNzu/ /iNdiaN-zu/ !

a. iNdiaN-zu b. iNdiaN-su c. iNtiaN-zu d. iNtiaN-su

*VoiObs2 IDENT[voice]stem *NT

IDENT[voice]affix

*! *! *!

*

*

*

*
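The partial ranking *VoiObs2 >> IDENT[voice]stem >> *NT >> IDENT[voice]affix can be tried out on the three inputs discussed in (15)-(17). The sketch below computes violations directly from romanized strings; it treats voicing as a property of individual segments (no shared [voice] features), uses simplified segment classes, and assumes that a hyphen separates stem from affix. All names (`voi_obs2`, `ident`, `no_nt`, `winner`) are illustrative.

```python
VOICED = set("bdgzj")       # voiced obstruents (simplified)
VOICELESS = set("ptksc")    # voiceless obstruents (simplified)


def split(form):
    """Split 'stem-affix' into (stem, affix); the affix may be empty."""
    stem, _, affix = form.partition("-")
    return stem, affix


def voi_obs2(inp, out):
    """Plain *VoiObs2: one violation if the word has two or more voiced obstruents."""
    return int(sum(seg in VOICED for seg in out) >= 2)


def ident(part):
    """IDENT[voice] relativized to the stem (part=0) or the affix (part=1)."""
    def constraint(inp, out):
        i, o = split(inp)[part], split(out)[part]
        return sum((a in VOICED) != (b in VOICED) for a, b in zip(i, o))
    return constraint


def no_nt(inp, out):
    """*NT: one violation per voiceless obstruent right after the nasal N."""
    segs = out.replace("-", "")
    return sum(a == "N" and b in VOICELESS for a, b in zip(segs, segs[1:]))


RANKING = [voi_obs2, ident(0), no_nt, ident(1)]


def winner(inp, candidates):
    """Select the candidate with the best violation profile under RANKING."""
    return min(candidates, key=lambda out: tuple(c(inp, out) for c in RANKING))


print(winner("kabu-zu", ["kabu-zu", "kabu-su", "kapu-zu"]))
print(winner("furaNsu", ["furaNzu", "furaNsu"]))
print(winner("iNdiaN-zu", ["iNdiaN-zu", "iNdiaN-su", "iNtiaN-zu", "iNtiaN-su"]))
```

Under these assumptions the first two inputs receive the attested outputs, while the third is wrongly mapped to the candidate with a voiceless affix. This is the problem that motivates allowing a single [voice] feature to be shared by two segments.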

The problem here is that the highest-ranked *VoiObs2 constraint immediately rules out the desired output because there are apparently two [voice] features for obstruents in /iNdiaN-zu/. However, is this really a problem? The answer is “No”, since a single [voice] feature can be linked to two segments. In other words, a fusion of two [voice] features is a viable representation for /iNdiaN-zu/. In Fukazawa and Kitahara (2001), we proposed that UNIFORMITY[F] can be relativized to a morpheme to regulate the fusion of features as a repair strategy for OCP violations. In the present paper, we will explore more candidates with featural fusion for /iNdiaN-zu/ and other examples. In addition to IDENT[F] and UNIFORMITY[F], relativized MAX[F] will be necessary in the present analysis. For the sake of brevity, the definitions are all given in (18) and the proposed overall ranking of relevant constraints is shown in (19).

(18) Definition of relativized faithfulness constraints relevant for the present analysis (abbreviations in parentheses)
     a. IDENT[voice]stem (IDst): the correspondent segments in a stem in the input and the output have identical values for the feature [voice].
     b. IDENT[voice]affix (IDaff): the correspondent segments in an affix in the input and the output have identical values for the feature [voice].
     c. UNIFORMITY[voice]stem (UNIst): no feature [voice] in a stem in the output has multiple correspondents in the input (i.e., no coalescence regarding the feature [voice] in a stem).
     d. UNIFORMITY[voice]word (UNIwd): no feature [voice] in a word in the output has multiple correspondents in the input (i.e., no coalescence regarding the feature [voice] in a word).
     e. MAX[voice]stem (MAXst): every feature [voice] linked to a segment in a stem in the input has a correspondent in the output.
     f. MAX[voice]affix (MAXaff): every feature [voice] linked to a segment in an affix in the input has a correspondent in the output.

(19) Proposed ranking for consonant voicing in Japanese (abbreviations in parentheses)

    *VoiObs2
    |
    MAX[voice]stem (MAXst)
    |
    UNIFORMITY[voice]stem (UNIst)
    |
    ExpressAffix (EXPAFF)
    |
    IDENT[voice]stem (IDst)
    |
    *NT
    |
    UNIFORMITY[voice]word (UNIwd)
    |
    IDENT[voice]affix (IDaff)
    |
    MAX[voice]affix (MAXaff)

With this new ranking, not only the problem in (17) but also all the ranking paradoxes mentioned in the previous section are solved.⁹ First, as shown in (20), the fused candidate (a) is selected in spite of its violation of UNIFORMITYword.¹⁰

(20) Tableau for “Indians” /iNdiaNzu/. In every candidate the segmental string is iNdiaN-zu; the candidates differ in how [voi] is linked, and an obstruent without a [voi] link is voiceless.

    /iNdiaN-zu/                             | *VoiObs2 | MAXst | UNIst | EXPAFF | IDst | *NT | UNIwd | IDaff
    ☞ a. one [voi] linked to both d and z  |          |       |       |        |      |     | *     |
       b. separate [voi] on d and on z      | *!       |       |       |        |      |     |       |
       c. [voi] linked to d only            |          |       |       |        |      | *!  |       | *
       d. [voi] linked to z only            |          | *!    |       |        | *    | *   |       |
       e. no [voi]                          |          | *!    |       |        | *    | **  |       | *

Ranking paradoxes in consonant voicing in Japanese 117

In (20), candidate (b) has two [voice] features resulting in a violation of the highest ranked constraint *VoiObs2. The violation of MAX[voice]stem penalizes candidates (d) and (e). In those candidates, the feature [voice] attached to the segment [d] in the input is lost in the output: [iNtiaN-zu/su]. Devoicing in the stem is worse than that in the affix because MAX[voice]stem is ranked far higher than MAX[voice]affix. In the optimal candidate (a), two [voice] features are fused into one. Therefore, it does not violate *VoiObs2. Coalescence of the features in the word violates the faithfulness constraint, the low-ranked UNIFORMITY[voice]word, but does not violate UNIFORMITY[voice]stem since one of the [voice] features belongs to the stem but the other belongs to the affix. That is, the coalescence takes place not within a stem but within a word. Candidate (a) wins over candidate (c) since *NT outranks UNIFORMITY[voice]word.

(21) Tableau for “oyako-genka” /oyako-keNka/

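Candidate          | [voice] features           | *VoiObs2 | MAXst | UNIst | EXPAFF | IDst | *NT | UNIwd | IDaff
   a. oyako-geNga   | one, fused (g and g)       |          |       | *!    |        | **   |     | *     |
   b. oyako-geNga   | two (g; g)                 | *!       |       |       |        | **   |     |       |
☞ c. oyako-geNka   | one                        |          |       |       |        | *    | *   |       |
   d. oyako-keNga   | one                        |          |       |       | *!     | *    |     |       |
   e. oyako-keNka   | none                       |          |       |       | *!     |      | *   |       |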

Now, let us reanalyze example (4) from Itô and Mester (2001). In (21), candidate (b) loses due to the violation of *VoiObs2. Candidate (a), in which two [voice] features are fused within a stem, violates UNIFORMITY[voice]stem. Candidates (d) and (e) lose since Rendaku does not take place, resulting in the violation of EXPRESSAFFIX. Consequently, candidate (c), in which Rendaku takes place, becomes optimal. It violates both IDENT[voice]stem and *NT, but neither violation is more serious than those in other candidates.

(22) Tableau for “bibs” /bibu-su/
Input: /bibu-zu/

Candidate          | [voice] features           | *VoiObs2 | MAXst | UNIst | EXPAFF | IDst | *NT | UNIwd | IDaff
☞ a. bibu-su       | one, fused (b and b)       |          |       | *     |        |      |     | *     | *
   b. bibu-zu       | three (b; b; z)            | *!**     |       |       |        |      |     |       |
   c. bibu-su       | two (b; b)                 | *!       |       |       |        |      |     |       | *
   d. bibu-zu       | two, one stem [voi] lost   | *!       | *     |       |        | *    |     |       |
   e. bibu-su       | one, one stem [voi] lost   |          | *!    |       |        | *    |     |       | *
   f. bibu-zu       | two, one fused with z      | *!       |       |       |        |      |     | *     |
   g. bibu-zu       | two, one fused (b and b)   | *!       |       | *     |        |      |     | *     |

In (22), candidates (a) and (c) have the same segmental structure [bibu-su], but their featural structures are different. Candidate (c) has two independent [voice] features, violating the highest ranked constraint *VoiObs2. Candidate (a), by contrast, violates UNIFORMITY[voice]stem, since two features are fused within a stem. Similarly, we can consider three different featural structures for [bibu-zu], as shown in candidates (b), (f), and (g). All of them lose because they violate the highest ranked constraint *VoiObs2, regardless of their featural structures. Candidate (e) violates MAX[voice]stem because a [voice] feature in the stem in the input has no correspondent in the output. By contrast, the loss of the [voice] feature in candidate (a) occurs in the affix, resulting in a violation of low-ranked IDENT[voice]affix. Consequently, candidate (a) becomes optimal.

(23) Tableau for “Cubs” /kabu-su/
Input: /kabu-zu/

Candidate          | [voice] features           | *VoiObs2 | MAXst | UNIst | EXPAFF | IDst | *NT | UNIwd | IDaff
   a. kabu-zu       | one, fused (b and z)       |          |       |       |        |      |     | *!    |
☞ b. kabu-su       | one, on stem b only        |          |       |       |        |      |     |       | *
   c. kabu-zu       | two (b; z)                 | *!       |       |       |        |      |     |       |
   d. kabu-zu       | one, on affix z only       |          | *!    |       |        | *    |     |       |

In (23), candidate (c) loses due to its violation of *VoiObs2. Candidate (d) loses since devoicing takes place in the stem, resulting in the violation of high-ranked MAX[voice]stem. The violation of UNIFORMITY[voice]word in candidate (a) is more serious than that of IDENT[voice]affix in candidate (b) although both of them are relatively low-ranked. Two [voice] features are fused not in the stem but in the word in candidate (a). Devoicing in the affix makes candidate (b) violate IDENT[voice]affix. However, (b) becomes optimal since other candidates commit more serious violations.11
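The evaluation procedure used in these tableaux can be sketched in a few lines of Python. Strict domination amounts to comparing violation profiles lexicographically, in the order given by the ranking in (19). The sketch below is only an illustration: the candidate labels and the function names `profile`, `optimal` and `fatal_constraint` are hypothetical, and the violation counts are our reading of the discussion of (23), not data taken over from the tableau itself.

```python
# Ranking (19), highest ranked first; MAXaff is left out, as in the tableaux (note 9).
RANKING = ["*VoiObs2", "MAXst", "UNIst", "EXPAFF", "IDst", "*NT", "UNIwd", "IDaff"]

# Assumed violation counts for the four candidates discussed under (23),
# input /kabu-zu/. Labels a-d follow the order of the discussion.
TABLEAU_23 = {
    "a": {"UNIwd": 1},            # [voi] fused across stem and affix
    "b": {"IDaff": 1},            # affix devoiced: kabu-su
    "c": {"*VoiObs2": 1},         # two independent [voi] features
    "d": {"MAXst": 1, "IDst": 1}, # stem [voi] lost
}

def profile(violations, ranking=RANKING):
    """Violation vector of one candidate, ordered from highest to lowest constraint."""
    return tuple(violations.get(constraint, 0) for constraint in ranking)

def optimal(tableau, ranking=RANKING):
    """Strict domination: the winner has the lexicographically smallest profile."""
    return min(tableau, key=lambda cand: profile(tableau[cand], ranking))

def fatal_constraint(loser, winner, tableau, ranking=RANKING):
    """Highest ranked constraint on which the loser does worse than the winner."""
    pairs = zip(ranking, profile(tableau[loser], ranking), profile(tableau[winner], ranking))
    for constraint, loser_marks, winner_marks in pairs:
        if loser_marks != winner_marks:
            return constraint if loser_marks > winner_marks else None
    return None

winner = optimal(TABLEAU_23)
print(winner, {c: fatal_constraint(c, winner, TABLEAU_23) for c in TABLEAU_23 if c != winner})
```

Under these assumptions the sketch selects candidate (b), with UNIwd, *VoiObs2 and MAXst as the fatal constraints for (a), (c) and (d) respectively, in line with the prose above.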

4. Conclusion

We have seen paradoxical cases for the previously proposed system of multiple phonological sub-lexica in Japanese. Our proposal to resolve the paradoxes is simple: relativize faithfulness constraints with standard morphophonological categories. The patterns brought up in Tateishi (2001) in (5) and the additional data of our own in (6) are all accounted for in our analysis. The analysis so far has some theoretical implications. First, as we claimed earlier in the introduction, etymological information should not be mixed up with phonological information in setting up sub-lexica. This position is reinforced by a simple consideration about language acquisition. Children have no a priori knowledge that a certain item belongs to a particular sub-lexicon. At the early stage of acquisition, the grammar, vocabulary, and the structure of the lexicon are all acquired through phonological input. Second, we have discussed elsewhere that relativization of a markedness constraint is not a viable idea (Fukazawa and Kitahara 2001). That is the reason why we eliminate the “stem” domain from the self-conjoined *VoiObs2 constraint. If we allow arbitrary domain specification in a markedness constraint and allow the same markedness constraint with different domain specifications to co-exist in a single ranking, the result will be a relativization of markedness. Finally, we would like to point out that relativization of faithfulness is a fairly standard and well-motivated idea in recent OT studies. What domain-relative faithfulness constraints represent is, we believe, that different domains have different phonological “tightness”. For example, a stem is less vulnerable to modification than an affix is, since it is more tightly woven.

Acknowledgements

We thank the editors of this volume, Jeroen van de Weijer, Kensuke Nanjo and Tetsuo Nishihara, for giving us an opportunity to contribute our paper. We also thank Shigeto Kawahara, Linda Lombardi, Mits Ota, and Koichi Tateishi for helpful input. Of course, all errors are our own.

Notes

1. Sub-lexicon specific faithfulness constraints are derived from a general faithfulness constraint, which is called relativization, or split of faithfulness (Fukazawa 1999).
2. Of course, etymology was not part of the phonological grammar in the previous analyses either. It has been repeatedly pointed out that historical origins of morphemes do not necessarily coincide with the identification of lexical subclasses (Itô and Mester 1999; Tateishi 2003). However, what blurs our eyes is native speakers’ intuition about lexical classification. As Takayama (this volume) argues, we do have an intuition about lexical classes and it may interact with the phonological grammar. Our point is that the former is theoretically distinct from the latter.
3. Rendaku is known for many exceptions and there have been a number of analyses dealing with its irregular nature. Articles in this volume cover most (if not all) aspects of irregularity in Rendaku. In accordance with Kubozono (this volume), we admit Rendaku has a hybrid nature: some words are lexicalized but there also exists a productive synchronic process. Ohno (this volume) and Takayama (this volume) focus on the lexicalized part of Rendaku and argue against productivity. Meanwhile, Haraguchi (2002) and Rice (this volume) try to capture generalizations in the productive part. Following the thread of research on Rendaku in the generative literature, we assume productivity in Rendaku, though we are aware of its limitations.
4. Tateishi suggests that a faithfulness constraint called “NEIGHBORHOOD[voice]” might be relevant here. It essentially bans a change in a non-derived environment. His argument is rather directed to the free ranking of faithfulness constraints among multiple sub-lexica, such as Yamato and Foreign. We will not go into this issue because our solution in Section 3 no longer requires those labelled sub-lexica.
5. This terminology is adopted from Itô and Mester (1995).
6. Self-conjunction is an extension of Local Conjunction (Smolensky 1994, 1995, 1997) where the domain is given a priori.
7. Japanese speakers are certainly aware of the fact that English plural “-s” is a suffix. Suzuki (1990) found that “-s” is often dropped in loanwords (e.g., [on za rokku] ‘on the rocks’) due to the low functional load of plurality in Japanese grammar. As for the unmarked status of “-zu”, the sole motivation for this assumption is that it obeys the post-nasal voicing effect, which is commonly seen in native suffixes, such as the “-ta/-da” alternation in [kaN-da] ‘bite-Past’ and [kai-ta] ‘write-Past’. Tateishi (2003) argues that plural “-zu” is indeed a Yamato morpheme.
8. The input form /-zu/ is postulated for the plural suffix following Tateishi (2001, 2002, 2003). If we take /-z/ as an input instead, constraints on the licit syllable structure in Japanese force the final /u/ to be epenthesized. This follows the notion of the Richness of the Base (Prince and Smolensky 1993).
9. MAX[voice]affix is skipped in the following tableaux because it plays no role in selecting the winning candidate.
10. For a detailed ranking argument for UNIFORMITY[voice] and markedness constraints, see Fukazawa and Kitahara (2001).
11. We have analyzed other data such as [gyagu] ‘gag’, a usual Rendaku word like [tabi-bito] ‘travelers’, and a usual Lyman’s law case like [kita-kaze] ‘north winds’ with the proposed ranking, and have found that they are all accounted for without any paradox. We do not include the analyses in this paper due to space limitations.

The implicational distribution of prenasalized stops in Japanese

Noriko Yamane-Tanaka

Introduction

This paper gives an account of the regular affinity among voice, nasal and place of articulation, focusing on intervocalic stop consonants in Japanese. There is agreement that voiced obstruents appeared only intervocalically and were prenasalized [Ng, ndz, nd, mb] in prehistoric and Old Japanese (Unger 1977; Vance 1983), and that these prenasalized stops (henceforth, PNS) are still retained only in some of the Tohoku dialects in northern Japan.1 It has been suggested that the phonemic inventory of the Tohoku system is similar to that of Old Japanese (F. Inoue 2000), in which only prenasalization played a distinctive role for the intervocalic obstruents (Hamano 2000; M. Takayama 2002; among others). The loss of PNS spans several centuries and is still in progress cross-dialectally, but the variations are not random. It is well known in National Language Studies in Japan that the existence of [mb] implies that of [nd] but not vice versa, and [nd] implies [N] but not vice versa (Hashimoto 1932; Hirayama et al. 1992; Kamei et al. 1997; Kindaichi 1941; Oohashi 2002; M. Takayama 2002; T. Takayama 1993; Uwano 1989; Yanagita 1930; among others). In the framework of Optimality Theory (henceforth OT; Prince & Smolensky 1993; McCarthy & Prince 1995), the [N]–[g] alternation in current Tokyo Japanese has been taken up (McCarthy & Prince 1995; Itô & Mester 1997; Hibiya 1999; among others). However, there has been little discussion of the related alternation at other places of articulation, its dialectal variation, and its relation to the historical shift of the voice contrast. This paper will try to shed light on these issues, adopting the FAITH reranking model developed by Itô & Mester (1995a–b, 1997, 1998, 1999a–b, 2000, 2003).
Focusing on the interface between synchronic variation and diachronic change emphasized by Anttila & Cho 1998, Cho 1998, and others (references cited in Yamane & Tanaka 2002), we will show that the minimal demotion of FAITH leads to the gradual loss of PNS. It will also be argued that the shift of the distinctive role, from prenasalization to voice, may follow from grammatical force.

1. Prenasalized stops in Japanese dialects

1.1. Tohoku dialects

Tohoku dialects, which are spoken in the northeast area of Japan (cf. Appendix 1; No. 2–7, and North of No. 15), have a unique phonological character, or what is called a “synchronic chain shift”, in intervocalic stop consonants.2 In their systems, especially in Yamato lexical items, i) /k/ becomes /g/, and /g/ becomes /N/, ii) /t/ becomes /d/, and /d/ becomes /nd/, and iii) /b/ becomes /mb/, in a non-neutralizing way.3 Thus, unlike in other dialects, PNS appear in the surface forms.4 It should be noted, however, that PNS tend to be replaced by plain voiced stops, due to the ongoing loss of the synchronic chain shift in the younger generation. This section describes how PNS are realized in a system which retains PNS. For expository convenience, the corresponding lexical output of Tokyo Japanese will be used as the underlying representation (cf. Itô & Mester 1997: 430, fn. 14). The prenasalization facts are given below.5 (Examples are from Aomori dialects.)

(1)

Intervocalic (Pre-)Nasalization

(i) g → N
    /g/ → [N] / V _ V
          [g] / elsewhere
    a. kagami    kaN Nami6    ‘mirror’
    b. kagi      kaN Ni       ‘key’
    c. uguisu    uN Nuisu     ‘bush warbler’
    d. toge      toN Ne       ‘thorn’
    e. kago      kaN No       ‘basket’

(ii) d → nd 6
    /d/ → [nd] / V _ V
          [d] / elsewhere
    a. hada      handa        ‘skin’
    b. ude       unde         ‘arm’
    c. mado      mando        ‘window’

(iii) b → mb
    /b/ → [mb] / V _ V
          [b] / elsewhere
    a. saba      samba        ‘mackerel’
    b. ebi       embi         ‘shrimp’
    c. abura     ambura       ‘oil’
    d. kabe      kambe        ‘wall’

PNS appear only in intervocalic position. ‘Elsewhere’ contexts are, besides word-initial position, (a) V1. _ V2 : where V1 = long, (b) C1V1. _ V2 : where C1 = [–voice], V1 = [+high], V2 = [–high], (c) V1. _ V2. C3V3 : where V2 = [+high], C3 = [–voice], V3 = [–high], (d) after a nasal (see note 5). Unlike [mb] and [nd], [N] is a simple nasal stop. This idiosyncrasy of the velar nasal may be interpreted as a gap-filling action of phonemic systems; [N] resulted from [Ng], when [m] and [n] existed already. However, from the viewpoint of OT, the early loss of [Ng] would be expected. Not only is a prenasalized stop structurally complex, but velar place is also the most marked among the three places of articulation (henceforth, PoA).7 This can be characterized as a “banning the worst of the worst” effect (Prince & Smolensky 1993: 180). The direction of the shift of [Ng] to [N] can also be considered articulatorily natural. The difference between the two is in the relative timing of a gesture of velic opening and a gesture of oral closure (Maddieson & Ladefoged 1993: 255). This is illustrated below.

(2)

a. Plain nasal
   velic aperture       |-------------|
   oral constriction    |-------------|
b. Prenasalized stop: shortening of velic lowering gesture
   velic aperture       |------|
   oral constriction    |-------------|
c. Prenasalized stop: lengthening of oral closing gesture
   velic aperture       |-------------|
   oral constriction    |----------------------|

In a simple nasal, the oral and velic gestures are closely coordinated in their relative timing, as in (2a), but in a prenasalized stop, the nasal passage has to be closed before the oral articulation is released. According to this view, a prenasalized stop seems articulatorily unstable. There are several different ways of producing PNS: one way is to shorten the duration of the velic lowering gesture as illustrated in (2b), and the other is to extend the duration of the oral closure as in (2c). Furthermore, the total duration of the prenasalized stop is longer in case (2c), which “would be consistent with the idea that the durations of complex segments are greater than the duration of simpler ones” (Maddieson & Ladefoged 1993: 255). However, it seems that the Japanese system would choose option (2b) rather than (2c); otherwise the contrast between moraic nasals (e.g., kaN N go ‘nursing’) and nonmoraic nasals (e.g., kaN go ‘basket’) would be hard to maintain. It follows that, in Japanese PNS, the gestural shift of the velum (i.e., lowering and raising) is forced into a rather short span. Under this condition, [Ng] in Japanese does not seem to be as clearly articulated (and probably perceived) as [mb] and [nd], since the velum has to be raised before the constriction at the velum is released. This conflict could be resolved by the parallel overlap of the oral and velic gestures as in (2a).8 The shift of [Ng] to [N] is also supported by the geographical distribution. Among the variants, [N] is attested most widely in Tohoku dialects. The distribution is summarized below, based on previous reports and the linguistic atlas (Oohashi 2002: 225).

(3)

a. [N]: All Tohoku areas except those below
b. [ )g] (= [Ng]): Midsouth in Akita, Midsouth of the inland in Yamagata, north of the lower part in Niigata, a part of Fukushima
c. [g]: Coastal area of Sanriku in Iwate, Midwest in Fukushima, periphery of Murakami city in Niigata

Furthermore, Oohashi’s phonetic experiments indicate that only [N] was observed even in some dialects in (3b). This fact may also suggest that [Ng] has already been replaced by [N].10 In contrast to intervocalic positions, word-initial plain voiced stops appear as they are. The examples are as follows.

(4)

a. [g]: [gakko˘] ‘school’, [geda] ‘clogs’, [go] ‘five’
b. [d]: [daigu] ‘carpenter’, [degiru] ‘be able to’, [dogu] ‘poison’
c. [b]: [baSa] ‘carriage’, [biwa] ‘Japanese lute’, [budo˘] ‘grape’, [boro] ‘rag’

PNS never appear word-initially, which might erroneously lead one to assume that PNS are mere positional variants. However, voiced stops also appear intervocalically, and in this sense the two are not in complementary distribution. Furthermore, the intervocalic voiced stops are synchronically derived from voiceless stops, for instance saka → saga ‘slope’, kaki → kagi ‘persimmon’, doko → dogo ‘place’, geta → geda ‘clogs’, mato → mado ‘target’, and kata → kada ‘shoulder’. As a result, minimal pairs are easily found, as follows.

(5) Minimal pairs
(i) /g/ vs. /N/ (k → g, g → N): ageru ‘open’ vs. aNeru ‘raise’, kagi ‘oyster’ vs. kaNi ‘key’
(ii) /d/ vs. /nd/ (t → d, d → nd): mado ‘target’ vs. mando ‘window’, hada ‘flag’ vs. handa ‘skin’

This fact suggests that the contrast between voiced stops and PNS is phonemic rather than allophonic.
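The chain shift described in this section can be sketched as a simple mapping from Tokyo-type forms to Tohoku-type forms. The sketch below is a deliberately simplified illustration: the function name `tohoku_form` is hypothetical, “N” stands for the velar nasal as in (5), and the additional ‘elsewhere’ contexts (long vowels, devoicing environments, post-nasal position) are ignored. Because the context is checked on the input form, the shift applies simultaneously and does not feed itself, which is what keeps it non-neutralizing.

```python
VOWELS = set("aiueo")

# Tokyo-type segment -> Tohoku-type segment in the context V _ V.
# "N" stands for the velar nasal; "nd" and "mb" are prenasalized stops.
SHIFT = {"k": "g", "g": "N", "t": "d", "d": "nd", "b": "mb"}

def tohoku_form(tokyo_form: str) -> str:
    """Apply the intervocalic chain shift once, judging contexts on the input."""
    out = []
    for i, seg in enumerate(tokyo_form):
        intervocalic = (
            0 < i < len(tokyo_form) - 1
            and tokyo_form[i - 1] in VOWELS
            and tokyo_form[i + 1] in VOWELS
        )
        out.append(SHIFT[seg] if intervocalic and seg in SHIFT else seg)
    return "".join(out)

# Minimal pairs as in (5): distinct inputs stay distinct in the output.
for word in ["kaki", "kagi", "mato", "mado", "hata", "hada"]:
    print(word, "->", tohoku_form(word))
```

Word-initial stops are left untouched because the first segment never satisfies the intervocalic condition, in line with the forms in (4).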

1.2. Cross-dialectal variation

Although Tohoku dialects could generally prenasalize all voiced stops b, d, g, this pattern does not simply extend to other dialects. Cross-dialectal comparison leads us to divide prenasalization systems into four distinct types A, B, C and D. Let us call the Tohoku system A, which prenasalizes all three voiced stops. If we incorporate the details in (3a, b), system A may be divided into A1 and A2.9 System B prenasalizes only d and g, system C (pre)nasalizes only g, and system D prenasalizes none of them. Thus, as far as PNS and nasals are concerned, the output inventory of each system would look like this.

(6)

System | Intervocalic PNS and nasals
A1     | {Ng, mb, nd; (N,) m, n}
A2     | {mb, nd; N, m, n}
B      | {nd; N, m, n}
C      | {N, m, n}
D      | {m, n}

The systems described as A through D are regarded as synchronic variation, in that all of them are attested as regional dialects in Japan. Regional information is given below.

(7)

System | Regional information
A      | Aomori, parts of Iwate, Miyagi, Akita, parts of Yamagata, parts of Fukushima, parts of Niigata, parts of Mie, parts of Ehime, parts of Nagasaki, parts of Kagoshima
B      | Parts of Nara, parts of Wakayama, Kochi
C      | Parts of Ibaragi, parts of Tochigi, parts of Chiba, Tokyo, Kanagawa, Toyama, Ishikawa, Fukui, Yamanashi, Nagano, Gifu, Shizuoka, parts of Shiga, parts of Kyoto, Osaka, parts of Hyogo, parts of Tottori, parts of Okayama, parts of Okinawa
D      | The rest of the regions above

The linguistic atlas (cf. Appendix 2) shows the geographical distribution of intervocalic PNS in Japan. Regions shaded in black have the voiced PNS. Map [1] represents the areas with (pre)nasalized g, map [2] the areas with prenasalized d, and map [3] those with prenasalized b. Notice that the marked areas shrink in the order [1] > [2] > [3]. This means that [N] survives in the most extensive areas, including the Kinki, Kanto, Shikoku and Tohoku areas, [nd] is seen in limited regions of Kinki, Shikoku and Tohoku, and [mb] is found only in some regions of Tohoku. More importantly, areas [1]–[3] are not distributed at random, but stand in an inclusion relation: area [3] is included in area [2], and area [2] is included in area [1] (i.e., [3] ⊂ [2] ⊂ [1]). This suggests that the existence of [mb] implies that of [nd], the existence of [nd] implies [N], and similarly [mb] implies [N]. In this paper I refer to this kind of regularity as “the implicational relation in the geographical continuum”. The next section will show that such a regularity is also observed in the chronological continuum.
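The inclusion relation among the three map areas can be checked mechanically against the inventories in (6). The following sketch is only illustrative: it treats each system as a stand-in for the regions that have it, uses the A2 row for system A, and the names `INVENTORY`, `systems_with` and `implies` are hypothetical.

```python
# Intervocalic PNS/nasal inventories per system, after (6); "A" uses the A2 row.
INVENTORY = {
    "A": {"mb", "nd", "N", "m", "n"},
    "B": {"nd", "N", "m", "n"},
    "C": {"N", "m", "n"},
    "D": {"m", "n"},
}

def systems_with(segment):
    """Systems (standing in for map areas) whose inventory contains the segment."""
    return {name for name, inventory in INVENTORY.items() if segment in inventory}

def implies(p, q):
    """True if every system that has segment p also has segment q."""
    return all(q in inventory for inventory in INVENTORY.values() if p in inventory)

area1 = systems_with("N")   # map [1]
area2 = systems_with("nd")  # map [2]
area3 = systems_with("mb")  # map [3]

# Proper inclusion [3] within [2] within [1], and the one-way implications.
print(area3 < area2 < area1)
print(implies("mb", "nd"), implies("nd", "N"), implies("N", "nd"))
```

The one-way character of the implication shows up in the last check: every system with [nd] has [N], but system C has [N] without [nd].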

1.3. Historical change

The assumption that remnants of the past survive in remote regions (Yanagita 1930) is widespread. This is known as “hoogen shuuken ron” or “theory of peripheral distribution of dialectal forms”.10 Yanagita found that i) the innovative forms are seen in the areas of Kyoto or Nara, ex-capitals of Japan, while the older forms are seen outside these areas, and ii) the new forms are diffused in gradual succession, just like the ripples made by a stone thrown into a pond. This assumption will be supported if the forms in peripheral areas existed in the past. As far as PNS are concerned, there is agreement that [Ng, ndz, nd, mb] existed intervocalically in prehistoric and Old Japanese (Unger 1977; Vance 1983).13 More importantly, PNS were lost gradually, as attested by historical materials in the central and other dialects. According to Hashimoto (1932), the loss of [mb] took place in the late Muromachi era, and the loss of [nd] took place in Modern Japanese.14 It is not known when [Ng] was lost, and there is no clear phonetic evidence to prove its existence in OJ or even in current Tohoku dialects. Although it would be reasonable to assume that the loss of [Ng] occurred prior to the loss of [mb], as for the period of the loss I can only speculate that it was in OJ. The scenario would be that [Ng] was replaced by [N] in OJ, and that [N] is in turn being replaced by [g]. In fact, the loss of [N] is a well-known ongoing change in present-day Japanese. Kindaichi (1941) reports that [N] started to be lost and replaced by [g] among the younger generation in Tokyo. Summarizing, the loss of PNS affected velar, labial, and coronal in this order, and further affected [N]. This is shown in the rightmost column of the table below.

(8)

Loss of PNS

System | Intervocalic PNS and Nasal | Period                     | Major Change
A      | {mb, nd; N, m, n}          | Old Japanese (OJ)          | Loss of Ng
B      | {nd; N, m, n}              | Middle Japanese (MJ)       | Loss of mb
C      | {N, m, n}                  | Modern Japanese (ModJ)     | Loss of nd
D      | {m, n}                     | Present day Japanese (PJ)  | Loss of N

Following the generalization above, we could specify OJ as system A, MJ as system B, ModJ as system C, and PJ as system D. The systems turn up historically in the order A → B → C → D. I will refer to this kind of regularity as “the implicational relation in the chronological continuum”. Notice that the historically reconstructed contextual phonemic systems A–D all match those in the cross-dialectal distribution A–D which I described in the previous section. That is, the implicational relation in the chronological continuum and that in the geographical continuum are perfectly matched. This regularity clearly strengthens Yanagita’s assumption that the current geographical variation mirrors a series of grammars that existed at past historical stages. In the next section, I will show that such a parallelism between synchrony and diachrony is not just a matter of coincidence, but is expected under an OT analysis.11 I will propose that the demotion of a Faithfulness constraint (henceforth, F) is responsible.

2. OT analysis

2.1. Subset structure and harmonic completeness

The parallelism between the chronological and geographical continuums suggests that it is unlikely that prenasalization patterns deviate from either of these systems; for example, no system can be found whose inventory has {Ng, mb; N, m, n} or {mb; N, m, n}, or {nd; m, n} and so on. This is because PNS are realized in accordance with a certain markedness scale of place of articulation. Now let us suppose five constraints relevant to this phenomenon: the four Markedness constraints in (9a–d), and the Faithfulness constraint in (9e).

(9)

a. *Ng: Ng is prohibited in the output.
b. *mb: mb is prohibited in the output.
c. *nd: nd is prohibited in the output.
d. *N: N is prohibited in the output.
e. MAX(NAS): Velic aperture of the input has an identical correspondent in the output. (No deletion of velic aperture)

MAX(NAS) requires that every PNS of the input should be realized in the output, so the change of prenasalized stop to simple voiced stop in the output would incur the violation of this constraint (This will be explained in (19ii) and (20b)). The random permutation of these constraints could generate 120 possible dominance hierarchies. However, as far as PNS of OJ through present-day Japanese is concerned, the hierarchies should be limited to only 4 ways. This can be achieved by adopting the following two assumptions.

(10) a. Fixed markedness ranking hypothesis (Prince & Smolensky 1993: ch. 9)
b. Faith-reranking hypothesis (Itô & Mester 1995a–b, 1999, 2002)

The details are described in the following. First, there are reasons to set the fixed markedness hierarchy relevant here as *mb >> *nd >> *N. The original version of OT (Prince & Smolensky 1993: ch. 9) claims that possible segmental inventories can be captured with the fixed ranking of universal Markedness constraints (henceforth, M). Crucial to PNS in Japanese are the rankings pertaining to two kinds of harmonic scales. One is the PoA markedness scale, COR ≻ LAB, i.e., coronal is more harmonic than labial, which leads to the constraint hierarchy *Lab >> *Cor. There has been little agreement as to the ranking between *Lab and *Dor.16 But based on the assumption that *Lab and *Dor may be equivalent and that their ranking is determined by language-specific choice (Rice 2003: 418), we can posit *Dor >> *Lab so that it fits the facts of Japanese PNS. Thus the overall ranking among the three M concerning PoA would be *Dor >> *Lab >> *Cor. The other harmonic scale is SIMPLEX ≻ COMPLEX, i.e., a simple segment is more harmonic than a complex one, which would roughly lead to *Complex >> *Simplex. Since PNS are structurally complex, the constraints against PNS would have to be ranked above the constraints against simple nasals, as below. (The constraint at the top is the most dominant among those in the same box.)

(11)

*PNS            *Nasal
*Ng             *N
*mb      >>     *m
*nd             *n

Once the M hierarchy is fixed in this way, the only rerankable constraint would necessarily be F. This assumption allows us to capture so-called harmonic completeness. The definition of harmonic completeness is given below.

(12) Harmonic completeness (Prince & Smolensky 1993; Prince 1998)
Let S be a system and α, β elements that are markedness-wise comparable, with α ≻ β. Then, if S contains β, it must also contain α: (β ∈ S & α ≻ β) ⊃ α ∈ S

The output inventory of Japanese PNS is harmonically complete in this sense.

(13) a. Harmonically complete systems: {mb, nd, N}, {nd, N}, {N}, { }
b. Harmonically incomplete systems: {mb, nd}, {mb, N}, {mb}, {nd}

We already observed the regularity of harmonically complete systems, in terms of the implicational relation in the geographical and the chronological continuum. None of the harmonically incomplete systems was found in either continuum. Harmonically complete patterns are very common cross-linguistically, and the F reranking model appropriately captures many hierarchical inclusions between areas of constraint activity in the phonological lexicon. Given the fixed M ranking M1 >> M2 >> M3 with rerankable F, it follows that items observing M1 may also observe M2, but not vice versa; items observing M2 may also observe M3, but not vice versa. Given the hierarchy *Ng >> *mb >> *nd >> *N for Japanese segment inventories, the hierarchical inclusion can also be expressed visually with a “constraint domain map” (Itô & Mester 1995a–b), as below. The allowable segments are shown in { }.

(14) a. [Figure: constraint domain map with nested domains, from outermost to innermost *Ng, *mb, *nd, *N. System A: {mb, nd, N}; system B: {nd, N}; system C: {N}; system D: { }.]
b. [Figure: scale running from “more marked” (implicans) to “less marked” (implicatum).]

As we go toward the periphery, more structures are allowed, while as we go inward, the allowable structures are reduced because more constraints must be observed. Based on the observation in the preceding sections, we could state that system A falls outside of the circle of any domain; system B is inside the domain of *mb; system C is inside the domain of *nd, and system D is in the innermost circle of *N. From a historical perspective, Japanese starts from the outer circle, allowing the full range of instantiations of PNS, and as time goes by, the system goes inward, eliminating marked segments gradually. This seems to be intuitively right, in the sense that the historical shift proceeds toward the less marked structure. Such a direction could be expressed as an inward shift or a shift from implicans to implicatum as shown in (14b).17 Before showing how variation between grammars is accounted for, let us clarify some differences between the constraint domain map developed by Itô and Mester and the one used here. Itô and Mester’s map is about lexicon-internal variation within the grammar of one speaker of current Tokyo Japanese. What I am addressing is variation between grammars in the PNS inventory of speakers across both time and space. But in both models, the emphasis is placed on the observance of the hypotheses in (10a–b).

2.2. Three-dimensional constraint map

Given that a diachronic grammar of a language consists of a series of synchronic grammars, I will present a three-dimensional constraint map. This new image only makes sense if the synchronic grammars are aligned in chronological order. Systems A through D are arranged as vertical planes as if we cut across a tree trunk, and also arranged from bottom to top as if the tree grows, along the line of a diachronic continuum. The shaded area in (15) indicates the inactive constraint domain in each system, while the white area indicates the active constraint domains. As is clear, the inactive domain is minimized step by step.18 In system A, as OJ is in the outermost ellipse, the inner constraint domain *mb does not have any force, so that {mb, nd, N} is possible. In system B, domain *mb starts to be active, so only {nd, N} is attained. In system C, the white area encroaches on the domain *nd, so that only {N} is attained. In system D, the white area further reaches *N, making the whole domain active, where PNS and even {N} are not allowed. As the white domain expands, the segment inventory is minimized.

(15) Three dimensional domain map
[Figure: four stacked constraint domain maps, each with the nested domains *Ng, *mb, *nd, *N, ordered from bottom to top as OJ (= A), MJ (= B), ModJ (= C), PJ (= D), with the intermediate stages labelled A+B, B+C and C+D.]

This image shows that every system is harmonically complete, and that the sound change proceeds in a specific direction step by step. Again, the direction of the diachronic change here can be characterised as going from implicans to implicatum.

(16) a. Implicational relation: D ⊂ C ⊂ B ⊂ A
b. Unmarked direction of diachronic change: A → B → C → D

In terms of OT, thanks to the fixed ranking of M, harmonically incomplete systems such as those in (13b) would never be generated. Furthermore, a diachronic prediction can be made. That is, [mb] may be lost before [nd], but [nd] will never be lost before [mb]. Likewise, [nd] may be lost before [N], but [N] will never be lost before [nd]. This prediction holds true for both the synchronic and the diachronic continuum, as seen in the previous sections.
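The claim that a fixed markedness ranking with a single rerankable F yields only harmonically complete inventories can be verified by brute force. The sketch below is an illustration under simplifying assumptions that are ours: a segment is taken to survive exactly when MAX(NAS) dominates the markedness constraint against it, the attested systems are those in which *Ng dominates MAX(NAS), and all function names are hypothetical.

```python
from itertools import permutations

MARKEDNESS = ["*Ng", "*mb", "*nd", "*N"]  # fixed order, highest ranked first
FAITH = "MAX(NAS)"
SEGMENT = {"*Ng": "Ng", "*mb": "mb", "*nd": "nd", "*N": "N"}
HARMONY = ["N", "nd", "mb", "Ng"]         # least marked first

def respects_fixed_markedness(ranking):
    """True if the markedness constraints appear in their fixed relative order."""
    positions = [ranking.index(m) for m in MARKEDNESS]
    return positions == sorted(positions)

def inventory(ranking):
    """Segments whose markedness constraint is dominated by MAX(NAS)."""
    faith_position = ranking.index(FAITH)
    return {SEGMENT[m] for m in MARKEDNESS if ranking.index(m) > faith_position}

def harmonically_complete(segments):
    """Once a segment is missing, every more marked segment must be missing too."""
    present = [s in segments for s in HARMONY]
    return present == sorted(present, reverse=True)

all_rankings = list(permutations(MARKEDNESS + [FAITH]))
admissible = [r for r in all_rankings if respects_fixed_markedness(r)]
attested = [r for r in admissible if r.index("*Ng") < r.index(FAITH)]

print(len(all_rankings), len(admissible), len(attested))
for ranking in attested:
    print(" >> ".join(ranking), sorted(inventory(ranking)))
```

Of the 120 logically possible rankings, only those that keep the markedness constraints in their fixed order survive, and each of them yields a harmonically complete inventory; a set such as {mb, N} is rejected by `harmonically_complete`.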

2.3. Max demotion analysis on prenasal loss

We have seen that if the only rerankable constraint is MAX, there are four possible hierarchies (i.e., the rankings in A, B, C and D shown below). MAX penalizes the deletion of an input segment or feature in the output. Given that all inputs are diachronically old forms (Yamane-Tanaka 2003), candidates which have lost PNS incur violations of MAX(NAS). In this section, I will argue that the rerankability of MAX(NAS) is also not random. If FAITH(NAS) could be reranked freely, then even with only four allowable systems, the orders in which the systems could emerge would add up to 24 (4! = 24; e.g., D → C → B → A, A → C → B → D, C → A → B → D), but this is not the case. In order to derive the correct diachronic change (i.e., A → B → C → D), F has to be demoted minimally within the relevant constraint system. Minimal F demotion thus limits the possible orders to only one, as shown below. (In each ranking, the leftmost constraint is the most highly ranked.)

(17) Possible permutations and direction

System   Ranking                                  Output
A        *Ng >> MAX(NAS) >> *mb >> *nd >> *N      {mb, nd, N}
B        *Ng >> *mb >> MAX(NAS) >> *nd >> *N      {nd, N}
C        *Ng >> *mb >> *nd >> MAX(NAS) >> *N      {N}
D        *Ng >> *mb >> *nd >> *N >> MAX(NAS)      { }
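The counting argument behind (17) can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the list encoding of the markedness hierarchy, the `BANNED` mapping and the helper names are hypothetical conveniences, not part of the analysis itself. It derives the four inventories from the position of MAX(NAS) and contrasts the 24 logically possible chronological orders with the single order that minimal demotion allows.

```python
from itertools import permutations

# Fixed markedness hierarchy from (17), highest-ranked first.
MARKEDNESS = ["*Ng", "*mb", "*nd", "*N"]
# Segment banned by each markedness constraint (illustrative mapping).
BANNED = {"*Ng": "Ng", "*mb": "mb", "*nd": "nd", "*N": "N"}


def inventory(max_position):
    """Segments that survive when MAX(NAS) is inserted at max_position.

    Markedness constraints ranked below MAX(NAS) are inactive, so the
    segments they ban remain in the inventory.
    """
    return [BANNED[m] for m in MARKEDNESS[max_position:]]


# MAX(NAS) just below *Ng gives system A; each one-step demotion
# gives B, C and D in turn.
SYSTEMS = {name: inventory(pos) for name, pos in zip("ABCD", range(1, 5))}

# Free reranking: every chronological order of the four systems.
free_orders = list(permutations(SYSTEMS))


def is_minimal_demotion(order):
    """True if each step moves MAX(NAS) down by exactly one constraint."""
    positions = ["ABCD".index(system) for system in order]
    return all(b - a == 1 for a, b in zip(positions, positions[1:]))


minimal_orders = [o for o in free_orders if is_minimal_demotion(o)]

print(SYSTEMS)
print(len(free_orders), minimal_orders)
```

Under these assumptions, free reranking yields 24 orders, whereas the minimal-demotion filter leaves only A, B, C, D in that sequence.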

As F is ranked higher, more M constraints are rendered inactive (i.e., free from force), so that more PNS are allowed in the system. As F moves downward, more M constraints become active, so that more PNS are removed from the system. The minimal F demotion hypothesis correctly captures the progression in which less marked structures were never eliminated before relatively more marked ones.

Let me discuss some lexical items to see how PNS would be realized in each of systems A–D. The items /kabe/ ‘wall’, /hada/ ‘skin’ and /toge/ ‘thorn’ are synchronically attested as below.

(18) PNS in intervocalic context

       System A   System B   System C   System D
mb     kambe      kabe       kabe       kabe
nd     handa      handa      hada       hada
N      toN Ne     toN Ne     toN Ne     toge

System A has all of {mb, nd, N}, system B has {nd, N}, system C has {N} only, and system D has none of them. Notice that in systems A–C a place asymmetry between velar and non-velar is observed: only [Ng] turns into the simple nasal [N], while [mb] and [nd] turn directly into the voiced obstruents [b] and [d], respectively. Thus two strategies to avoid PNS can be posited:

(19) Timing difference between PNS, simple nasal and simple stop

                     PNS                  (i) Simple nasal    (ii) Simple stop
Velic aperture       |------|         →   |-------------|
Oral constriction    |-------------|      |-------------|     |-------------|

The first strategy is to extend the velic aperture of PNS to create a simple nasal, as in (19i), and the other is to delete the entire velic aperture to create a simple stop, as in (19ii). Both strategies satisfy *PNS, but each of them violates one of the following constraints.

(20) a. DEP(NAS): Velic aperture of the output has an identical correspondent in the input. (No extension of velic aperture)
     b. MAX(NAS): Velic aperture of the input has an identical correspondent in the output. (No deletion of velic aperture)

The strategy in (19i) satisfies MAX(NAS) but violates DEP(NAS), since the velic aperture is extended in the output (e.g., mb → m). On the other hand, the strategy in (19ii) satisfies DEP(NAS) but violates MAX(NAS), since the velic aperture is deleted in the output (e.g., mb → b). As it stands, there is no predictive force to capture the idiosyncrasy of the velar. I therefore propose the following constraint.

(21) MAX(NAS)/VEL: Velic aperture with the oral constriction at velar of the input has an identical correspondent in the output. (No deletion of N)

MAX(NAS)/VEL demands that the velum remain lowered when the oral constriction is formed in the same area (i.e., velar). Thus it is violated if the velum is raised while the constriction is formed at the velar (e.g., Ng → g). This kind of constraint may be phonetically grounded (see note 9),
and is motivated by a theoretical schema of specific-to-general constraint hierarchy. MAX(NAS)/VEL is a special version of MAX(NAS) proposed in (20b): a violation of MAX(NAS)/VEL always entails a violation of MAX(NAS). This kind of partitioning of faithfulness constraints is not new: positional faithfulness (J. Beckman 1997), HEADMAX-BA >> MAX-BA (Kager 1999: sec. 6.4), and faithfulness specific to lexical strata (Itô & Mester 1995, 1999, 2001, 2003) all result from splitting a special constraint off from general faithfulness. The hierarchy of the proposed constraints is determined by the schema in (22a) (Itô & Mester 2003: 165–168), and the constraints interact in two ways, as in (22b–c).

(22) a. FAITH(SPECIAL) >> M >> FAITH(GENERAL)
     b. MAX(NAS)/VEL >> DEP(NAS) >> MAX(NAS): systems A–C
     c. DEP(NAS) >> MAX(NAS)/VEL >> MAX(NAS): system D

According to the schema in (22a), some markedness constraint should intervene between a special F and a general F. DEP(NAS) is commonly regarded as F rather than M in OT, but for the present purpose I only speculate that this constraint could work as M, in the sense that it demands that the oral release not be suppressed by the extended timing of the velic aperture. Importantly, the ranking in (22b) is invariant across systems A–C, where, thanks to the special effect, only Ng turns into N (rather than g). The other ranking, (22c), where no relevant constraint intervenes between the two kinds of MAX, mutes the special effect in system D (thus the velar comes to pattern with the labial and the coronal). This is confirmed below.

(23) a. Relevant constraint (i.e., DEP(NAS)) intervenes: N is optimal.

Systems A, B, C
Input: toNge, toNke | *Ng | MAX(NAS)/VEL | DEP(NAS) | MAX(NAS)
toNge               | *!  |              |          |
toge                |     | *!           |          | *
☞ toNe              |     |              | *        |

     b. No relevant constraint (i.e., DEP(NAS)) intervenes: g is optimal.

System D
Input: toNge, toNke | *Ng | DEP(NAS) | MAX(NAS)/VEL | MAX(NAS)
toNge               | *!  |          |              |
☞ toge              |     |          | *            | *
toNe                |     | *!       |              |

In systems A–C in (23a), where DEP(NAS) intervenes between MAX(NAS)/VEL and MAX(NAS), N is selected as the optimal output. In contrast, in system D in (23b), DEP(NAS) does not intervene, which mutes the special effect, and g is selected. In both (23a) and (23b), the ranking MAX(NAS)/VEL >> MAX(NAS) remains fixed, and only MAX(NAS)/VEL is demoted. The Max demotion analysis thus holds here as well. (The input contains both voiced and voiceless versions of PNS, meaning that either of them would work for the present purpose. This point will be touched on in the next section.)

The overall rankings of constraints in systems A–D are constructed by combining the ranking in (17) with the ranking in (23). The unified rankings are given in (24). The positions of MAX(NAS)/VEL and MAX(NAS) make it clear that in all systems (i) the ranking between them is fixed, (ii) only one of them is reranked at each step, and (iii) the MAX demotion is minimal.

(24) a. Old Japanese / Tohoku: {mb, nd, N}

System A
Input         | Candidate | *Ng | MAX(NAS)/VEL | DEP(NAS) | MAX(NAS) | *mb | *nd
kambe, kampe  | ☞ kambe   |     |              |          |          | *   |
              | kabe      |     |              |          | *!       |     |
              | kame      |     |              | *!       |          |     |
handa, hanta  | ☞ handa   |     |              |          |          |     | *
              | hada      |     |              |          | *!       |     |
              | hana      |     |              | *!       |          |     |
toNge, toNke  | toNge     | *!  |              |          |          |     |
              | toge      |     | *!           |          | *        |     |
              | ☞ toNe    |     |              | *        |          |     |

     b. Middle Japanese / Kochi: {nd, N} – Demotion of MAX(NAS)

System B
Input         | Candidate | *Ng | MAX(NAS)/VEL | DEP(NAS) | *mb | MAX(NAS) | *nd
kambe, kampe  | kambe     |     |              |          | *!  |          |
              | ☞ kabe    |     |              |          |     | *        |
              | kame      |     |              | *!       |     |          |
handa, hanta  | ☞ handa   |     |              |          |     |          | *
              | hada      |     |              |          |     | *!       |
              | hana      |     |              | *!       |     |          |
toNge, toNke  | toNge     | *!  |              |          |     |          |
              | toge      |     | *!           |          |     | *        |
              | ☞ toNe    |     |              | *        |     |          |

     c. Modern Japanese / Tokyo: {N} – Demotion of MAX(NAS)

System C
Input         | Candidate | *Ng | MAX(NAS)/VEL | DEP(NAS) | *mb | *nd | MAX(NAS)
kambe, kampe  | kambe     |     |              |          | *!  |     |
              | ☞ kabe    |     |              |          |     |     | *
              | kame      |     |              | *!       |     |     |
handa, hanta  | handa     |     |              |          |     | *!  |
              | ☞ hada    |     |              |          |     |     | *
              | hana      |     |              | *!       |     |     |
toNge, toNke  | toNge     | *!  |              |          |     |     |
              | toge      |     | *!           |          |     |     | *
              | ☞ toNe    |     |              | *        |     |     |

     d. Present-day J / younger generation in part of Tokyo: { } – Demotion of MAX(NAS)/VEL

System D
Input         | Candidate | *Ng | DEP(NAS) | MAX(NAS)/VEL | *mb | *nd | MAX(NAS)
kambe, kampe  | kambe     |     |          |              | *!  |     |
              | ☞ kabe    |     |          |              |     |     | *
              | kame      |     | *!       |              |     |     |
handa, hanta  | handa     |     |          |              |     | *!  |
              | ☞ hada    |     |          |              |     |     | *
              | hana      |     | *!       |              |     |     |
toNge, toNke  | toNge     | *!  |          |              |     |     |
              | ☞ toge    |     |          | *            |     |     | *
              | toNe      |     | *!       |              |     |     |
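The evaluation in (24) can be checked mechanically. The following Python sketch is an illustration only: the dictionary encoding of candidates, the function name `optimal` and the use of the voiced input as a key are assumptions introduced here, while the violation marks and rankings are read off the tableaux. It implements strict domination by comparing violation profiles lexicographically in ranking order.

```python
# Violation marks per candidate, read off the tableaux in (24).
# Keys are the voiced version of each input (an illustrative choice).
CANDIDATES = {
    "kambe": {"kambe": ["*mb"], "kabe": ["MAX(NAS)"], "kame": ["DEP(NAS)"]},
    "handa": {"handa": ["*nd"], "hada": ["MAX(NAS)"], "hana": ["DEP(NAS)"]},
    "toNge": {
        "toNge": ["*Ng"],
        "toge": ["MAX(NAS)/VEL", "MAX(NAS)"],
        "toNe": ["DEP(NAS)"],
    },
}

# Unified rankings of systems A-D, highest-ranked constraint first.
RANKINGS = {
    "A": ["*Ng", "MAX(NAS)/VEL", "DEP(NAS)", "MAX(NAS)", "*mb", "*nd"],
    "B": ["*Ng", "MAX(NAS)/VEL", "DEP(NAS)", "*mb", "MAX(NAS)", "*nd"],
    "C": ["*Ng", "MAX(NAS)/VEL", "DEP(NAS)", "*mb", "*nd", "MAX(NAS)"],
    "D": ["*Ng", "DEP(NAS)", "MAX(NAS)/VEL", "*mb", "*nd", "MAX(NAS)"],
}


def optimal(ranking, candidates):
    """Return the candidate whose violation profile is best under ranking.

    Profiles are tuples of violation counts in ranking order, so tuple
    comparison mirrors strict domination.
    """
    def profile(marks):
        return tuple(marks.count(constraint) for constraint in ranking)

    return min(candidates, key=lambda cand: profile(candidates[cand]))


WINNERS = {
    system: {inp: optimal(ranking, cands) for inp, cands in CANDIDATES.items()}
    for system, ranking in RANKINGS.items()
}

for system, outputs in WINNERS.items():
    print(system, outputs)
```

On this encoding the winners match the pointing hands in (24a–d): only the position of MAX(NAS), and in system D that of MAX(NAS)/VEL, changes the outcome.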

System A in (24a), such as Old Japanese or the Tohoku dialects, ranks MAX(NAS) below *Ng and above *mb, thus {mb, nd, N} is correctly realized. System B in (24b), such as Middle Japanese or the Kochi dialect, demotes MAX(NAS) below *mb, thus {nd, N} is correctly realized. System C in (24c), such as Modern Japanese or the Tokyo dialect, demotes MAX(NAS) further below *nd, thus {N} is correctly realized. System D in (24d), such as the speech of the younger generation in part of Tokyo, ranks MAX(NAS)/VEL below DEP(NAS), thus neither PNS nor [N] is allowed in this system. All systems are captured by the reranking of MAX only. It was also ascertained that its minimal demotion matches the chronological order in which the systems appear. Once the M hierarchy is determined as above, the predictions below follow.

(25) a. Synchronic prediction about the loss of PNS:
        (i) N may appear as frequently or more frequently than nd.
        (ii) nd may appear as frequently or more frequently than mb.
     b. Diachronic prediction about the loss of PNS:
        (i) mb may be lost before nd, but nd will never be lost before mb.
        (ii) nd may be lost before N, but N will never be lost before nd.

Observations (6)–(8) would be sufficient to claim that the predictions in (25) are true of Japanese PNS.
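The implicational content of (25) can also be stated as a simple check on inventories. The sketch below is a hypothetical illustration: the list `SCALE` and the function `is_harmonically_complete` are names introduced here for convenience. It enumerates all eight logically possible inventories over {mb, nd, N} and keeps those in which the presence of a more marked segment implies the presence of every less marked one.

```python
from itertools import combinations

# PNS scale from (25), most marked first. The list is an illustrative
# encoding, not part of the original analysis.
SCALE = ["mb", "nd", "N"]


def is_harmonically_complete(inventory):
    """True if every segment present implies all less marked segments."""
    present = [segment in inventory for segment in SCALE]
    # Absent segments (False) must all precede present ones (True)
    # along the scale, i.e. the list must already be sorted.
    return present == sorted(present)


# All 8 logically possible inventories over the three segments.
ALL_INVENTORIES = [
    set(combo)
    for size in range(len(SCALE) + 1)
    for combo in combinations(SCALE, size)
]

COMPLETE = [inv for inv in ALL_INVENTORIES if is_harmonically_complete(inv)]

for inv in COMPLETE:
    print(sorted(inv))
```

Only four inventories pass the check, and they correspond to systems D, C, B and A; a gapped inventory such as {mb, N} is rejected.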

3. Prenasal loss induces voice contrast?

3.1. Prenasal/voice controversy

This section discusses how prenasal loss is related to the history of voicing in Japanese. It is widely accepted that the feature voice has been distinctive throughout the history of Japanese (for example, see Itô & Mester 2003: 211–212), but there is another view that only prenasal was distinctive in early Japanese (Wenck 1959; Hayata 1977a,b; T. Takayama 1993; Hamano 2000). On the former view, prenasalization in OJ would be considered allophonic. Supporters of this view may agree with the idea that prenasalization could be a phonetic mechanism for facilitating the expression of voicing on a stop.19 This functional reasoning may be plausible in terms of phonetics, but its theoretical consequences raise questions for the phonology. Why do current prenasalization dialects still have minimal pairs such as hada ‘flag’ vs. handa ‘skin’, where prenasal cannot be regarded as allophonic? Why do
current non-prenasalization dialects show minimal pairs such as hata ‘flag’ vs. hada ‘skin’, where only voice should be distinctive? And how are these dialects related to each other? If we accept the view that the prenasal was distinctive rather than allophonic in OJ and was lost gradually, these questions can be answered. The prenasalization dialects are remnants of the past system where prenasal was distinctive, while the non-prenasalization dialects emerged from it through a diachronic chain shift involving prenasal, voiced and voiceless stops. Both sets of dialects are attested along the geographical and chronological continuums, as shown previously. I do not claim that the OT analysis in the previous section lends strong support to the prenasal-distinctive view, but it does at least call into question the view that the prenasal in OJ is merely allophonic, suggesting the possibility that the contrastive feature of stop consonants shifted from prenasal to voice in a systematic way. To put it another way, prenasal loss may have gradually induced the voice contrast from MJ onward, a contrast which was not observed in early OJ. The scenario I intend to show here is that the voice contrast at one PoA did not appear until the prenasal at the same PoA was lost. This is summarized below.

(26)
       | Voiceless stops | PNS        | Voiced stops
OJ     | p, t, k         | mb, nd, N  |
MJ     | p, t, k         | nd, N      | b
ModJ   | p, t, k         | N          | b, d
PJ     | p, t, k         |            | b, d, g

The only thing that is certain here is that PJ has the voice contrast without prenasal; the segmental distribution in the other eras is hypothetical. The question is how PJ attained such a system. Imagine that when one prenasalized stop is lost, it is replaced by the plain voiced stop; say, when nd is lost, it is replaced by d. Then the original d may become t in order to avoid merging with the new d. Given that such a diachronic chain shift proceeds through labial, coronal and velar in this order, we can further assume that the voice contrast did not emerge at all PoAs simultaneously. In other words, in early OJ the contrastive feature pertaining to stops was prenasal, but the contrastive feature may have shifted from prenasal to voice gradually from MJ through PJ.

This line of thought agrees with other assumptions shown below. Hayata (1977) holds that OJ intervocalic stops were phonetically voiced, so that the contrast on voiced stops was one of prenasal rather than voice, and argues that this gives a unified account of two consonantal changes which were previously considered to be unrelated: what was considered the shift of the voiceless labial fricative to the labial glide (i.e., the ha-line shift) may actually be a shift from the voiced labial fricative, and what was considered the deletion of voiceless velar stops (i.e., i-onbin or u-onbin) may be a shift from the voiced velar fricative to the palatal glide with its subsequent deletion. M. Takayama (1992a) agrees with this, arguing that MJ ceased postnasal voicing when voice started to play a distinctive role; voice in OJ must not have been contrastive, as in other languages that have postnasal voicing. Moreover, Hamano (2000) argues that the prenasal became redundant on labials historically earlier than on nonlabials, on the grounds that the distribution of intervocalic stops in the sound-symbolic stratum shows an asymmetrical pattern: nonlabial stops are predominantly voiceless but labial stops are predominantly voiced.20

Also, the UCLA Phonological Segment Inventory Database (UPSID) (Maddieson 1984) tells us that the markedness scale may vary according to the manner of articulation.21 A typological survey of the distribution of voiced stops suggests the PoA markedness hierarchy below (see also Itô & Mester 1999).

(27) *g >> *d >> *b

Interestingly, the ranking in (27) is the mirror image of the ranking on PNS (*mb >> *nd >> *N) shown in the previous sections; labial is most marked on the PNS scale, while it is least marked on the voiced stop scale. Nonetheless, they are two sides of the same coin: the ranking of PNS states that mb is most likely to be missing from the segment inventory, and the ranking of the voiced stops states that b is most likely to be present. These implications strongly suggest that mb is the first target of loss, subsequently shifting to b. In short, the two different rankings support the assumption that the appearance of a voice contrast proceeded through labial, coronal and dorsal in this order. Some physiological and aerodynamic reasons could also support a PoA effect on voicing. Hayes and Steriade (2004) accept hierarchy (27), on the grounds that vocal fold vibration is more difficult to sustain the smaller the cavity behind the oral constriction is. (This account is not entirely new; see also note 9.)


3.2. Gapped Inventory

The discussion so far suggests that sound changes may not affect every segment in the same feature class all at once. This would lead us to the view that a synchronic segmental inventory may be gapped during the course of sound change. This may run against the hypothesis that segmental modifications such as prenasalization, aspiration and labialization occur simultaneously on a natural class of segments, rather than on one individual segment (Hinskens & van de Weijer 2003). What I mean to show here, however, is that even if diachronic changes caused some stages to have gapped inventories, harmonically incomplete systems such as those in (13b) are rarely attested: 46 out of 47 regional dialects of Japanese fit into the harmonically complete systems in (13a), the only exception being the Tokushima dialect (cf. Appendix 3: No. 37).22 Gapping that respects harmonic completeness may be the unmarked case, but at the same time we should bear in mind that other types of gapped inventory do exist, which call for a different explanation (for example, de Lacy 2002, Mielke 2004).

3.3. Complexity in old systems

PNS may be marked compared to simple (voiced) stops in terms of segmental complexity, so it is reasonable to suppose that its loss is a natural direction of change. An interesting question, however, is whether it is fair to assume that such complexity is found only in the old system. One answer would be that every synchronic system has some marked trait, which may differ from system to system. Compare the overall underlying consonant inventories for OJ and PJ:

(28) a. OJ: prenasal is contrastive 23

OJ                  | Labial | Alveolar | Palatal | Velar
Obs   Stops         | p      | t        |         | k
                    | mp     | nt       |         | Nk
      Fricatives    |        | s        |         |
                    |        | ns       |         |
Son   Nasals        | m      | n        |         |
      Liquids       |        | r        |         |
      Semivowels    | w      |          | j       |

     b. PJ: voice is contrastive 24

PJ                  | Labial | Alveolar | Palatal | Velar | Glottal
Obs   Stops         | p      | t        |         | k     |
                    | b      | d        |         | g     |
      Fricatives    |        | s        |         |       | h
                    |        | z        |         |       |
      Affricate     |        | c        |         |       |
Son   Nasals        | m      | n        |         |       |
      Liquids       |        | r        |         |       |
      Semivowels    | w      |          | j       |       |

Suppose that in the OJ inventory all obstruents are systematically voiceless, while all sonorants are systematically voiced. This pattern is observed in many languages (e.g. Pulleyblank 1997: 77–85) and is phonetically grounded, in that it is more difficult to maintain voicing when the oral constriction is greater. Archangeli & Pulleyblank (1994) and Pulleyblank (1997, 2003) characterize this state of unmarkedness in terms of feature cooccurrence restrictions. The constraint governing the feature cooccurrence relevant here is represented as OBSVOI in OT terms, and it interacts with FAITH(VOI).

(29) a. OBSVOI: Obstruents should be voiceless.
     b. FAITH(VOI): Voice should be identical in the input and the output. (No change of voice)

These constraints can interact in the following two ways (Pulleyblank 1997: 79–80).

(30) a. OBSVOI >> FAITH(VOI) (i.e., Voiced obstruents are excluded from the inventory)
     b. FAITH(VOI) >> OBSVOI (i.e., Voiced obstruents are attested in the inventory)

The ranking in (30a) forces all surface obstruents to be voiceless, while the ranking in (30b) forces voice in the input and output to be identical. This is summarized below.

(31) a. Languages with a voice contrast (= 30b)

Input | Candidate | FAITH(VOI) | OBSVOI
TA    | ☞ TA      |            |
      | DA        | *!         | *
DA    | ☞ DA      |            | *
      | TA        | *!         |

     b. Languages with no voice contrast (= 30a)

Input | Candidate | OBSVOI | FAITH(VOI)
TA    | ☞ TA      |        |
      | DA        | *!     | *
DA    | ☞ TA      |        | *
      | DA        | *!     |

(T = p, t, k; D = b, d, g; A = vowel)

Recall that as far as word-initial position is concerned, OJ does not allow voiced obstruents, while PJ does. If OJ can be categorized as type (30a) and PJ as type (30b), then the asymmetry in voice contrast between OJ and PJ is already captured in (31). The hierarchies in (30) should be compared to those given below, which can be developed from the hierarchy in (17).

(32) a. Early J: MAX(NAS) >> *PNS (i.e., Prenasal obstruents are attested in the inventory)
     b. ModJ: *PNS >> MAX(NAS) (i.e., Prenasal obstruents are excluded from the inventory)

Since in early J it is more important to retain nasality than to avoid PNS, PNS can surface in the output. In PJ, however, it is more important to avoid PNS than to retain nasality, so underlying PNS cannot surface in the output. It is easy to see that OJ (or early J) and PJ (or ModJ) show different kinds of markedness. The former is marked in segmental complexity, but unmarked in inventory. The latter, on the contrary, is marked in inventory, but unmarked in segmental complexity. As is clear from the trade-off in markedness shown in (30) and (32), ModJ seems to have reduced segmental complexity at the cost of sacrificing the unmarked status of obstruents.

3.4. Voicing in Old Japanese

The view that OJ has an all-voiceless series of obstruents, with only prenasal distinctive, not only answers the question about the fairness of the segmental complexity in the way above, but also captures how PNS as well as plain voiceless stops are voiced in the output. The important idea underlying this stance is that it is voicing rather than prenasalization which is allophonic in OJ, in that the distribution of voicing is predictable and complementary: a voiced stop occurs after a nasal or a vowel, and a voiceless stop occurs elsewhere. Such allophonic variation arises under the following schema (Pulleyblank 1997: 85).

(33) SPECIAL CONDITION ON FT >> *FT >> FAITH[FT] (FT = Feature)

SPECIAL CONDITION ON FT is a context-specific M on a certain feature, *FT is a context-free M on that feature, and FAITH[FT] is an F on that feature. In OJ, SPECIAL CONDITION ON FT would be POSTSONVOICE (as for the role of SONVOI, see Itô, Mester & Padgett 1995).

(34) POSTSONVOICE: Obstruents after a vowel or a nasal should be voiced.

This belongs to the constraint family of positional markedness, which forces a certain feature to surface in some specific context. POSTSONVOICE allows voice to surface on an obstruent when it follows a vowel or a nasal. (For a different view of the voice contrast in OJ, see Itô & Mester 2003: 211–212.) The schema in (33) is then instantiated as POSTSONVOICE >> OBSVOI >> FAITH(VOI). This constraint hierarchy can be interpreted as ‘feature voice does not occur in the inventory because of OBSVOI >> FAITH(VOI), but the allophonic voicing occurs after a nasal as well as a vowel because ICC[VOICE] (i.e., POSTSONVOICE) overrides the general prohibition’. Here is the summary.

(35)

OJ
Input obstruent | Candidate | POSTSONVOICE | OBSVOI | FAITH(VOI)
TA              | AᴺTA      | *!           |        |
                | ☞ AᴺDA    |              | *      | *
DA              | AᴺTA      | *!           |        | *
                | ☞ AᴺDA    |              | *      |
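The effect of the hierarchy in (35) can be illustrated with a short sketch. It rests on simplifying assumptions that are not in the original: forms are plain strings over T, D and A as in (31), the post-sonorant context is represented by a preceding vowel only (ATA, ADA), and the function names are hypothetical. With the OJ ranking, voicing is predictable from context whatever the input; with FAITH(VOI) ranked above OBSVOI, as in (30b), the input voicing survives.

```python
RANKING_OJ = ["POSTSONVOICE", "OBSVOI", "FAITH(VOI)"]   # schema (33)
RANKING_CONTRAST = ["FAITH(VOI)", "OBSVOI"]             # ranking (30b)


def marks(inp, out):
    """Violation counts for one input-output pair.

    Forms are 'TA'/'DA' (word-initial) or 'ATA'/'ADA' (after a vowel);
    the obstruent is always the second-to-last symbol.
    """
    obstruent_in, obstruent_out = inp[-2], out[-2]
    after_sonorant = len(out) == 3
    return {
        "POSTSONVOICE": int(after_sonorant and obstruent_out == "T"),
        "OBSVOI": int(obstruent_out == "D"),
        "FAITH(VOI)": int(obstruent_in != obstruent_out),
    }


def optimal(inp, ranking):
    """Pick the voiceless or voiced candidate that best satisfies ranking."""
    candidates = [inp[:-2] + obstruent + "A" for obstruent in "TD"]
    return min(
        candidates,
        key=lambda out: tuple(marks(inp, out)[c] for c in ranking),
    )


for form in ["ATA", "ADA", "TA", "DA"]:
    print(form, "->", optimal(form, RANKING_OJ))
```

Under these assumptions both ATA and ADA surface as ADA in the OJ ranking, while word-initial TA and DA both surface as TA, which is the complementary distribution described above.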


Whether the input is voiceless or voiced, the obstruents in this environment are consistently voiced in the output. The outcome attained here is consistent with the one attained in tableau (24a). (The ranking here may then be placed above the one in (24a).) It should also be noted that POSTSONVOICE applies to the sequence of a moraic nasal and a voiceless stop (thus ANTA → ANDA), as well as to the sequence of a vowel and a voiceless stop (thus ATA → ADA). It is known that postnasal voicing was productive in OJ. This calls for some explanation, because it suggests that the feature voice, despite its noncontrastive status, played an active role in postnasal voicing, contrary to the standard prediction. Pulleyblank (2003), however, shows that both ‘overt’ features and ‘covert’ features can trigger and be targeted by phonological phenomena, and that they are the basis of phonological constraints. This hypothesis makes it more plausible that noncontrastive voice could be targeted by postnasal voicing in OJ. What happened later is, in the words of M. Takayama (2002), “When the feature [pre]nasal disappeared from the distinctive feature set in central dialects, postnasal voicing necessarily came to cease. … given that the distinctive features for obstruents began to shift in Middle Japanese, some phonological changes would be offered a more reasonable and principled explanation (translated by N.Y-T).” Thus, the voice contrast may have historically emerged through the loss of prenasalized stops.

4. Conclusion

We have discussed the connection between the loss of PNS and the history of voice in obstruents. Based on the observation that the loss of PNS proceeded along the harmonic scale of PoA, I gave a principled account in terms of the minimal demotion of MAX(NAS). It was suggested that this demotion may indicate the historical emergence of the voice contrast. The parallelism between synchronic variation and diachronic change is attested in the form of an implicational relation. Among the 47 regional dialects surveyed here, 46 systems involving PNS fit into the factorial typology that emerges from the fixed markedness hierarchy with rerankable MAX(NAS); the only exception is the Tokushima dialect (cf. Appendix 3; No. 37). Historical surveys demonstrate that the possible grammatical change proceeds from implicans to implicatum, which is mirrored in the demotion of MAX(NAS).

The empirical and theoretical observations suggest that as the system gradually loses PNS from the segment inventory, voice contrasts come to play a more important role. The analysis presented here is in accordance with the assumptions that the prenasal became redundant on labials earlier than on nonlabials (Hamano 2000), and that voice started to play a distinctive role in Middle Japanese (M. Takayama 2002). I hope this study will serve to shed light on the historical shift in the status of voice in Japanese.

Acknowledgements

Part of this article was first read at the workshop ‘Voicing in Japanese’ on Linguistics and Phonetics (LP) held at Meikai University on September 3, 2002. I am deeply indebted to Kensuke Nanjo, Tetsuo Nishihara and Jeroen van de Weijer, who organized this project and edited this book. Other presentations of this research include those at the meeting of the Tokyo Circle of Phonologists (TCP) at Seikei University on May 25, 2003, and at informal research meetings as well as a presentation session in LING 507 at the University of British Columbia from September to December, 2003. I express my profound gratitude to Sonya Bird, Atsushi Fujimori, Shosuke Haraguchi, Ayako Hashimoto, Takeru Honma, Junko Itô, Itsue Kawagoe, Masahiko Komatsu, Haruo Kubozono, Masao Okazaki, Ruangjaroon, Keiichiro Suzuki, Timothy Vance, Ian Wilson and my classmates, who provided me with insightful comments and discussions. Special thanks go to Linda Lombardi, Kan Sasaki, Michiaki Takayama and Tomoaki Takayama, who kindly sent me various important materials related to the issues I am interested in. I am also grateful to Hiroyuki Maeda, who wrote a critique of an earlier version of this paper (Maeda 2004). Last but not least, I would like to express my deepest thanks to Gunnar Ólafur Hansson, Douglas Pulleyblank, Shin-ichi Tanaka, Jeroen van de Weijer and anonymous reviewers, who read earlier versions of this paper and made valuable comments and suggestions. The research I have conducted since September 2003 is supported by SSHRC Standard Research Grant [#410-2002-0041], which was awarded to Douglas Pulleyblank. I would like to dedicate this paper to the memory of my father and my father-in-law.


Notes

1. According to the dialectal division of Tôjô (1954), Tohoku dialects consist of Aomori, Iwate, Akita, Miyagi, Yamagata, Fukushima and North of Niigata.

2. For an OT analysis of the chain shift, see Yamane-Tanaka (2003).

3. /p/ didn’t participate in voicing. /p/ turned into [] or [h] word-initially, or turned into [w] or was deleted word-medially (F. Inoue 2000: 421).

4. As an anonymous reviewer pointed out, some readers may wonder if PNS in this dialect contrasts with NC clusters intervocalically. It seems cross-linguistically rare for NC clusters and PNS to contrast (Maddieson & Ladefoged 1993). Also, from the point of view that the Tohoku dialect is a “syllabeme dialect” (Shibata 1962: 140–141), where light and heavy syllables do not show a weight contrast, it may be hard to believe there is a distinction. However, there are several reasons to believe so. First, this dialect has minimal pairs showing this contrast. For example, /saᵐba/ ‘mackerel’ vs. /samba/ ‘midwife’, /haⁿda/ ‘skin’ vs. /handa/ ‘solder’, and /kaN N o/ ‘basket’ vs. /kaN N No/ ‘nursing’. Second, the moraic nasal has a longer duration than prenasals. So far I have not obtained any phonetic measurement data showing the durational contrast between the prenasal (e.g., /ᵐ/ in /..ᵐb../) and the moraic nasal at the same place of articulation (e.g., /m/ in /..mb../), but according to Oohashi (2002: 210–215), the duration of the nasal murmur was 130 ms. for /N/ (in /teNki/ ‘weather’), which contrasted with 36 ms. for /ᵐ/ (in /oᵐbi/ ‘kimono belt’) and 100 ms. for /ⁿ/ (in /haⁿda/ ‘skin’). Third, the duration of the moraic nasal is longer compared to the second element of long vowels or geminates (Oohashi 2002: 317–349).

5. Generally, PNS only appear in Yamato items, not in Sino-Japanese (SJ), Mimetics and Foreign items. There are also phonetic environments which prohibit PNS: (a) V1. _ V2 : where V1 = long, (b) C1V1. _ V2 : where C1 = [-voice], V1 = [+high], V2 = [-high], (c) V1. _ V2. C3V3 : where V2 = [+high], C3 = [-voice], V3 = [-high], (d) after nasal. Nonetheless, some prenasalized items in SJ as well as Yamato are found in these environments.

   Environments             | mb                                | nd                                | N
   a. VV _                  | [koːmboːdaisi] ‘Saint Kobo (SJ)’  | [koːndoː] ‘activity, hall (SJ)’   | [doːN N u] ‘tool (SJ)’
   b. C-voi V+hi . _ V-hi   | none                              | [unde] ‘writing brush (Yamato)’   | [çiN N aN] ‘equinoctial week (SJ)’
   c. _V+hi . C-voi V-hi    | none                              | none                              | [miN N ite] ‘right hand (Yamato)’
   d. /N/ _                 | none                              | none                              | [riNN N o] ‘apple (SJ)’

   (Based on F. Inoue 2000: 361)

150 Noriko Yamane-Tanaka

6.

7.

8. 9.

10.

Interestingly, [N] can appear in all environments (a-d), [nd] in limited environments (a, b), and [mb] in only one environment (a). This observation also seems to match the assumption that [mb] is least likely to survive. In this paper the vowel phonetic symbols are represented by simplified 5 vowels [a, i, u, e, o] rather than the narrow transcription. In Aomori dialect, /u/ is [μ_] (unrounded and centralized), which is common to the most dialects in Eastern Japan. Tohoku dialects in general have no contrast between /i/ and /e/, merged as [e≥] (raised). /..di/ and /..du/ are left out from the column of the examples, because they are exclusively observed in Foreign items (e.g., [torendi˘] ‘trendy’, [andu˘towa] ‘un, deux, trois (Fr)’) at present. Relevant to /di/ or /du/ in Yamato items, there is something worth noting. It is assumed that historically /di/ vs. /zi/ were pronounced respectively as [di] vs. [Zi], and /du/ vs. /zu/ were as [du] vs. [zu]. Such a distinction is still kept in only some areas in Kochi, well known as “yotsugana dialects” [dialects with four different ways of pronounciation for four kana letters] (e.g., [uZi] for /fuzi/ ‘name of area or person’, [undi] for /fudi/ ‘wisteria’, [kuzu] for /kuzu/ ‘arrowroot’ vs. [kundu] for /kudu/ ‘trash’). In contrast, the North of Tohoku areas neutralize such four forms into [dzï] ([ï] = centralized [i]), which is characterized as “hitotsugana dialects” [dialects with only one way of pronunciation for four kana letters] or “dzii dzii dialects”. This dialect also neutralizes /ti/ and /tu/ into [tsï], which turns up as [dzï] intervocalically. Thus, /tizi/ ‘governor’ and /tizu/ ‘map’ are both realized as [tSindzï], and /titi/ ‘milk’ and /tuti/ ‘soil’ are both [tsïdzï] (see R. Sato 2002). As for the structurally complex segments, see van de Weijer (1996). As for the markedness ranking of features, see sec. 2.1. 
The other way of adjustment is turning it into simple voiced stops (i.e., [g]), as it has happened to [mb] and [nd] in younger generation. But only [Ng] didn’t show this option. It might be ascribed to Boyle’s Law: voicing is difficult to maintain when the supraglottal cavity is small (Ohala 1983; Vance 1987). McCarthy & Prince (1995: 353) express this effect in terms of a constraint POSTVCLS “posterior stops (i.e., velars) be voiceless”. In their system treating [g] ~ [N] alternation in Tokyo Japanese, *[N] >> POSTVCLS >> IDENT-IO(NAS) turns [k] to [g] only where [N] can’t. Itô & Mester (1997) postulates *g, which can be interpreted as ‘trigger-constraint’, producing the impetus for nasalization of [g] into [N] to occur (Kager 1999: 241). Such an avoidance of [g] may be aerodynamically or physiologically grounded, as the oral constriction at the velar could lower the velum so easily that the airflow may be let out of the nasal cavity. It may be worth exploring in terms of Grounding Theory (Archangeli & Pulleyblank 1994). From the viewpoint of the long-term sound changes, this change can be taken as the first stage of the consonant shift; [Ng] > [N] > [g] > [F]. As for the labial series, [mb] > [b] > [B] is assumed. The change in the coronal series is divided


into two: [nd] > [d] before nonhigh vowels, and [nd] > [ndz] > [dz] > [z] before high vowels (see T. Takayama 1993). It should be noted that velars had [N] before reaching [g], while labials and coronals did not have [m] or [n], respectively, at any stage of the consonant shift.
11. [Ng] and [N] may not stand in phonemic distinction, so the latter is parenthesized. However, positing the two systems of A1 and A2 seems to cover the observation so far. Thanks to Gunnar Ólafur Hansson for this suggestion.
12. This may be similar to the wave theory in the European tradition of dialectology. However, “Yanagita’s theory lays more emphasis on the aspect that the spread of newer forms takes place in a circular pattern just like ripples with its center located in the cultural center” (Shibatani 1990: 201).
13. This discussion is developed based on the following chronological division (cf. Miller 1967: ch. 1).

Division          Centuries      Rough Correspondence of Eras
Old J             – 8c.          Asuka, Nara
Late Old J        9c. – 12c.     Heian
Middle J          13c. – 16c.    Kamakura, Muromachi, Sengoku, Azuchi-Momoyama
Early Modern J    17c. – 1868    Edo
Modern J          1868 –         Meiji, Taisho, Showa, Heisei

14. We focus on the prenasalization of voiced stops, and therefore do not treat [ndz] here. However, [ndz] likewise underwent loss of PNS, and subsequently merged with [z] around the 16th through 17th century. For details, see T. Takayama (1993).
15. For another OT analysis treating dialectal differences of consonant voicing, see Nishihara (2002).
16. For references, see de Lacy (2002: 193–194). In fact, not only the ranking between *Lab and *Dor, but also the whole ranking has been a matter of debate. See Hume & Tserdanelis (2002). For a restriction on M, see de Lacy (2004).
17. The relation between implicans and implicata seems to show the basic pattern of phonological asymmetries. In the pattern of consonant harmony in acquisition (Pater & Werle 2001), implicans and implicatum are both instantiated in the early stage, but only implicata are observed in the later stage (e.g. coronals are targets of harmony as frequently as or more frequently than non-coronals).
18. Anttila and Cho (1998) divide the systems into invariant and variable systems. Variable systems are expressed as combinations of two invariant systems, such as A+B, B+C, and C+D. Thus, 7 systems will be logically possible. Since we assume that diachronic change originates in synchronic variation, we must allow for those variable systems.
19. Thanks to Douglas Pulleyblank for raising this issue.
20. See Nasu (1999) for more data with an emphasis on the markedness of [p].

21. As for the scale of voiceless stops, *p >> *k >> *t could be posited based on the typological investigation of segment inventories (Itô & Mester 1997, 2003). Hayes & Steriade (2004: 26) state that “[p] is the most difficult obstruent to keep voiceless (particularly in voicing-prone environments, such as intervocalic position)”.
22. I cross-referenced the linguistic map of M. Takayama (2002) with the original map in Hirayama et al. (1992). Hokkaido was not included in the original map, but it is categorized into system C (cf. Kamei et al. 1997: 280). I marked three kinds of checks according to the area size.
a. ✓: observed in all of the area
b. ✓+: observed in roughly more than half of the area
c. ✓–: attested in roughly less than half of the area
23. This is modified from the inventory of F. Inoue (2000: 431–432). The original versions of prenasal stops were all voiced, with a note that it would be safe to say that they are all voiceless. See also Hamano (2002: 209).
24. This is from Kamei et al. (1997: 216).

Appendix 1: Regional division (i.e., Todôfuken) in Japan

[Map not reproduced.]

Appendix 2: Geographical distribution of intervocalic prenasalized stops in Japan

[Maps not reproduced. Panel labels: [1] [N, g, Ng]; [2] [Nd]; [3] [mb].]

Appendix 3: System information in each prefecture

Prefectures:
1. Hokkaido, 2. Aomori, 3. Iwate, 4. Akita, 5. Miyagi, 6. Yamagata, 7. Fukushima, 8. Ibaraki, 9. Tochigi, 10. Gunma, 11. Chiba, 12. Saitama, 13. Tokyo, 14. Kanagawa, 15. Niigata, 16. Nagano, 17. Toyama, 18. Ishikawa, 19. Fukui, 20. Gifu, 21. Yamanashi, 22. Shizuoka, 23. Aichi, 24. Shiga, 25. Mie, 26. Kyoto, 27. Nara, 28. Osaka, 29. Wakayama, 30. Hyogo, 31. Tottori, 32. Shimane, 33. Okayama, 34. Hiroshima, 35. Yamaguchi, 36. Kagawa, 37. Tokushima, 38. Ehime, 39. Kochi, 40. Fukuoka, 41. Oita, 42. Saga, 43. Nagasaki, 44. Kumamoto, 45. Miyazaki, 46. Kagoshima, 47. Okinawa

[Table body not recoverable from the source layout. Column heads as extracted: N, g, m, n, b, d, N, System. The plus/minus marks and the alignment of system labels with prefectures are lost. System labels in extracted order, prefectures 1–36: C A2 A1 A2 A1 A2 A1 A2 C D C D C A1 A2 C D C A2 C B C B C C D C D; prefectures 37–47: ? A2 B D A2 D A2 C.]

The correlation between accentuation and Rendaku in Japanese surnames: a morphological account

Hideki Zamma

1. Introduction

Although Rendaku (or Sequential Voicing) has been extensively studied in the literature, its relation to accentuation has not. Sugito (1965) is a unique study which, after a limited investigation of personal names ending in the morpheme 田 (‘rice field’), claims that words which undergo Rendaku tend to be accentless. This correlation is intuitively supported, as some researchers follow her on various occasions (cf. H. Sato (1989), Kubozono (1998), Kubozono (this volume), Tanaka (this volume), etc.). This paper examines the extent to which Sugito’s generalization applies in Japanese by investigating other personal names more thoroughly. It will become clear that the correlation in question is observed to some extent, but is not overwhelming. Moreover, it will be shown that obedience or nonobedience to it is lexically determined by the rightmost head morpheme of the name (e.g. ta in Yoko-ta), and further, that each morpheme shows quite diverse behavior in accentuation and Rendaku.

The paper is organized as follows: in the next section we first review the generally held view of the relationship in question, giving a summary of Sugito’s (1965) investigation of names with ta. In Section 3, we investigate other Japanese names to see to what extent the relevant observation applies in general. It soon becomes evident that the pattern differs significantly depending on the last head morpheme in the name; in Section 4 these morpheme-specific patterns are illustrated. Section 5 concludes the paper.

2. Sugito (1965): voicing and accentuation in names with ta

The generalization first given in Sugito (1965), and often intuitively supported, can be summarized as follows:

(1) a. Accented names tend not to undergo Rendaku.
    b. Accentless names tend to undergo Rendaku.

Sugito investigated names which end with the morpheme ta ‘rice field’, one of the most productive morphemes for Japanese surnames, and examined whether ta is subject to this generalization, referred to as Sugito’s Law in this paper. As shown in the examples below, names which conform to this generalization are abundant.

(2) a. non-Rendaku pattern: mostly accented
       Fu’ji-ta, Mo’ri-ta, Shi’ba-ta, Ku’bo-ta, Yo’ko-ta, To’mi-ta, A’ki-ta
    b. Rendaku pattern: mostly accentless
       Yoshi-da, Yama-da, Ike-da, Mae-da, Oka-da, Matsu-da, Wa-da

As exemplified in (3), however, this correlation is not absolute. There are quite a few exceptions.

(3) a. non-Rendaku but accentless
       Oo-ta, Mura-ta, Naka-ta, Hira-ta, Iwa-ta, Miya-ta, Naga-ta
    b. Rendaku but accented
       Ha’ra-da, Ni’shi-da, Ku’ro-da, Ha’ma-da, Tsu’no-da, Ka’ne-da

The names in (3a) do not undergo Rendaku although they are accentless. On the other hand, those in (3b) get accented even though they are subject to Rendaku. After investigating 362 names with this morpheme, Sugito found that the generalization applies to more than half of those ending with ta. Below are the results in which Sugito counted the number of names with respect to accentedness and Rendaku sensitivity [slight modifications are mine].1

(4)
              accented   accentless   both   total
non-Rendaku      94          13        10     117
Rendaku          64          95        56     215
both              8           0        22      30
total           166         108        88     362
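The tallies in (4) lend themselves to a quick arithmetic check of how many names fall in the two cells that the generalization in (1) predicts. The following is a minimal sketch in Python, not part of Sugito's study; the names `TABLE_4` and `conformity` are introduced here purely for illustration, and the cell counts are copied from (4).

```python
# Cell counts copied from table (4): rows are Rendaku behaviour, columns are accent pattern.
TABLE_4 = {
    "non-Rendaku": {"accented": 94, "accentless": 13, "both": 10},
    "Rendaku": {"accented": 64, "accentless": 95, "both": 56},
    "both": {"accented": 8, "accentless": 0, "both": 22},
}


def conformity(table):
    """Return (conforming, all_names, invariant_names) for a Rendaku-by-accent table."""
    # The two cells predicted by (1): accented without Rendaku, accentless with Rendaku.
    conforming = table["non-Rendaku"]["accented"] + table["Rendaku"]["accentless"]
    all_names = sum(sum(row.values()) for row in table.values())
    # Names with one fixed accent pattern and one fixed Rendaku pattern (no 'both' cells).
    invariant_names = sum(
        table[rendaku][accent]
        for rendaku in ("non-Rendaku", "Rendaku")
        for accent in ("accented", "accentless")
    )
    return conforming, all_names, invariant_names


conforming, all_names, invariant_names = conformity(TABLE_4)
print(conforming, all_names, invariant_names)
print(f"{100 * conforming / all_names:.1f}% of all names")
print(f"{100 * conforming / invariant_names:.1f}% of non-alternating names")
```

The two shares come out at roughly 52% and 71%, which is the sense in which the generalization holds moderately but not strictly for names with ta.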

According to (4), 189 names (that is, 52.2% of all the ta-names and 71.1% of names which do not have alternating patterns, i.e. all the names excluding those listed in ‘both’ cells) conform to the generalization in (1): they fall in the accented non-Rendaku cell or the accentless Rendaku cell. It is possible to conclude from this table that Sugito’s Law applies moderately, though not strictly, to names ending in ta. Sugito also pointed out that both voicing and accentuation are influenced by the onset segment of the last mora of the preceding morpheme, which I will call the base. If the segment is voiced (including sonorants but not nasals), the name tends to be accented and exempt from Rendaku (as in Fu’ji-ta). If the segment is voiceless (including nasals) or the mora has no onset, names tend to be accentless and undergo Rendaku (as in Yoshi-da).2 The table below illustrates this point, where [–v] stands for voiceless and [+v] for voiced in the above-mentioned classifications regarding sonorants and nasals. Note also that the counting of names is slightly different from (4), mainly because Sugito excluded names with alternative voicing pronunciations.

(5)
              accented       accentless     both           total
              –v     +v      –v     +v      –v     +v
non-Rendaku   23a    55       3      3      12b    14      110
Rendaku       58c     2      87      3      56      0      206
total         81     57      90      6      68     14      316

a: 19 of these are names whose last onset of the base is /k/.
b: All contain /k/ as the onset in question.
c: 43 of these are names with nasals as the last onset of the base.

The cells highlighted in gray (accented non-Rendaku with [+v], accentless Rendaku with [–v]) represent areas predicted by Sugito’s analysis: accented without Rendaku when the segment in question is voiced, and accentless with Rendaku when the segment is voiceless. The same split is observed in names with alternating accent patterns, highlighted in black (the non-Rendaku [+v] and Rendaku [–v] cells of the ‘both’ columns). These suggest that even when the accentuation is unpredictable, the application or non-application of Rendaku can be predicted from the last onset of the base: when it is voiceless, the name undergoes Rendaku; when voiced, it does not.
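As a rough illustration of the claim that Rendaku can be predicted from the base-final onset, the sketch below applies that single rule to the counts in (5) and tallies how many names it gets right. This is an assumption-laden simplification, not Sugito's own procedure: `TABLE_5`, `predicted_rendaku` and `prediction_tally` are hypothetical names, the onset classes follow the [–v]/[+v] grouping described above, and the resulting tally is derived here rather than reported in the source.

```python
# Cell counts copied from table (5); keys are (Rendaku behaviour, accent pattern, onset class).
TABLE_5 = {
    ("non-Rendaku", "accented", "-v"): 23,
    ("non-Rendaku", "accented", "+v"): 55,
    ("non-Rendaku", "accentless", "-v"): 3,
    ("non-Rendaku", "accentless", "+v"): 3,
    ("non-Rendaku", "both", "-v"): 12,
    ("non-Rendaku", "both", "+v"): 14,
    ("Rendaku", "accented", "-v"): 58,
    ("Rendaku", "accented", "+v"): 2,
    ("Rendaku", "accentless", "-v"): 87,
    ("Rendaku", "accentless", "+v"): 3,
    ("Rendaku", "both", "-v"): 56,
    ("Rendaku", "both", "+v"): 0,
}


def predicted_rendaku(onset_class):
    """Onset-based rule: a voiceless base-final onset predicts Rendaku, a voiced one blocks it."""
    return "Rendaku" if onset_class == "-v" else "non-Rendaku"


def prediction_tally(table):
    """Return (correctly predicted names, all names) under the onset-based rule."""
    correct = sum(
        count
        for (rendaku, _accent, onset_class), count in table.items()
        if predicted_rendaku(onset_class) == rendaku
    )
    return correct, sum(table.values())


correct, total = prediction_tally(TABLE_5)
print(f"{correct} of {total} names follow the onset-based prediction")
```

The rule is applied regardless of accent pattern, which is why the ‘both’ columns count toward the tally as well.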

Exceptions to this generalization seem rather abundant, but if we assume /k/ can be exceptionally regarded as voiced, as does Kubozono (this volume), 19 out of 23 accented non-Rendaku names with a voiceless onset can be correctly predicted, increasing the number of accented non-Rendaku names from 55 to 74. All 12 of the non-Rendaku names with alternative accentuation also contain /k/ as the onset. This exceptional treatment of /k/ should not be applied to Rendaku names, however, which already have a high enough number (i.e. 87 accentless names and 56 names with both accentuations) under a calculation which regards /k/ as voiceless. As Kubozono (this volume) also suggests, the sensitivity to Rendaku which depends on the voicing of the segment in the base can be regarded as another case of the OCP, or Lyman’s Law. The OCP itself does not force names with a voiceless onset to undergo Rendaku, although it does prohibit those with a voiced segment from undergoing it. This might be another reason why names with exceptional Rendaku are rather large in number when the onset is voiceless. Exceptionally accented Rendaku names with a voiceless onset are rather abundant (58), but again, the OCP itself does not prohibit them from undergoing Rendaku. Note that such names are scarce when the onset is voiced (only 2). They do violate Sugito’s Law, suggesting that some other mechanism forces them to override it. Interestingly, 43 of these exceptional names contain a nasal as the base-final onset. A possible explanation is that something similar to Post Nasal Voicing is at work here, by which a coda nasal voices the following onset. This speculation of course needs further investigation, so I will just suggest the possibility and leave it open to question. In sum, according to Sugito’s research, the tendency in (1) is preserved to some extent in names with ta.
It is further possible to predict Rendaku sensitivity by the last onset in the base, although it is necessary to treat nasals, sonorants and /k/ in special ways. Since both are somewhat predictable, we can assume that ta is not lexically specified with any particular information regarding accentedness and Rendaku sensitivity, and that they are both determined by mechanisms such as the OCP and Sugito’s Law. As the voicing of a segment is a lexical property of the base, it is also possible to assume that Rendaku sensitivity is first determined by the segment included in the name, and that accentuation is then determined by the voicing of the head morpheme. This difference in determination order might lead to the predominance in (4) of the number of names with alternative accentuation (10 + 56 = 66) over those with alternative Rendaku sensitivity (8 + 0 = 8). Although Sugito did not include names with a monomoraic base in her investigation, this type exhibits a distinct pattern in accentuation and Rendaku sensitivity. Below are examples of names whose base consists of one mora, and they are all accentless:

(6) I-da, U-da, E-da, O-da, Ki-da, Su-da, Ta-da, Tsu-da, To-da, No-da, Hi-da, Ya-da, Yu-da, Yo-da, Wa-da; Se-ta, Ha-ta, Mi-ta

Moreover, except for the last three, names in (6) almost all undergo Rendaku. This might result from the fact that voiced obstruents are scarce in the last onset of the base in this type. Before moving on to the next section, let us summarize the characteristics of ta in the following table:

(7) characteristics of ta:

      specification   Rendaku/Accent correlation   OCP   Peculiarity of monomoraic base
ta    none            Yes                          Yes   Yes: –A, +R

The first column in (7) is the lexical specification of a morpheme, which is absent for ta: Rendaku sensitivity can be predicted from the last onset of the base, and accentedness from its voicing. The second column shows that ta observes Sugito’s Law. The third column shows whether the morpheme is subject to the OCP. The rightmost column is for peculiar behavior of monomoraic bases, with the property assigned to the name. ‘A’ is for accentedness and ‘R’ for Rendaku, with plus and minus indicating whether the entire name has a positive or negative value for the property in question. As we will see in the following sections, morphemes show distinct characteristics as to accentedness and Rendaku sensitivity. Keeping those of ta in mind, we will observe cases in which other morphemes constitute the head of the name, and consider how similar to and different from ta they are. First, we examine to what extent the observation in (1) applies to Japanese surnames in general.

3. Investigation: overall tendency

For the purpose I have just mentioned, an investigation was made of Japanese surnames which contain various morphemes. The names examined are taken from the 1,000 most popular names in Murayama (2000), excluding (i) names with ta (which are extensively investigated by Sugito); and (ii)

names which cannot undergo Rendaku by definition (that is, those in which the rightmost head morpheme begins either with a vowel, a voiced obstruent or a sonorant). Those morphemes which have the same pronunciation and meaning are counted as one (e.g. shima ‘island’ can be written as 島 and 嶋). The remaining 347 names are investigated in this paper. Five native speakers of standard Japanese, all in their mid-thirties, were asked to read the 347 names, which were shuffled into a random order. When more than one speaker read a name in one way and the rest in another, the name was regarded as having alternative pronunciations. When only one speaker read it in a particular way, the name was regarded as having one specific pronunciation, as in the case when all of the speakers read it in the same way. Below is the result of this endeavor:

(8)
              accented       accentless     both         total
non-Rendaku   105 (30.3%)    59 (17.0%)     5 (1.4%)     169 (48.7%)
Rendaku        89 (25.6%)    63 (18.2%)     1 (0.3%)     153 (44.1%)
both³          15 (4.3%)      6 (1.7%)      4 (1.2%)      25 (7.2%)
total         209 (60.2%)   128 (36.9%)    10 (2.9%)     347

The areas predicted by the generalization in (1) are the accented non-Rendaku and accentless Rendaku cells. The percentage shows the rate of appearance among all the names in question (i.e. 347 names). The high percentage of accented non-Rendaku names is in accordance with Sugito’s Law, in that it is higher than those of both accented Rendaku names and accentless non-Rendaku names. That is, if a name is accented, it is most likely to be exempt from Rendaku (i.e. 30.3% to 25.6%). If it does not undergo Rendaku, it is most likely to be accented (i.e. 30.3% to 17.0%). The percentage of accentless Rendaku names, on the other hand, does not seem to be consistent with Sugito’s Law. It is only slightly higher than that of accentless non-Rendaku names (i.e. 18.2% to 17.0%), and even lower than that of accented Rendaku names (i.e. 18.2% to 25.6%). The tendency described above is even clearer in another calculation of the appearance rate. The percentages in (9a, b) are calculated within one particular group, not among the entire set of names. For example, the 105 accented non-Rendaku names in (9a) comprise 62.1% of all the non-Rendaku names (i.e. 105 to 169). On the other hand, in (9b) they comprise 50.2% of all the accented names (i.e. 105 to 209). In both (9a) and (9b), accented non-Rendaku names comprise more than 50%, which suggests that to some extent they conform to Sugito’s Law.

(9) a.
              accented       accentless     both        total
non-Rendaku   105 (62.1%)    59 (34.9%)     5 (3.0%)    169
Rendaku        89 (58.2%)    63 (41.2%)     1 (0.7%)    153

    b.
              accented       accentless
non-Rendaku   105 (50.2%)    59 (46.1%)
Rendaku        89 (42.6%)    63 (49.2%)
both           15 (7.2%)      6 (4.7%)
total         209            128
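The difference between the two calculations in (9) is simply the choice of denominator: (9a) divides each cell by its row total (the Rendaku group), while (9b) divides it by its column total (the accent group). A small sketch, assuming the raw counts of the survey table and using illustrative names (`COUNTS`, `row_share`, `column_share`) that are not in the original:

```python
# Raw counts of the 347 surveyed names: rows are Rendaku behaviour, columns are accent pattern.
COUNTS = {
    "non-Rendaku": {"accented": 105, "accentless": 59, "both": 5},
    "Rendaku": {"accented": 89, "accentless": 63, "both": 1},
    "both": {"accented": 15, "accentless": 6, "both": 4},
}


def percent(part, whole):
    """Percentage rounded to one decimal place, as in the tables."""
    return round(100 * part / whole, 1)


def row_share(rendaku, accent):
    """Share of an accent pattern within one Rendaku group, as in (9a)."""
    return percent(COUNTS[rendaku][accent], sum(COUNTS[rendaku].values()))


def column_share(rendaku, accent):
    """Share of a Rendaku group within one accent pattern, as in (9b)."""
    return percent(COUNTS[rendaku][accent], sum(row[accent] for row in COUNTS.values()))


# Accented non-Rendaku names: 105 of 169 non-Rendaku names, 105 of 209 accented names.
print(row_share("non-Rendaku", "accented"), column_share("non-Rendaku", "accented"))
# Accentless Rendaku names: 63 of 153 Rendaku names, 63 of 128 accentless names.
print(row_share("Rendaku", "accentless"), column_share("Rendaku", "accentless"))
```

Keeping the two denominators as separate functions makes it explicit which group a given percentage is relative to.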

Accentless Rendaku names, on the other hand, do not exhibit such prevalence in any group. Among Rendaku names (9a), they comprise 41.2%, which is less than accented Rendaku names (58.2%). They are prevalent among the accentless names, but exceed non-Rendaku accentless names by only 4 names (9b). Though this is a very simple comparison, from these data we can conclude that certain characteristics found in ta, in particular the generalization in (1), do not apply to Japanese personal names in general. As will become obvious in the following sections, accentedness and Rendaku sensitivity are actually morpheme-dependent properties. Moreover, the influence of the voicing of the base on Rendaku, which is operative in ta, also seems to be determined by each morpheme, as we will shortly see. In the next section, therefore, we discuss how such properties are specified for each major morpheme which appears in Japanese surnames.

4. Morpheme-specific tendencies

As previously mentioned, accentedness and Rendaku sensitivity are properties specified for each morpheme independently. Morphemes can be categorized into four types as to the degree to which the properties are specified: that is, (i) those in which both accentedness and Rendaku sensitivity are specified; (ii) those in which one of these properties is specified; (iii) those in which neither is specified; and (iv) those which have a peculiar pattern. In what follows, we will consider various morphemes according to this categorization, so that we can examine to what extent Sugito’s Law is upheld in them and how different they are in accentuation and Rendaku.

The morphemes examined here are those which have more than five entries in the list of 347 investigated names.

4.1. Names in which both accentedness and Rendaku sensitivity are specified

Examples that most clearly exemplify this category are names with hara ‘field.’ As shown in (10), these are most likely to be accented without Rendaku, and should be specified as such.

(10) hara: accented, non-Rendaku
     Shino’-hara, Kuri’-hara, Taka’-hara, Ue’-hara, Ta’-hara, Naka’-hara, Kawa’-hara, Yoshi’-hara, Tsuka’-hara, Take’-hara, Kasa’-hara, O’o-hara, Kita’-hara, Nishi’-hara

The table below shows more precisely the specific behavior of this morpheme. The areas predicted by Sugito’s Law are the accented non-Rendaku and accentless Rendaku cells.

(11)
              accented      accentless    both         total
non-Rendaku   23 (63.9%)    3 (8.3%)      3 (8.3%)     29 (80.6%)
Rendaku        4 (11.1%)    0 (0%)        0 (0%)        4 (11.1%)
both           2 (5.6%)     0 (0%)        1 (2.8%)      3 (8.3%)
total         29 (80.6%)    3 (8.3%)      4 (11.1%)    36

The percentage of words which obey the lexical specification (i.e. 63.9%) might seem rather small, but it is clear that the other major pattern predicted by Sugito’s Law (i.e. accentless with Rendaku) is never produced, not even as an exception. If this morpheme merely followed the generalization, rather than being specified as accented without Rendaku, the alternative pattern should contain a fairly large number of examples. Note also that the total percentages of the accented and non-Rendaku groups are both 80.6%. From these observations, we can conclude that this morpheme is most typically accented without Rendaku, because it is specified as such. Moreover, the monomoraicity of the preceding morpheme seems to play a role in accentuation, as in the case of ta. In (11), exceptions with a monomoraic base can be found in the following four categories: (i) two in accentless non-Rendaku names (Mihara and Ihara); (ii) two in non-Rendaku names which have both accented and accentless patterns (Ki(’)hara and No(’)hara); (iii) one in an accented Rendaku name (E’bara); and (iv) one in an accented name which may or may not undergo Rendaku (Ko’[h/b]ara).4 Thus, it is possible to conclude that hara is specified not only as accented without Rendaku, but also as accentless or undergoing Rendaku for names with a monomoraic base.5 On the other hand, names with hara are not influenced by the voicing of the preceding morpheme in terms of Rendaku. Names with a voiceless segment in the last onset of the base are equally exempt from Rendaku, as is obvious in (10): compare, for example, Takahara and Yoshihara with Kurihara and Kawahara. This, however, is natural because the OCP is relevant only when Rendaku might occur, which does not normally happen in hara names due to their specification. Other examples which show similar behavior to hara are:

(12) a. shita ‘under’ (8/8):   Yama’-shita, Matsu’-shita, Miya’-shita, Mori’-shita
     b. tani ‘valley’ (7/8):   Mizu’-tani, Naka’-tani, Shi’n-tani, Ko’-tani, Mi’-tani
     c. se ‘shallows’ (7/7):   Hi’ro-se, Ta’ka-se, Mu’ra-se, I’wa-se, Ka’wa-se, Na’ru-se
     d. saka ‘slope’ (6/7):    Ko’-saka, Ho’-saka, Haya’-saka, Aka’-saka, Miya’-saka

As the rate in parentheses shows, almost all the names with these morphemes are accented and do not undergo Rendaku, which conforms to the pattern predicted by Sugito’s Law. Monomoraicity does not seem to play a role in tani and saka, since names with a monomoraic base behave in the same way as others. On the other hand, there are many names which are specified as having a pattern that does not obey Sugito’s Law; that is, those which are specified either as accented and undergoing Rendaku, or as accentless and not undergoing Rendaku. Names with kuchi ‘mouth’ as head are the clearest examples of the former pattern.

(13) kuchi: accented, Rendaku
     Yama’-guchi, Tani’-guchi, No’-guchi, Kawa’-guchi, Hi’-guchi, Ta’-guchi, Seki’-guchi, E’-guchi, I’-guchi, De’-guchi, Mizo’-guchi, Hama’-guchi, Hori’-guchi, Hara’-guchi

Twenty-two names were found with this morpheme, and interestingly enough, there were no exceptions to the pattern in question.6 Furthermore, neither the length (e.g. No’guchi, Hi’guchi, etc.) nor the voicing of the last onset of the base (e.g. De’guchi, Mizo’guchi, etc.) has an influence on the pattern. Thus, we can conclude that kuchi is specified as accented with Rendaku, and as not relevant to the OCP. Behavior similar to kuchi can be observed in names with hayashi ‘forest.’ Six names were found in this investigation, and five of them are accented with Rendaku:

(14) hayashi: accented, Rendaku
     Waka-ba’yashi, Hira-ba’yashi, Naka-ba’yashi, Kuri-ba’yashi, Oo-ba’yashi

The only exception is Kobayashi, which is accentless. The shortness of the preceding morpheme might be the reason for this exception. The morpheme kura ‘storehouse’ shows the opposite pattern to kuchi and hayashi; that is, names with this morpheme are accentless and exempt from Rendaku, although they are similar in not obeying Sugito’s Law.

(15) kura: accentless, non-Rendaku
     Asa-kura, Ita-kura, Taka-kura, Ishi-kura, Oo-kura

Among the seven names that were found, Ogura, which undergoes Rendaku, and Kuma’kura, which is accented, are the only exceptions. The former might be due to the monomoraicity of the base, and the latter might be influenced by the voicing of the base-final onset (as in the case of exceptional names with ta). Obviously, these are just possible explanations, as these two names are the only examples in the relevant environments. In (16), we summarize the properties of each morpheme discussed in this section in the format employed for ta. Those which need more investigation are enclosed in parentheses. Cells which do not have decisive data are left blank. The OCP is irrelevant to names with the specification [–R].

(16)
          specification   Rendaku/Accent correlation   OCP   Peculiarity of monomoraic base
hara      +A, –R          Yes                          n/a   Yes: –A or +R
shita     +A, –R          Yes                          n/a
tani      +A, –R          Yes                          n/a   No
se        +A, –R          Yes                          n/a
saka      +A, –R          Yes                          n/a   No
kuchi     +A, +R          No                           No    No
hayashi   +A, +R          No                           No    (Yes: –A)
kura      –A, –R          No                           n/a   (Yes: +R)

4.2. Names which are specified for one of the properties

Several morphemes are specified for either accentedness or Rendaku sensitivity, but not both. In these morphemes, it is often the case that the property which is not lexically specified is determined by other factors. Take sawa ‘swamp’ as an example:

(17) a. O-zawa, Naka-zawa, Taki-zawa, Yoshi-zawa, Fuka-zawa, No-zawa
     b. Fuji-sawa, Kuro-sawa, Yanagi-sawa, Naga-sawa, Furu-sawa, Hira-sawa

It is clear from (17) that this morpheme is specified as accentless, as the names do not have accent either when they undergo Rendaku (17a) or not (17b). Rendaku sensitivity is determined by the last onset of the base, as with ta: when it is voiceless (including nasals and a null onset), names with this morpheme undergo Rendaku; when it is voiced (including liquids and glides) they do not. In (18) we show the precise number of names for each category. Note that /k/ is counted as a normal voiceless segment here (e.g. Nakazawa), different from the exceptional cases of ta.

(18)
              accented       accentless     both           total
              –v     +v      –v     +v      –v     +v
non-Rendaku    0      0       2      6       0      1        9
Rendaku        1      0      12      1       0      0       14
both           0      0       3a     0       0      0        3
total          1      0      17      7       0      1       26

a: All contain a base-final /m/: Ume[s/z]awa, Tomi[s/z]awa, and Kuma[s/z]awa.
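The onset-based rule described for sawa can be made concrete with a small sketch. Everything here is an illustrative assumption rather than the author's procedure: the function names are hypothetical, bases are given in the romanization used in (17), and the voiced class is approximated by the romanized onsets b, d, g, z, j (voiced obstruents) and r, w, y (liquids and glides), with nasals and a null onset counted as voiceless as in Sugito's classification.

```python
VOWELS = set("aiueo")
# Romanized onsets counted as [+v]: voiced obstruents, liquids and glides (an approximation).
VOICED_ONSETS = {"b", "d", "g", "z", "j", "r", "w", "y"}


def last_onset(base):
    """Return the consonant(s) before the final vowel of a romanized base ('' if none)."""
    base = base.lower()
    if not base:
        raise ValueError("empty base")
    end = len(base) - 1
    if base[end] not in VOWELS:
        # A final moraic nasal (e.g. 'shin') is treated as its own onset class member.
        return base[end]
    start = end
    while start > 0 and base[start - 1] not in VOWELS:
        start -= 1
    return base[start:end]


def predict_sawa_form(base):
    """Predict 'zawa' (Rendaku) after a [-v] onset and 'sawa' after a [+v] onset."""
    return "sawa" if last_onset(base) in VOICED_ONSETS else "zawa"


for base in ("O", "Naka", "Yoshi", "No", "Fuji", "Kuro", "Yanagi"):
    print(f"{base}-{predict_sawa_form(base)}")
```

The sketch reproduces the examples in (17), but it deliberately ignores the alternating names with base-final /m/ noted under (18) and the handful of exceptions in that table, so it should be read as an illustration of the rule rather than a model of the data.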

What is peculiar about this morpheme is that the accentuation is not influenced by voicing, again unlike ta. This is because sawa is lexically specified as accentless, and this specification is respected at all times. As a result, Sugito’s Law is only observed in cases where Rendaku applies, i.e. in the accentless Rendaku cell. Moreover, /m/ behaves differently from other voiceless segments (note that nasals are counted as voiceless in Sugito’s calculation), as it is included in all the names with alternative Rendaku patterns.7 It may be that /m/ can sometimes be regarded as voiced in names with sawa. Such is also the case for shima ‘island,’ which is also lexically specified as accentless. Again, Rendaku sensitivity is determined by the last onset of the base.8

(19) a. Ko-jima, Ta-jima, Ii-jima, Naka-jima, Kita-jima, Nishi-jima, No-jima
     b. Kawa-shima, Naga-shima, Tera-shima, Mizu-shima, Toyo-shima, Fuji-shima

(20)
              accented       accentless     both           total
              –v     +v      –v     +v      –v     +v
non-Rendaku    3      0       3      6       0      0       12
Rendaku        0      0       7      2       0      0        9
both           0      0       1      0       0      0        1
total          3      0      11      8       0      0       22

As in the case of sawa, /k/ is counted as voiceless in (20). The pattern predicted by Sugito’s Law is again only observed in the accentless Rendaku cell. Names with tsuka ‘mound’ also show a similar pattern to sawa and shima in that the name with this morpheme becomes accentless.9

(21) Oo-tsuka, Hira-tsuka, To-tsuka; Ii-zuka, Ishi-zuka, Te-zuka

The difference between this and the previous two is that voicing is not determined by the last onset of the preceding morpheme; cf. Totsuka vs. Tezuka. As long as there are no Rendaku names with voiced base-final onsets, it can be said that this morpheme also respects the OCP. Only seven examples with this morpheme were found, however, and thus this tendency remains to be examined more thoroughly.


Saki ‘cape’ is also similar to sawa, shima and tsuka in the sense that accentedness is fixed lexically; still, the value of the specification is the opposite, that is, it is specified as accented. Rendaku sensitivity is again determined by the last onset of the base:

(22) a. Miya’-zaki, Oka’-zaki, Matsu’-zaki, No’-zaki, Shino’-zaki, Ishi’-zaki, Shima’-zaki
     b. Iwa’-saki, Fuji’-saki, Naga’-saki, (Kawa-saki)

As shown in (22b), a name does not undergo Rendaku when the last onset of the base is voiced, including exceptionally accentless Kawasaki. This fact is observable from the following table:

(23)
              accented       accentless     both           total
              –v     +v      –v     +v      –v     +v
non-Rendaku    1      3       0      1       1      0        6
Rendaku        8      1       0      0       0      0        9
both           4a     1       0      0       0      0        5
total         13      5       0      1       1      0       20

a: All contain a base either (i) with [m] (e.g. Yama’-[s/z]aki and Hama’-[s/z]aki) or (ii) of monomoraic shape (e.g. Ta’-[s/z]aki and E’-[s/z]aki).

As in the case of sawa, /m/ behaves differently from other voiceless segments, as do monomoraic bases. Another morpheme that is specified as accented is hata ‘farm’:

(24) O’o-hata, Taka’-hata; Ta’-bata, O’-bata; (Kawa-bata)

The OCP effect on Rendaku seems to be absent here: note that Takahata does not undergo Rendaku even though the last onset is voiceless [k]. It may be that names with a monomoraic base undergo Rendaku, as in Tabata and Obata, although the data is limited to only these two. Kawabata is exceptional not only in that it is accentless but also in that it undergoes Rendaku even though the base-final onset is voiced, violating the OCP. The morpheme ki ‘tree’ shows a unique pattern. A name with this morpheme shows one of the following three patterns: (i) accentless without Rendaku as in (25a); (ii) accentless with Rendaku as in (25b); and (iii) accented without Rendaku as in (25c).

(25) a. Suzu-ki, Sasa-ki, Ao-ki, Oo-ki, Masa-ki, Fuji-ki
     b. Taka-gi, Ya-gi, Mote-gi, Kashiwa-gi, Aka-gi
     c. A’ra-ki, Ku’ro-ki, Shi’ra-ki, Mu’ra-ki, Ta’ma-ki, Mi’-ki

Interestingly, the fourth possible category of ‘accented with Rendaku’ does not have any members. Moreover, the OCP does not seem to play a role in deciding Rendaku sensitivity, as shown below:

(26)
              accented       accentless     both           total
              –v     +v      –v     +v      –v     +v
non-Rendaku    3      4       5      2       0      0       14
Rendaku        0      0       3      2       0      0        5
both           0      0       0      0       0      0        0
total          3      4       8      4       0      0       19

Non-Rendaku names are both accented and accentless, and accentless names occur both with and without Rendaku. Possible explanations would be either: (i) accentedness is fixed as ‘accentless’ and the names in (25c) are exceptions; or (ii) Rendaku sensitivity is fixed as negative and the names in (25b) are exceptions. If we look at the data more closely, it becomes clear that the former is a better analysis. Note that the names which belong to (25c) have a base with [r] or [m] as the last onset. Only when the base has one of these particular segments does the name get accented, overriding its [–A] specification.10 It is worthwhile to note that the exceptional names in (25c) obey Sugito’s Law: when the name is exceptionally accented, it is always exempt from Rendaku. Thus, we can conclude that the generalization is partly preserved in names with this morpheme. We summarize this section with the list in (27). As in the previous sections, each morpheme is supplied with information concerning its lexical specification, Rendaku/accent correlation, the OCP, and peculiarities regarding monomoraic bases. In addition, the idiosyncratic behavior of some morphemes is given in the rightmost column. Some morphemes either undergo or fail to undergo Rendaku when the base has [m] as the last onset, which means that [m] must be regarded as both voiced and voiceless.

The correlation between accentuation and Rendaku in Japanese surnames

(27)
morpheme  specification  Rendaku/Accent       OCP     Peculiarity of     special treatment
                         correlation                  monomoraic base
sawa      –A             No                   Yes     (No)               [m]: [±v]
shima     –A             No                   Yes     (No)
tsuka     –A             No                   (Yes)   (No)
saki      +A             No                   Yes     (Yes: ±R)          [m]: [±v]
hata      +A             No                   (No)    (Yes: +R)
ki        –A             Yes (in subgroups)   No      (No)               [m] and [r]: +A

Note that specified accentedness and OCP-driven Rendaku often produce patterns which do not follow the generalization in (1). For example, when sawa – specified as accentless – takes a base whose last onset is voiced, the result is an unpredicted accentless non-Rendaku name, such as Fujisawa.

4.3. Names in which neither is specified

Several morphemes are specified for neither accentedness nor Rendaku sensitivity. In other words, names with such a morpheme are accented, and undergo Rendaku, at random. Hashi ‘bridge’ is one example:

(28) Taka’-hashi, Mitsu’-hashi, Moto’-hashi; Ita’-bashi; Oo-hashi; Ishi-bashi

In (28), the first three names are accented without Rendaku, the fourth one is accented with Rendaku, the fifth one is accentless without Rendaku, and the last one is accentless with Rendaku. It is impossible to predict which name will have which pattern, as they all have a two-mora base with a voiceless final onset. Although the seven examples found do not settle the matter, it seems that this morpheme does not follow Sugito’s Law.

A more complex case comes from kawa ‘river.’ First, let us observe the case where the base consists of two morae:

(29) a. Furu’-kawa, Ichi’-kawa, Yoshi’-kawa, Nishi’-kawa, Mae’-kawa, Hoso’-kawa, Mori’-kawa, Kuro’-kawa, Yama’-kawa, Tachi’-kawa
     b. Hase-gawa, Kita-gawa, Tani-gawa, Taki-gawa, Sasa-gawa, Asa-gawa, Yana-gawa, Ima-gawa, Shina-gawa

(30)
               accented     accentless   both         total
               –v    +v     –v    +v     –v    +v
non-Rendaku    11     8      1     1      0     0      21
Rendaku         1     0      9     1      0     0      11
both            0     0      1     1      3     0       5
total          12     8     11     3      3     0      37

Examples with a monomoraic base are excluded in (29) and (30). It is obvious from (29) and (30) that this morpheme follows the generalization (1) when the base is bimoraic: that is, the name is exempt from Rendaku when accented and subject to it when accentless. This pattern is also observed in three names categorized as having both kinds of accentuation with/without Rendaku, as they are either accented without Rendaku or accentless with Rendaku (e.g. Shimo’-kawa/Shimo-gawa). This distribution can be accounted for if we assume that kawa is not assigned any specific value for accentedness or Rendaku sensitivity. In this case, a name randomly takes a value for accentedness – but not for Rendaku, for reasons we will see shortly – and if it is accented, obedience to Sugito’s Law prohibits it from undergoing Rendaku. Conversely, if it is accentless, the name undergoes Rendaku for the same reason. Also interesting in (30) is the fact that accentless Rendaku names are restricted to names with bases having final voiceless onsets, but accented non-Rendaku ones do not have such a restriction. This is because the OCP only prohibits two consecutive voiced consonants, but not a voiceless-voiceless sequence. Suppose a name whose base has a final voiced onset (say, paba) has a [–A] value. Sugito’s Law would predict an illegal voiced-voiced sequence *Pabagawa. On the other hand, obedience to the OCP produces a pattern which violates Sugito’s Law when it preserves the [–A] value: *Pabakawa. Satisfying Sugito’s Law and the OCP while keeping the [–A] value is thus impossible for a name with a final voiced onset, which leads to the near nonexistence of [+v] names in the accentless Rendaku cell of (30). By contrast, a base with a final voiceless onset does not violate the OCP if it does not undergo Rendaku: a sequence of two voiceless segments is not ill-formed in itself with respect to the OCP.
Thus, if a name with a base-final voiceless onset is given a [+A] value, Sugito’s Law prohibits it from undergoing Rendaku, resulting in many [–v] – as well as [+v] – names in the accented non-Rendaku cell of (30).
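
The interaction between Sugito’s Law and the OCP described above can be made concrete with a small sketch. The Python below is an illustration only and not part of the original analysis: the function names are hypothetical, and Sugito’s Law is simplified to the form it takes for kawa with a bimoraic base (accented names lack Rendaku, accentless names have it).

```python
def obeys_sugito(accented: bool, rendaku: bool) -> bool:
    # Simplified Sugito's Law (assumption): accented -> no Rendaku,
    # accentless -> Rendaku. So exactly one of the two must hold.
    return accented != rendaku


def obeys_ocp(base_final_voiced: bool, rendaku: bool) -> bool:
    # The OCP is violated only by a voiced-voiced sequence, i.e. a voiced
    # base-final onset followed by a Rendaku-voiced initial consonant.
    return not (base_final_voiced and rendaku)


def rendaku_options(accented: bool, base_final_voiced: bool) -> list[bool]:
    # Return every Rendaku value (False/True) that satisfies both constraints
    # for a given accent value and base-final onset voicing.
    return [
        rendaku
        for rendaku in (False, True)
        if obeys_sugito(accented, rendaku)
        and obeys_ocp(base_final_voiced, rendaku)
    ]


if __name__ == "__main__":
    for accented in (True, False):
        for voiced in (False, True):
            print(accented, voiced, rendaku_options(accented, voiced))
```

Under these assumptions, only an accentless name whose base ends in a voiced onset has no option that satisfies both constraints, which corresponds to the nearly empty [+v] accentless Rendaku cell of (30).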


When the base is monomoraic, the morpheme shows a distinct pattern: the name is accented and subject to Rendaku:

(31) Se’-gawa, Ta’-gawa, E’-gawa, Ka’-gawa, I’-gawa, Sa’-gawa

It is clear from these examples that kawa receives special treatment in names with this kind of base, so that they must be assigned [+A, +R] values. Interestingly, the pattern varies slightly when the name literally refers to a river – not a person – even though the same morpheme is used as a head. River names are always subject to Rendaku, and moreover, they are typically accented, as shown by the longer examples in (32b). Four-mora names are always accentless as in (32a), due to a restriction against accented four-mora nouns (cf. Kubozono 1996; Zamma 2001, 2003).

(32) kawa in river names:
     a. Kamo-gawa, Yodo-gawa, Shuku-gawa, Kako-gawa, Ibi-gawa
     b. Katsura’-gawa, Takase’-gawa, Nagara’-gawa, Temuzu’-gawa

This difference suggests that the specification is determined not only by the morpheme itself, but also by the type of noun it produces. We summarize this section with the table in (33).

(33)
morpheme  specification  Rendaku/Accent   OCP    Peculiarity of
                         correlation             monomoraic base
hashi     none           No
kawa      none           Yes              Yes    Yes: +A, +R

Although the number of attested morphemes is small in this limited study, it seems to be the case that some morphemes are not assigned any accent/Rendaku values.

4.4. Names with peculiar patterns

Some morphemes show quite a bit of idiosyncratic behavior with regard to accentuation and Rendaku. One is the morpheme too ‘wisteria’. Below is an exhaustive list of names with this morpheme found in this study (15 in total):

(34) a. after nasal:    +A, +R   E’n-doo, A’n-doo, Shi’n-doo, (Kon-doo)
     b. after [u]:      +A, +R   Ku’-doo, Su’-doo, Shu’-[t/d]oo, (Mu’-too)
     c. after [(a)i]:   –A, –R   Sai-too, Nai-too, (I-too)
     d. others:         +A, –R   Sa’-too, Ka’-too, E’-too, (Go-too)
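
The four environments in (34) can be read as a simple lookup on the final segment of the base. The following Python sketch is a hedged illustration of that reading, not the author’s formalization: the function names are hypothetical, bases are assumed to be given in the romanization of (34) without accent marks, and category (c) is taken to be ‘after [i]’. It states only the general pattern of each category and does not reproduce the parenthesized exceptions.

```python
def too_pattern(base: str) -> tuple[bool, bool]:
    # Return (accented, rendaku) for a name formed with the morpheme too,
    # following the general pattern of each category in (34).
    final = base.lower()[-1:]
    if final == "n":        # (34a) after a nasal: +A, +R
        return (True, True)
    if final == "u":        # (34b) after [u]: +A, +R
        return (True, True)
    if final == "i":        # (34c) after [(a)i]: -A, -R
        return (False, False)
    return (True, False)    # (34d) others: +A, -R


def too_name(base: str) -> str:
    # Build the segmental form only; the accent mark is not rendered here.
    _, rendaku = too_pattern(base)
    return f"{base}-{'doo' if rendaku else 'too'}"


if __name__ == "__main__":
    for base in ("En", "Ku", "Sai", "Sa"):
        print(base, too_pattern(base), too_name(base))
```

Names such as Kon-doo (accentless) or Mu’-too (no Rendaku) would have to be listed separately as exceptions to this lookup.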

Those which are exceptional in each category are enclosed in parentheses. The category in (34c) may be described as being after moraic [i] rather than syllabic [ai], as behavior similar to that of Saitoo and Naitoo is observed in Itoo, though it is the only such example. The voicing in (34a) might have resulted from Post Nasal Voicing, by which the voicing of the base-final nasal might spread to the following morpheme-initial /t/. Other specifications cannot be attributed to other, clear-cut properties, and seem quite arbitrary.

Another fact worthy of comment is that this morpheme typically takes a monosyllabic base. This is quite remarkable for such a non-productive morpheme as too. Although it is not unusual for productive morphemes (such as ta and kawa) to have a monosyllabic base, bisyllabic (or longer) bases are much more common even for such morphemes. Too, on the other hand, rarely takes bisyllabic bases: only two examples come to mind (i.e. Kawatoo and Sugitoo). This preference for a monosyllabic base is also a property specified for too.

5. Concluding remarks

Japanese surnames provide interesting data as to accentuation and Rendaku, much as other word classes do. Their behavior differs depending on the head morpheme of the name, i.e. the rightmost element. As shown in Section 4, this variation can be quite diverse. Sometimes such behavior is observed only in a single morpheme, as each morpheme determines its own phonological behavior according to its lexical specification. Sugito’s Law represents one kind of morpheme-specific constraint. It is satisfied by many morphemes, producing the overall tendency of preserving the law as we saw in Section 3, but not always. As is obvious from the lists in (16), (27) and (33), quite a few morphemes show a pattern which violates Sugito’s Law. One naturally wonders how such variation among morphemes can be theoretically treated.
Clearly it is far beyond the scope of theories based on a simple dichotomy or on groupings of some kind, such as classhood (cf. Siegel (1974), Kiparsky (1982), Benua (1998), etc.) or lexical strata (cf. Itô and Mester 1995b, etc.). Consequently it should be handled within a theory which allows considerable variation (e.g. Inkelas (1998), Orgun (1998), Anttila (2002), etc.). Still we must await future studies to see what such analyses would be like.

Acknowledgements

A very preliminary version of this paper was presented at the annual meeting of PAIK (Phonological Association in Kansai) held at Kobe College on July 13, 2002. I would like to thank PAIK participants (especially Shigeto Kawahara, Haruo Kubozono, Kazutaka Kurisu, Michinao Matsui and Akio Nasu) and Jeroen van de Weijer for their valuable comments and discussion. I am also grateful to Mark Campana, who suggested stylistic improvements of this paper.

Notes

1. The original list is more complicated because Sugito also made a comparison between the dialects spoken in Tokyo and Osaka.
2. Kubozono (this volume) also discusses this issue.
3. These four names in fact follow Sugito's Law, as they undergo Rendaku when they are accentless, and do not when accented. As neither factor – accentedness or Rendaku sensitivity – is fixed for them, they are included in this category.
4. Ta’hara is the only name with a monomoraic base which satisfies the general specification of hara: i.e. accented without Rendaku.
5. In addition, one of the accented Rendaku names has a base which ends with a moraic nasal (Ka’mbara). It might be possible to attribute this Rendaku to the nasal segment – that is, to Post Nasal Voicing, in which a moraic nasal acts as the trigger.
6. Only Mizu’[g/k]uchi has an alternative pronunciation in which Rendaku does not apply.
7. The only name with /m/ which does not belong to this category is Misawa, which has a monomoraic base.
8. The pattern with shima is quite different from that found in river names, where non-Rendaku names get accented (cf. Tanaka, this volume). In this case, moreover, the voicing of the last onset of the base is not the trigger of Rendaku: e.g. Sakura-jima and Itsuku’-shima.
9. The only exception is Naka’-tsuka.
10. The only exceptions are Namiki, which is accentless with [m], and U’eki, which is accented without [m].

A survey of Rendaku in loanwords Tomoaki Takayama

Introduction

The aim of this article is to present a survey of rendaku (sequential voicing) in loanwords. There are some differences in phonological behavior between native words and loanwords. It is often said that rendaku is one of those differences, since rendaku hardly occurs in loanwords. However, we find a number of exceptional occurrences. In this article, we take up some problems concerning such examples. Investigation into those exceptions sheds light on some aspects of lexical stratification in Japanese. The question of lexical stratification is one of the central issues in recent research in generative phonology, and some of the principal studies (Itô and Mester (1999a, 2003); Fukazawa, Kitahara, and Ota (2002); among others) have paid attention to phenomena such as rendaku in the Japanese lexicon. This article, however, intends to survey the relationship between lexical stratification and rendaku from a different viewpoint. If we try to answer the question of what the loanword stratum is, or what the relationship between lexical stratification and phonological phenomena is, we need to look further into the background behind the lexical stratification. In particular, we have to recognize the significance of stylistic and sociolinguistic aspects. Paying serious attention to these aspects helps us to understand what the occurrence of a phonological phenomenon depends on. In our opinion, such a consideration is useful even for theoretical research.

The borrowed vocabulary of the Japanese language consists of two main groups: the group of Sino-Japanese (SJ) words,1 and the group of foreign words that are largely borrowed from European languages. These two groups differ from each other with respect to rendaku occurrences. In the following sections, we look at this difference by examining loanword rendaku examples, and discuss some issues in loanwords in Japanese.
Without intending to reach a conclusive argument, this article emphasizes the importance of stylistic or sociolinguistic aspects when dealing with phonological phenomena.

In the following discussion, we will use ‘word group’ or ‘vocabulary’ instead of ‘lexical stratum’ in order to avoid undesirable confusion, because the latter is often used in a restricted sense in recent works of generative phonology.

1. Rendaku exceptions in foreign loanwords

If we put aside SJ loanwords (see section 2), rendaku as a rule does not occur in foreign loanwords that have been borrowed mainly from European languages. Nevertheless, there seem to be some exceptions, as illustrated in examples (1) and (2), which were borrowed from Portuguese.

(1) karuta ‘Japanese style card game’
    iroha garuta2 ‘iroha3’ ‘karuta’ (iroha card game)
    haikai garuta ‘haiku’ ‘karuta’ (haiku card game)

(2) kappa ‘rain jacket, rain wear, raincoat’
    ama gappa ‘rain’ ‘kappa’ (rain wear)
    bin l gappa ‘plastic, bin l from vinyl’ ‘kappa’ (rain wear made of plastic)

Interestingly, Japanese native speakers have a kind of intuition about whether some word is foreign or not. Of course, this intuitive judgment does not always agree with the etymological facts. There are a number of foreign loanwords that are thought of as non-foreign by the majority of native speakers, except by people who have some special knowledge of etymology. The words ‘karuta’ and ‘kappa’ in (1) and (2) above are members of this type. There seem to be two reasons why some foreign loanwords easily merge into the native word group (or into the SJ word group). First, some foreign loanwords have the same phonotactic arrangement as native (or SJ) words. For this reason, they apparently do not look like foreign words, and thus have a natural tendency to merge into the non-foreign vocabulary. Second, they have already lost any connection that would associate them with a foreign culture.

Let us illustrate this point with a few examples. Ikura ‘salmon roe’ originated from the Russian ikra. The great majority of Japanese people think that ‘ikura’ is a purebred native word. The word form itself provides no phonotactic clues that would lead native speakers to know it came from a foreign language. Moreover, ikura has no cultural association with Russia. Japanese people regard it as a typical Japanese seafood. In contrast, for example, pirosiki refers to the Russian style pie piroshki, which many bakeries sell in Japan. Since its form has an initial /p/ that we rarely find in the common native words, people easily recognize it as a foreign word. In addition, this food has some cultural ties with Russia. Another example is okura ‘gumbo’, which entered the Japanese lexicon via the English okra. This vegetable has spread across the nation during the last couple of decades and is now found all over Japan. If sellers had wanted consumers to notice that this is a foreign vegetable, they could have adopted a form that looks like a foreign word such as kur. Instead, they adopted the native-like okura. This choice has successfully won a great number of consumers who believe that okura is a domestic vegetable. On the other hand, if a loanword connotes a foreign cultural background, as in example (3), native speakers do not believe that it is a native word even if its form apparently looks like a native word. Thus, we have to take into consideration both word forms and connotation.

(3) sonata4 ‘sonata, a form of classical music’

Let us return to rendaku in words like karuta and kappa. What was said about ikura equally applies to karuta and kappa. Although both words were originally borrowed from Portuguese in the 16th century, they are not foreign in terms of a native speaker’s intuition. First, they look like non-foreign words in terms of their forms.5 Second, neither of them connotes anything foreign. As for karuta, it refers to a Japanese style card game, and Japanese people believe that playing karuta belongs to their own tradition. We can also regard kappa as a non-foreign word. In fact, it has a foreign counterpart rein kōto, which comes from the English word raincoat. Although both kappa and rein kōto are daily expressions in present Japanese, they are subtly different from each other in connotation. People prefer rein kōto to kappa in some contexts, because the latter suggests cheaper or less fashionable quality. Another example, illustrating the history of rendaku in loanwords, is karuka in (4), which refers to the stick for loading a bullet into the barrel of a matchlock gun from the muzzle.

(4) karuka ‘stick for matchlock gun’
    kae garuka ‘spare or alternative’ ‘karuka’ (alternative stick for matchlock gun)6

This word is considered a loanword from the Portuguese calcador, which means a tool to press something. It seems that karuka went through truncation in the process of its nativization. Its rendaku form garuka is attested in Zōhyō Monogatari7 published in the middle of the 19th century during the Edo period, and it is natural to assume that its rendaku form dates back to an even earlier time. Karuka was conventionally spelt by two specific Chinese logographs with the usage for native words (kun-reading), which is attested in an older manuscript of Zōhyō Monogatari written in the 17th century. This spelling convention suggests that karuka had already merged into the native word group. Probably it did not take much time to generate the rendaku form after the truncation.

The above examples show that there is a correlation between the merging of foreign loanwords into the non-foreign vocabulary and their rendaku forms. In addition to these examples, the last example we will examine in this section allows us to discuss a somewhat complicated semantic aspect of nativization. Ketto in example (5), which comes from the second half of blanket, referred to a kind of blanket or a kind of blanket-like cloth that gained nationwide currency in the late 19th century.

(5) ketto ‘a kind of cloth’
    aka getto ‘red’ ‘ketto’ (red coloured ketto)

An investigation of its written form in Chinese logographs shows that ketto reminded native speakers of the native word ke, which means ‘wool’. It is probable that this folk-etymological interpretation of ketto provided a moment of deviation from the foreign vocabulary, after which this deviation in turn caused the rendaku aka getto. However, we also have to take into account that aka getto had a kind of pejorative connotation as well. This word also referred to some people in urban areas who were regarded as unsophisticated because they originally came from the countryside. It is pointed out that this metonymic meaning originated from the fact that people from the countryside often came to urban areas dressed in aka getto ‘red ketto’ instead of a cloak or an overcoat. It is important to notice that this compound with an exceptional rendaku in the foreign element refers to an unsophisticated object. We may speculate that such a nativized form was suitable for stigmatizing someone or something as lacking sophistication; by contrast, authentic western objects match foreign words that are less nativized. Although further investigation is needed, aka getto suggests that a connotative effect could be brought into a loanword by rendaku, one of the nativizing processes. This problem is worthy of further investigation in order to clarify the relationship between phonological phenomena and semantic aspects in loanwords.

To sum up this section, rendaku in foreign loanwords takes place only in the words that merged into the non-foreign word group (including both native words and SJ words). Therefore, we conclude that the occurrence of rendaku essentially depends on the difference between the foreign word group and the non-foreign word group. In addition, there are still other problematic rendaku cases, of which some examples will be mentioned in (19) of section 4.
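
The conclusion of this section can be sketched as a toy procedure. The Python below is only an illustration under stated assumptions and is not taken from the article: the names `VOICED_COUNTERPART`, `voice_initial` and `compound` are hypothetical, the mapping covers only the word-initial consonants that occur in the examples of this article (k, s, t, h), and membership in the non-foreign word group is treated as the sole condition, although rendaku is not automatic even for non-foreign words.

```python
# Voiced counterparts of word-initial obstruents in the romanization used
# in the examples (an assumption limited to the cases cited in the text).
VOICED_COUNTERPART = {"k": "g", "s": "z", "t": "d", "h": "b"}


def voice_initial(word: str) -> str:
    # Voice the initial obstruent of the word, if there is one.
    if not word:
        return word
    first = word[0]
    return VOICED_COUNTERPART.get(first, first) + word[1:]


def compound(first: str, second: str, second_is_foreign: bool) -> str:
    # Apply rendaku to the second element only when it counts as non-foreign
    # in the sense of native-speaker intuition, not of etymology.
    if second_is_foreign:
        return f"{first} {second}"
    return f"{first} {voice_initial(second)}"


if __name__ == "__main__":
    print(compound("iroha", "karuta", second_is_foreign=False))
    print(compound("ama", "kappa", second_is_foreign=False))
    print(compound("kami", "koppu", second_is_foreign=True))
```

The flag `second_is_foreign` stands in for the intuitive judgment discussed above; the sketch does not attempt to derive that judgment from phonotactics or connotation.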

2. Rendaku in Sino-Japanese words

Rendaku takes place not only in the above-mentioned foreign loanwords (in the etymological sense) but also in SJ loanwords. However, cases in SJ are more complicated than in foreign loanwords. We easily find a number of cases similar to the examples discussed in section 1, as shown in (6).

(6) kiku ‘chrysanthemum’
    sira giku ‘white’ ‘kiku’
    no giku ‘wild’ ‘kiku’

Although kiku originates from the Classical Chinese kïuk, this word had already joined the native word group in the Heian period (ca. 9th–12th century). One striking piece of evidence is that in those days kiku was commonly and quite often used in Japanese poetry ‘Yamato uta’ or ‘waka’ where the usable expressions were confined to the native word group except for extraordinarily licensed cases. This indicates that the rendaku in compounds with kiku in (6) reflects its membership in the native word group. We therefore can explain rendaku in kiku in the same manner as in section 1, where it was concluded that the foreign loanwords that undergo rendaku are limited to words which have merged into the non-foreign word group. In fact, the form of kiku looks like a native form in terms of phonotactics. This property must have been one of the factors that led kiku to join the native word group. However, not all cases of rendaku in SJ can be treated in the same manner as kiku. We also find a special type of rendaku words that does not follow the pattern we looked at in section 1. Some examples are given in (7) below.

(7) a. hōkō ‘service, labor’
       nenki bōkō ‘years of employment’ ‘hōkō’ (labor or apprenticeship)
       detti bōkō ‘apprentice’ ‘hōkō’ (apprenticeship)
    b. hyakusyō ‘peasant, farmer’
       mizunomi byakusyō ‘water drinking’ ‘hyakusyō’ (lower class peasant)
    c. kēko ‘practice, exercises’
       hatu gēko ‘first, starting’ ‘kēko’ (practice at the beginning of a year)
       kan gēko ‘the coldest season’ ‘kēko’ (practice in the coldest season)
    d. kenka ‘quarrel, blows’
       kyōdai genka ‘brother or sister’ ‘kenka’ (quarrel or fight between siblings)
       kuti genka ‘mouth’ ‘kenka’ (quarrel, dispute)
    e. kesyō ‘makeup’
       atu gesyō ‘thick’ ‘kesyō’ (heavy makeup)
       usu gesyō ‘thin’ ‘kesyō’ (light makeup)
    f. suiryō ‘conjecture, surmise, guess’
       ate zuiryō ‘to shoot at random’ ‘suiryō’ (guess)
    g. syasin ‘photograph’
       kao zyasin ‘face’ ‘syasin’ (photograph of face)
       ao zyasin ‘blue’ ‘syasin’ (blueprint)
    h. teppō ‘gun’
       mizu deppō ‘water’ ‘teppō’ (water pistol, squirt gun)
       kara deppō ‘blank’ ‘teppō’ (a blank shot of gun)
    i. tōrō ‘lantern’
       isi dōrō ‘stone’ ‘tōrō’ (lantern made of stone)
       mawari dōrō ‘rotative’ ‘tōrō’ (rotative lantern)
    j. tōryū ‘stay’
       naga dōryū8 ‘long’ ‘stay’ (long stay)

As pointed out in section 1, the application of rendaku essentially depends on whether a word is foreign or not (not in the etymological sense).9 As far as we look at cases such as kiku in (6), the same situation seems to apply to SJ. However, the examples in (7) are not likely to adapt to the same treatment. These examples have the following phonotactic properties that seldom appear in native words but quite frequently occur in SJ words.

(i) palatalized elements; e.g. in hya, sya, syō, ryō, ryū; especially, those after non-coronal consonants; e.g. in hya, ryō, ryū; moreover, multiple palatalized elements normally occur in a word; e.g. in hyakusyō.
(ii) multiple long vowels in a word; e.g. in hōkō, tōrō, tōryū.
(iii) the sequence NT (voiceless obstruents after nasal); e.g. in kenka.

In the examples in (7) we do not detect any tendency towards native word forms, while we do find such a tendency in (6) kiku and in the examples examined in section 1. This indicates that, unlike foreign words, SJ words undergo rendaku even if they have no similarity to native forms.10 In order to understand the background behind these examples, we need to look at the complex status of SJ in the Japanese lexicon. Of course, it is often said that one of the differences between SJ words and native words relates to the degree of formality or the stylistic diversity. For instance, a native word such as sakana ya in (8) is preferred in informal or colloquial contexts, whereas the SJ word sengyo ten in (8) is preferred in formal contexts. In general, a great number of SJ words constitute nomenclatures of many scholarly fields, official expressions in current topics like politics or economics, and dignified expressions unique to various ceremonies.

(8) a. sakana ya (fish retailer shop)
    b. sengyo ten (fish retailer shop)

However, it is important to also pay attention to the fact that the great majority of SJ words are not uniquely used in formal contexts. Even among SJ words, there are differences in formality or style and we find a great number of SJ words that are more compatible with informal contexts. The examples (9), (10), and (11) illustrate this point.

(9) a. isya (doctor, physician, surgeon, practitioner)
    b. isi (doctor, physician, surgeon, practitioner)

(10) a. teppō (gun)
     b. zyū (gun)

(11) a. hyakusyō (farmer, peasant)11
     b. nōmin (farmer, peasant)

Although both isya and isi in (9) are SJ, isya is a more informal expression preferred in daily colloquial contexts; isi is a stiff expression mostly used in formal contexts such as official statements or documents. A similar difference is observed between teppō in (10) and zyū in (10), or between hyakusyō in (11) and nōmin in (11). Thus, contextual pluralism exists even inside the SJ vocabulary itself. We tentatively call the informal or colloquial side of the SJ vocabulary vulgarized Sino-Japanese12 (the reason why we avoid applying the term group for the vulgarized SJ will be mentioned below). Apart from words such as kiku that merged into the native word group, possible targets of rendaku in the SJ vocabulary are to be sought in the vulgarized SJ words. For example, teppō in (10) and hyakusyō in (11) undergo rendaku as already illustrated in (7) mizu deppō and in (7) mizunomi byakusyō, respectively. All SJ words that undergo rendaku are informal or colloquial expressions, and we can recognize them as vulgarized SJ words (some words with rendaku, like mizunomi byakusyō in (7), are seldom heard in present Japanese but they were used in informal or colloquial contexts). The relationship between SJ words and rendaku is summed up in (12).

(12) Native   |  Sino-Japanese
              |  Vulgarized SJ   |  Formal SJ
     [Possible targets of rendaku: Native, Vulgarized SJ]

Since the difference between Vulgarized SJ and Formal SJ depends on the degree of formality or the stylistic diversity, we cannot exclusively sort all SJ members into two separated groups, and we have to recognize a gray area between these two sides. However, the important thing is that this point does not contradict the fact that all the SJ words that undergo rendaku are or were used in informal or colloquial contexts, i.e., they must be vulgarized SJ words.

Finally, two kinds of voicing phenomenon need to be mentioned. One is the voicing at the morpheme level, as shown in (13). This type of voicing occurs regardless of the vulgarization in SJ words, but it is limited to sporadic cases. Another is the post-nasal voicing in SJ morphemes, as shown in (14), which is attributed to the voicing of obstruents after a nasal element in the native phonology.13

(13) san ‘mountain’
     ka zan ‘fire’ ‘mountain’ (volcano)
     to zan ‘climbing’ ‘mountain’ (mountain climbing)

(14) sya ‘person’
     en zya ‘to play, to perform’ ‘person’ (actor or actress, performer)
     Cf. saku sya ‘to write, to make’ ‘person’ (writer, maker)

It appears that the former phenomenon is historically related to the latter. Further research on these phenomena from the diachronic viewpoint is needed.

3. A difference between foreign and SJ words

Nowadays, a great number of foreign words are indispensable in various situations of daily life and frequently used in colloquial contexts. Nevertheless, the foreign word group shows no symptoms of rendaku derivation, as illustrated in (15), even if many foreign words have become common expressions in present Japanese.

(15) a. *kami goppu ‘paper’ ‘koppu < cup, Dutch’ (paper cup)
     b. *isyō gēsu ‘clothing’ ‘kēsu

gēsu. If rendaku were blocked only by this functional factor, it could take place in some of the foreign words that have no minimal pairs, even if they were confined to a small number of words. Nevertheless, there are no essential exceptions, as we have seen in section 1. Second, the SJ word group is the same as the foreign word group with respect to minimal pairs, because initial voiced obstruents are very common. However, we do find occurrences of rendaku as shown in section 2, even though the rendaku in SJ words is limited in comparison to rendaku in native words.

In order to answer the question as to why the SJ word group and the foreign word group differ from each other with respect to rendaku, we need to separately deal with each word group before trying to address this main question, and we also need to elucidate somewhat obscure aspects of the Japanese lexicon. Since this article does not intend to fully discuss all these issues, only a few brief remarks about the problems involved are made in the final section below.

4. Problems of loanwords in Japanese

The final section focuses on some problems concerning the foreign word group, whose significant growth began in the late 19th century, because it is relatively easy to look at a state of affairs from the present viewpoint. In our times, Japanese people generally tend to respect the original form of words that are borrowed from foreign languages.14 However, in the past, some words were transformed in the process of their popularization, as is reported and discussed by Grootaers (1976) and Sanada (1981, 1991), who investigated the geographical distributions of word forms. Among the dialect forms they investigated, we find a kind of nativization that is rarely observed in recently introduced foreign words.

(17) Variations of syaberu ‘shovel’:
     a. syabori (bori is the rendaku form of hori, the gerund of the verb horu ‘dig’. For details, see Grootaers 1976 and Sanada 1981, 1991)
     b. syabiro (biro is the rendaku form of hiro, the stem of the adjective hiroi ‘wide’. For details, see Grootaers 1976)


(18) Variations of stsyon ‘station’ (For details, see Sanada 1981, 1991):
     a. stensyo (syo is a SJ morpheme, meaning ‘site’ or ‘place’)
     b. tensyoba (ba is a native word, meaning ‘site’ or ‘place’. Cf. the non-foreign word tēsyaba ‘station’, composed of tēsya ‘a stop’ and ba ‘site’ or ‘place’)

A reanalysis inspired by folk-etymology or a blending with native or SJ elements played an important role in the formation of these variants. This displays a tendency for merging newcomers into the familiar existing vocabulary rather than separately constructing a new word group. The word ketto in (5) can be added as another exemplification of this trend, since native speakers associated ke- with the native word ke, as mentioned in section 1. This nativization trend was stronger in the past, though it never became mainstream. Furthermore, we cannot overlook the suggestion of Nakagawa (1966) that some compounds with foreign words occasionally undergo rendaku, as shown in (19), even though these forms are unstable in comparison to non-rendaku forms.15

(19) a. indo garē ‘Indo-’ ‘karē < curry’ (Indian curry)
     b. yama gyanpu ‘mountain’ ‘kyanpu

vocabulary, and we focused particularly on vulgarized SJ words. The bulk of the SJ vocabulary is not the result of daily close contact with spoken Chinese, but rather the result of learning written Chinese or Chinese logographs (so-called Chinese characters), which were the sole medium of written communication in East Asia. Therefore, people using the SJ vocabulary were limited to members of the upper class in the early stages. In the history of the Japanese language, some parts of the SJ vocabulary gradually entered the vocabulary used in colloquial contexts. It is necessary to further investigate this diachronic process.

These are only a few brief remarks on the relationship between phonological phenomena and lexical stratification. Further research from the sociolinguistic viewpoint is indispensable for clarifying the problems of both SJ words and foreign words. We do not claim that the state of affairs in Japanese loanwords is idiosyncratic. Rather, we believe that a similar situation may be observed in many languages, and that it is quite important to investigate the correlation between phonological phenomena and the background of borrowing in many languages.

Acknowledgements

This article is a revised version of Takayama (1999). I am grateful to Jeroen van de Weijer, Kensuke Nanjo, and Tetsuo Nishihara for providing an opportunity to contribute to this volume. I am indebted to Jeroen van de Weijer and anonymous reviewers for many helpful comments. I would like to thank Paul Hoornaert for useful suggestions for improving the English expressions. Special thanks go to Yoiko Aoyama for valuable information on rendaku words. Of course, I take ultimate responsibility for all the inadequacies and errors that remain. This work was partially supported by the Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science, No. 15520285, 2003–2004.


Notes

1. The Sino-Japanese vocabulary, which occupies an important place in the Japanese lexicon, derives from the Chinese language, mainly before the 10th century, through the learning of the Chinese logographic system (so-called Chinese characters).
2. There are various kinds of karuta, and each kind is named using a compound with karuta such as iroha garuta. However, in the written forms of these compounds, we encounter non-rendaku forms such as {iroha karuta}, {haikai karuta} ({} indicates the transliteration of kana phonograms). I think that this is due to the fact that those written forms do not always directly reflect their sound forms, and/or to the fact that they are often pronounced without rendaku voicing.
3. Iroha is a series of kana syllabaries like an alphabet, ordered i, ro, ha, ni, etc.
4. There is another sonata in the native vocabulary, which is an antiquated expression referring to the second person (singular). Nowadays, this word is used only in historical plays or dramas.
5. Vance (1987: 141) argues that kappa is not a virtual native word but is rather virtual Sino-Japanese. However, in some cases, it is hard for native speakers to determine by intuition whether a word is native or SJ. Kappa is one of those cases. The boundary between these two groups is not always clear inside the non-foreign vocabulary. The boundary between foreign and non-foreign is clearer, although there are also vague cases.
6. In ‘Zōhyō Monogatari’ (see note 7), we find a compound like futo+karuka (thick stick). But we cannot determine whether its form was karuka or garuka. Although the kana phonogram system has a diacritic mark, dakuten, to indicate voiced obstruents, it was not a compulsory element in the Edo period. Without this mark, we have no decisive clue to the rendaku form.
7. ‘Zōhyō Monogatari’ (Tales of common soldiers) is a collection of stories told by experienced common soldiers. Its first manuscript probably appeared in the 17th century.
8. Some speakers use a form without rendaku, naga try.
9. When saying that rendaku can apply to a word group, this does not mean that rendaku applies to all targets belonging to that group.
10. There is a phonotactic constraint in the native vocabulary: more than one voiced obstruent per morpheme is prohibited. This constraint explains why rendaku is blocked in words that already have a voiced obstruent (see Kubozono, this volume, Itô & Mester 1986, Yamaguchi 1988, and Haraguchi 2002). As illustrated in (7), every SJ word that undergoes rendaku has no voiced obstruent by itself. Although a SJ word generally comprises two morphemes, it behaves like one simplex word with regard to this point.
11. Since hyakusyō often has a pejorative connotation, a form with honorific elements, o-hyakusyō-san, is preferred.
12. It is often said that the honorific prefix o has a tendency to be added to native words, such as o-tukue (desk), o-sara (dish), o-hana (flower), whereas the other honorific prefix go has a tendency to be added to Sino-Japanese words, such as go-tōtyaku (arrival), go-syōkai (introducing). The prefix o is also often added to vulgarized SJ words, such as o-keiko, o-kesyō, o-syasin. At the same time, we find many examples with go in vulgarized SJ words, such as go-h k , gosuiry . When accounting for all these examples, we need to pay attention to the fact that the meaning of o is not exactly equivalent to that of go.
13. In Optimality Theory analyses of Japanese phonotactics, this is expressed by the constraint *NT. It is said that the contrast between voiced and voiceless obstruents is historically derived from the contrast between prenasalized consonants and plain ones. At the earlier historical stages, the sequence ND can be analyzed as the gemination of prenasalized consonants (Rice’s article in this volume also discusses this point).
14. Truncation is applied to a great number of long foreign words, as shown in paso con derived from psonaru konpjt (personal computer). This might appear incompatible with the above-mentioned inclination, but this is not the case, because it is a matter of word length in the Japanese language, which is not a subject of our discussion.
15. According to my wife’s recollection, her grandmother, who was born in 1905, used some foreign words with rendaku.

Recognizing Japanese numeral-classifier combinations Keiichiro Suzuki

1. Introduction

The main goal of this paper is to present the results of a study conducted to improve the performance of Large-Vocabulary Continuous Speech Recognition (LVCSR) by modeling context-dependent pronunciation variation (i.e. morphophonemic alternation) and context-independent pronunciation variation (i.e. free variation). In particular, I report the results of performance tests run on numeral-classifier combinations in Japanese (e.g. ni-hon ‘two stick-type objects’, san-bon ‘three stick-type objects’), showing how the accuracy of our Japanese LVCSR engine was improved by modeling both kinds of variation. On the one hand, these numeral-classifier combinations are a typical subject of phonological/morphological study, displaying linguistically significant, “regular” morphophonemic voicing alternation patterns. On the other hand, the same set of data shows linguistically insignificant free variation involving voicing. I demonstrate that these two types of pronunciation variation are captured by the same process of statistical adjustment in our LVCSR engine.

The secondary goal of this paper is to offer a phonological audience a glimpse of research in the area of Automatic Speech Recognition (ASR) and to contribute to the knowledge transfer between the two disciplines. While much attention has been paid to the inter-disciplinary study between phonology and cognitive science, not much discussion has been generated between phonology and speech engineering. The practice of computational phonology (Bird 1995) does exist; however, it studies implementations of theoretical phonology, which is not the same as research aimed at improving ASR systems.

Note that it is not the goal of this paper to offer some particular linguistic insight. Rather, this paper presents an alternative look at the Japanese numeral-classifier combinations, a typical subject for a phonological analysis, from the viewpoint of a commercial LVCSR.

This paper is organized as follows. In the remainder of this section, I provide a brief overview of our LVCSR. In Section 2, I describe the problem that we aimed to resolve. In Section 3, I discuss the solution we adopted to resolve the problem. Section 4 gives the test results that verify that our solution was successful. Finally, Section 5 concludes the paper.

1.1. Brief overview of LVCSR

LVCSR takes the incoming acoustic stream as its input and outputs a text string that matches the input. Thus, the goal of ASR systems is to find the string with the highest probability given the input acoustic stream: what is the most likely sentence W out of all sentences in the language L given some acoustic input X? This can be expressed as:

(1) a. Acoustic input = sequence of observations: $X = x_1, x_2, \ldots, x_t$
    b. Output sentence = sequence of words: $W = w_1, w_2, \ldots, w_n$
    c. The goal of ASR: $W = \operatorname{argmax} P(W \mid X)$

The term argmax in (1c) selects the argument, here the word sequence W, for which the formula following it (the probability of the output word sequence given the sequence of acoustic input) is maximized. The formula in (1c) straightforwardly expresses the goal of ASR: to find the string of words W that maximizes P(W | X). Rather than trying to solve the formula as is, we apply Bayes’ rule to (1c) to get the mathematically equivalent formula (2):

(2) $W = \operatorname{argmax} \dfrac{P(X \mid W)\,P(W)}{P(X)}$

It is quite difficult to compute the posterior probability P(W | X) directly. Bayes’ rule provides a way of calculating it from P(W), P(X), and P(X | W), where P(X | W) is the likelihood of the acoustic input sequence given the word sequence. In (2), the denominator P(X) plays no role in the maximization, since the probability of the given acoustic input is the same for every potential output sentence. Removing the denominator leaves P(X | W)*P(W), and this is THE golden formula for any ASR system:

(3) $W = \operatorname{argmax} P(X \mid W)\,P(W)$
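The step from (2) to (3) can be checked on a toy example. The following is only a minimal sketch: the three candidate strings and all probability values are invented for illustration, and the function name `decode` is hypothetical rather than part of any engine described here.

```python
# Toy illustration of W = argmax P(X | W) * P(W).
# All candidates and probabilities are invented for illustration only.

def decode(acoustic_likelihood, language_prior):
    """Return the candidate W that maximizes P(X | W) * P(W)."""
    return max(
        acoustic_likelihood,
        key=lambda w: acoustic_likelihood[w] * language_prior[w],
    )

# Hypothetical P(X | W) for one fixed acoustic input X.
p_x_given_w = {"ni hon": 0.30, "ni bon": 0.35, "ni pon": 0.35}
# Hypothetical P(W) for the same candidates.
p_w = {"ni hon": 0.90, "ni bon": 0.05, "ni pon": 0.05}

best = decode(p_x_given_w, p_w)

# P(X) is the same for every candidate (here computed over the toy
# candidate set only), so dividing by it cannot change the winner.
p_x = sum(p_x_given_w[w] * p_w[w] for w in p_w)
best_posterior = max(p_w, key=lambda w: p_x_given_w[w] * p_w[w] / p_x)

print(best, best_posterior)
```

In this toy setting the acoustic scores alone slightly prefer the wrong forms, and it is the prior P(W) that tips the decision, which is the role the language model plays in the rest of the paper.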


The first part, P(X | W), is calculated by what we call the Acoustic Model (AM), and the second part, P(W), by what we call the Language Model (LM).

The AM is a collection of probabilistic sound sequences for a given word. In our LVCSR, the AM is based on the Hidden Markov Model (HMM), which models an observation sequence probabilistically, without deterministic information about which state produced each observation (hence the name Hidden). Many state-of-the-art ASR systems employ some form of HMM. Our AM treats each state in the HMM as a basic subphonetic unit called a senone (Hwang and Huang 1993). Senones are the units composing a triphone (a context-dependent phone), which consists of the left context, the phone, and the right context (e.g. /to/ consists of two triphones, <sil>-t+o and t-o+<sil>, where <sil> is silence). The parameters (probability values) of our AM are estimated automatically from hundreds of hours of acoustic data. Once the AM is trained, the spectral features extracted from the acoustic input are computed and matched against the probable phone sequence for the candidate word.

The LM is the model for determining the probability of a word sequence w1, w2, … wn, namely P(w1, w2, … wn). This probability is broken down into its component probabilities by the Chain Rule:

(4) $P(w_1, w_2, \ldots, w_n) = P(w_1)\,P(w_2 \mid w_1) \cdots P(w_n \mid w_1, w_2, \ldots, w_{n-1}) = \prod_{i=1}^{n} P(w_i \mid w_1^{i-1})$

where $w_1^{i-1}$ abbreviates the history $w_1, w_2, \ldots, w_{i-1}$.

Since it may be difficult to compute P(wi | w1, w2, … wi-1) even for moderate values of i, we typically assume that the probability of a word depends only on a fixed number of previous words, namely the N-1 preceding words. This leads to an n-gram language model of order N:

(5) $P(w_1^n) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-N+1}, w_{i-N+2}, \ldots, w_{i-1})$

If the probability of a word depends on the previous two words (N=3), we have a trigram (6c). Similarly, the model is called a unigram when N=1 (6a) and a bigram when N=2 (6b). The trigram language model is used in most commercial LVCSR systems today.

(6) a. Unigram: $P(w_1^n) \approx \prod_{i=1}^{n} P(w_i)$
    b. Bigram: $P(w_1^n) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-1})$
    c. Trigram: $P(w_1^n) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-2}, w_{i-1})$
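To make the conditioning contexts in (6) concrete, the sketch below lists, for each word, the N-1 preceding words that an n-gram model of order N conditions on. The function name `ngram_contexts`, the placeholder tokens, and the use of a `<s>` padding symbol at the start of the sequence are assumptions made for this illustration.

```python
def ngram_contexts(words, n):
    """Pair each word with the n-1 preceding words an n-gram model conditions on.

    The sequence is padded on the left with n-1 "<s>" symbols (an
    assumption of this sketch) so that the first words also have a context.
    """
    padded = ["<s>"] * (n - 1) + list(words)
    return [
        (tuple(padded[i - (n - 1):i]), padded[i])
        for i in range(n - 1, len(padded))
    ]

sequence = ["w1", "w2", "w3", "w4"]  # placeholder tokens

unigrams = ngram_contexts(sequence, 1)  # empty context for every word
bigrams = ngram_contexts(sequence, 2)   # one preceding word
trigrams = ngram_contexts(sequence, 3)  # two preceding words

for context, word in trigrams:
    print(f"P({word} | {', '.join(context)})")
```

Each returned pair corresponds to one factor of the products in (6); the number of distinct contexts, and hence the amount of training data needed, grows with N.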

Applying this to Japanese, for example, consider the phrase gakkooniiku ‘going to school’. Since there is no white space to delimit words in Japanese texts, we take what we consider a morpheme as the base unit, such as gakkoo ‘school’, ni ‘to’, i ‘(to) go’, ku ‘[present tense]’. Using bigrams, the probability of this sentence is calculated as:

(7) P(gakkoo, ni, i, ku) = P(gakkoo | <s>) * P(ni | gakkoo) * P(i | ni) * P(ku | i) * P(</s> | ku)
where <s> and </s> are the placeholders for “sentence initial” and “sentence final”.
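The calculation in (7) can be sketched with unsmoothed relative-frequency estimates. The four-sentence corpus below is invented purely for illustration (including the extra morphemes e and uchi), and the function names are hypothetical; a real LM would be trained on far larger corpora and smoothed.

```python
from collections import Counter

def train_bigram(sentences):
    """Collect history and bigram counts, adding <s> and </s> to each sentence."""
    history_counts, bigram_counts = Counter(), Counter()
    for words in sentences:
        padded = ["<s>"] + list(words) + ["</s>"]
        for prev, cur in zip(padded, padded[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return history_counts, bigram_counts

def sentence_probability(words, history_counts, bigram_counts):
    """Multiply unsmoothed estimates count(prev, cur) / count(prev), as in (7)."""
    padded = ["<s>"] + list(words) + ["</s>"]
    prob = 1.0
    for prev, cur in zip(padded, padded[1:]):
        if history_counts[prev] == 0:
            return 0.0  # unseen history: no estimate without smoothing
        prob *= bigram_counts[(prev, cur)] / history_counts[prev]
    return prob

# Invented toy corpus, already segmented into morphemes.
toy_corpus = [
    ["gakkoo", "ni", "i", "ku"],
    ["gakkoo", "ni", "i", "ku"],
    ["gakkoo", "e", "i", "ku"],
    ["uchi", "ni", "i", "ku"],
]
history_counts, bigram_counts = train_bigram(toy_corpus)

p_seen = sentence_probability(
    ["gakkoo", "ni", "i", "ku"], history_counts, bigram_counts
)
p_unseen = sentence_probability(
    ["uchi", "e", "i", "ku"], history_counts, bigram_counts
)
print(p_seen, p_unseen)
```

The second sentence contains a bigram that never occurs in the toy corpus, so its probability collapses to zero even though every individual morpheme was seen.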

The current practice is to use large text corpora to calculate the n-gram probabilities. It follows that the larger the text corpora, the better the n-gram coverage. Even with large corpora, there will always be many word sequences, especially trigrams, that get zero probability. There are various discounting and smoothing techniques to circumvent this data sparseness problem, but I will not discuss them here (see Huang, Acero, and Hon 2001 for more details).

In order for the recognizer to know the phonetic content of the word sequences stored in the LM, a module called the lexicon acts as the database storing the pronunciation(s) of each word counted in the LM. Many LVCSR systems are equipped with a lexicon containing over 100,000 words for a given language. The recognizer is only capable of recognizing words that are listed in the lexicon. Thus, it is safer to have a large lexicon to avoid out-of-vocabulary errors.

The above is a quick introduction to ASR and specifically to LVCSR. Besides the AM, the LM, and the lexicon, there are two more important pieces of the system: the Front End and the Decoder. I will not cover these, since they are not relevant to the main discussion of this paper (for more in-depth introductions to ASR, see Jurafsky and Martin 2000; Huang, Acero, and Hon 2001; Shikano et al. 2001).

2. The Problem

2.1. Japanese numeral-classifier combinations

Classifiers (助数詞) in Japanese attach to numerals (数詞) to express the type of object being counted, e.g. ni-mai (二枚) ‘two thin paper-like objects’, san-mai (三枚) ‘three thin paper-like objects’. Some numeral-classifier combinations are simple, in the sense that the pronunciation of the combination is just a concatenation of the pronunciations of its individual parts. For example, we get ni-mai (二枚) ‘two thin paper-like objects’ by adding the pronunciation of the classifier part mai to the numeral part ni. However, the majority of numeral-classifier combinations are complex, in the sense that the pronunciation of the whole is not simply the combination of the two parts.

There are three notable characteristics of these complex numeral-classifier combinations. First, some classifiers have multiple pronunciations, and the pronunciation of the particular classifier is determined by the preceding numeral. For example, the same classifier /hon/ (本) ‘stick-shape object’ is pronounced as pon in ip-pon (一本) ‘one stick-shape object’, hon in ni-hon (二本) ‘two stick-shape objects’, and bon in san-bon (三本) ‘three stick-shape objects’. Second, some numerals themselves have multiple pronunciations, and the particular numeral’s pronunciation depends on the following classifier. For example, the same numeral /ichi/ (一) ‘one’ is pronounced as ichi in ichi-ban (一番) ‘the first’, ip in ip-pon (一本) ‘one stick-shape object’, and hito in hito-tsuki (一月) ‘one month’. Finally, some numeral-classifier combinations have multiple pronunciations that are in free variation. For example, for the term ‘one stock’ (一株), it is equally plausible for a native Japanese speaker to pronounce it as ichi-kabu, ik-kabu, or hito-kabu. Similarly, ‘eight cups (of)’ (八杯) can be pronounced as hachi-hai or hap-pai. The existence of such variations was validated by consulting native Japanese-speaking colleagues at Microsoft.

The three characteristics of the complex numeral-classifier combinations are summarized below.

(8) The characteristics of the complex numeral-classifier combinations
a. For some classifiers, their pronunciation varies depending on what the preceding numeral is.
b. For some numerals, their pronunciation varies depending on what the following classifier is.
c. Some numeral-classifier combinations have multiple pronunciations that are in free variation.
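The three characteristics in (8) can be illustrated with a small lookup table built only from the examples cited above. This is a sketch for exposition, not the representation used in our engine; the table layout and the helper names are assumptions.

```python
# Surface forms keyed by (numeral, classifier), using only the examples
# cited in the text. Each value lists all acceptable pronunciations.
PRONUNCIATIONS = {
    ("ni", "mai"): ["ni-mai"],
    ("san", "mai"): ["san-mai"],
    ("ichi", "hon"): ["ip-pon"],
    ("ni", "hon"): ["ni-hon"],
    ("san", "hon"): ["san-bon"],
    ("ichi", "ban"): ["ichi-ban"],
    ("ichi", "tsuki"): ["hito-tsuki"],
    ("ichi", "kabu"): ["ichi-kabu", "ik-kabu", "hito-kabu"],
    ("hachi", "hai"): ["hachi-hai", "hap-pai"],
}

def is_simple(numeral, classifier):
    """True if the only pronunciation is the plain concatenation of the parts."""
    return PRONUNCIATIONS[(numeral, classifier)] == [f"{numeral}-{classifier}"]

def has_free_variation(numeral, classifier):
    """True if more than one pronunciation is acceptable, as in (8c)."""
    return len(PRONUNCIATIONS[(numeral, classifier)]) > 1

def classifier_allomorphs(classifier):
    """All surface shapes of a classifier across numerals, as in (8a)."""
    return {
        form.split("-")[1]
        for (_, c), forms in PRONUNCIATIONS.items()
        if c == classifier
        for form in forms
    }

def numeral_allomorphs(numeral):
    """All surface shapes of a numeral across classifiers, as in (8b)."""
    return {
        form.split("-")[0]
        for (n, _), forms in PRONUNCIATIONS.items()
        if n == numeral
        for form in forms
    }

print(sorted(classifier_allomorphs("hon")))
print(sorted(numeral_allomorphs("ichi")))
```

A recognizer that assumed every combination to be simple would, for instance, expect ichi-hon where speakers say ip-pon.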

The first two characteristics (8a,b) are typical cases of context-dependent morphophonemic alternation, both of which can be a subject of phonological/morphological study in theoretical linguistics. On the other hand, the third characteristic (8c), being completely context-independent, may not be something that would attract theoretical linguists’ interest. In contrast, commercial ASR systems are required to deal with both the context-dependent and the context-independent characteristics of the complex numeral-classifier combinations, because the users of the Japanese LVCSR are likely to produce any of the free variants.

2.2. The description of the problem

Our Japanese LVCSR engine was having trouble dealing with numeral-classifier combinations. The accuracy of the engine was noticeably lower when the dictated sentence included certain numeral-classifier combinations, and so I conducted a large-scale study to tackle the problem. The essence of the problem was the following. When we train our LM, billions of raw text data from various corpora are processed to 1) produce a list of the words appearing in the data, which go into the lexicon, and 2) produce n-gram counts by calculating the frequencies of each word in the lexicon. We had a problem at each of these two points in the LM creation process. For 1), no check was made to ensure that all pronunciation variants were listed in the lexicon. Thus, even though our lexicon contained the item /hon/ (本) with the pronunciation hon, it might have lacked pronunciation variants such as pon and bon. For 2), the pronunciation variation might be modeled incorrectly by treating all numeral-classifier combinations as simple. If the combination was a context-dependent one, the dependencies in (8a,b) had to be resolved to get its correct pronunciation, but this step was simply ignored. Moreover, even once the dependencies were resolved, the correct pronunciation might have a free variant, and no mechanism existed to assign appropriate probabilities to these free variants. The three problem areas that we identified at the two points in the LM training process are summarized in (9) below.

(9) The problem areas
a. Lexicon creation stage: no checking was done on the lexicon to make sure that all pronunciation variants are listed in the lexicon.
b. N-gram count file creation stage: no attention was paid to the pronunciation variation for numeral-classifier combinations.
c. N-gram count file creation stage: no mechanism existed for assigning appropriate probabilities to free variants.

In the next section, I discuss how we dealt with each of the above problem areas.

3. The Fix

3.1. Lexicon check

In order to make sure that all pronunciation variants are covered in the lexicon (9a), I went through the lexicon and checked whether all pronunciation variants for a given numeral or a given classifier were listed. This process was necessary, since the pronunciation of a numeral-classifier combination is not always regular. The actual method was the following. First, I identified 65 representative classifiers, selected from a manually created base list covering the most frequent classifiers. Then I produced tables listing 1) all pronunciation variants of these classifiers and 2) all pronunciation variants of the numerals 0 through 10. Exhausting the possible pronunciations for each numeral and classifier was a manual process as well. Finally, I went through the lexicon and added an entry whenever a pronunciation variant in the two tables was not listed in the original lexicon. This final process was automated, as we had a tool to query the lexicon to confirm whether the pronunciation in question exists.

Note that there is an alternative solution in which we add individual numeral-classifier combinations as lexical entries in addition to the individual numerals and classifiers. Adding the numeral-classifier combinations would provide direct mappings between the particular numeral-classifier combination and its pronunciation variation. However, such a brute-force solution was avoided, since we did not wish to distort the probability distribution by introducing all sorts of numeral-classifier combinations to the lexicon. The size of the n-grams would increase unnecessarily if we added individual numeral-classifier combinations like ip-pon (一本) ‘one stick-shape object’, ni-hon (二本) ‘two stick-shape objects’, san-bon (三本) ‘three stick-shape objects’, … hyap-pon (百本) ‘one hundred stick-shape objects’, … sen-bon (千本) ‘one thousand stick-shape objects’ to the lexicon. Moreover, it would be difficult to identify the appropriate upper limit of the numerals for the combination. Thus, we decided to leave the choice of the correct pronunciation variant for a given combination to the n-grams.

Another possibility was to introduce an intermediate level instead of using the golden formula in (3) directly (Cremelie and Martens 1995, 1997, 1999; Strik and Cucchiarini 1999; Fukada et al. 1998, 1999):

(10) Intermediate level
$W = \operatorname{argmax} P(X \mid V)\,P(V \mid W)\,P(W)$

In this formula, there is an intermediate level V of pronunciation variants on which the likelihood of X is conditioned. P(V | W) expresses the probability of the variants given the words, and P(W) represents the probability of the word sequence. The task then is to collect pronunciation variants for the given word sequence. However, we did not take this option, since in this model the context-dependence of pronunciation variants is not modeled directly in the LM. As we saw in (8a,b), except for those in free variation (8c), the majority of complex numeral-classifier combinations are context-dependent, and thus it is better to model the variation directly in the n-grams. Strik and Cucchiarini (1999) provide an excellent overview of other possible approaches to pronunciation variation modeling.

Exhaustively listing the pronunciation variants for each classifier in the lexicon (9a) was a prerequisite to the next step, the adjustment of probabilities for numeral-classifier combinations.
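The automated final step of the lexicon check can be sketched as follows. The dictionary-of-sets lexicon, the function name `add_missing_variants`, and the tiny variant tables are assumptions made for this illustration; the in-house query tool mentioned above is not reproduced here.

```python
def add_missing_variants(lexicon, variant_table):
    """Add every listed pronunciation variant that the lexicon lacks.

    lexicon maps an entry to the set of its known pronunciations and is
    updated in place. Returns the (entry, pronunciation) pairs that were added.
    """
    added = []
    for entry, variants in variant_table.items():
        known = lexicon.setdefault(entry, set())
        for pronunciation in variants:
            if pronunciation not in known:
                known.add(pronunciation)
                added.append((entry, pronunciation))
    return added

# Hypothetical starting lexicon: only the citation forms are present.
lexicon = {"hon": {"hon"}, "ichi": {"ichi"}}

# Manually compiled variant tables (toy excerpts based on the examples above).
classifier_variants = {"hon": ["hon", "pon", "bon"]}
numeral_variants = {"ichi": ["ichi", "ip", "ik", "hito"]}

added = add_missing_variants(lexicon, classifier_variants)
added += add_missing_variants(lexicon, numeral_variants)
print(added)
```

After this step the lexicon lists the variants, but nothing yet tells the recognizer which variant goes with which neighbor; that information has to come from the n-gram counts.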

3.2. Explicit n-gram extrapolation

Adding the pronunciation variants to the lexicon does not provide “context” information about when a variant pronunciation occurs. It is necessary to calculate the n-grams with the variants, so that the specific numeral-classifier combination yields a particular pronunciation of the whole (9b).

In order to directly model the numeral-classifier combinations in the LM, we used explicit n-gram extrapolation. That is, we manually increased the n-gram counts to cover data unseen in the corpus. Since some numeral-classifier combinations were not seen in our corpus, their n-grams would be under-trained. The sparseness of data is a common problem during the training of LMs, and individual combinations of numerals and classifiers would have a probability too low to be a factor in the LM. Thus, we first divided the 65 classifiers into three tiers based on their frequency of occurrence in our corpora: Tier 1 classifiers included /hon/ 本 ‘stick-shape object’, /en/ 円 ‘yen’, etc., Tier 2 classifiers included /hyoo/ 票 ‘number of votes’, /shoo/ 勝 ‘number of wins’, etc., and Tier 3 classifiers included /seki/ 隻 ‘number of ships’, /kumi/ 組 ‘group of’, etc. Then, during the training, we explicitly increased the count for the numeral-classifier sequences that had a relatively lower frequency count within the tier. This resulted in an equal distribution for the numeral-classifier combinations within the same tier. For example, if there are 1,000,000 occurrences of the numeral-yen sequence (i.e. (n円) ‘n yen’ where n is 0–10), all the Tier 1 classifiers will be counted as occurring 1,000,000 times.

In addition to the probability adjustment for the numeral-classifier combinations within a tier, we also smoothed the probability distribution over the different numerals (0–10) for a given classifier. Most, if not all, classifiers have significantly larger counts for their combination with /ichi/ (一) ‘one’ than for the other numerals. This would cause the pronunciation variant for the one-classifier combination (e.g. pai in ip-pai (一杯) ‘one cup of’) to be so strong that it would inappropriately win out for the other numeral-classifier combinations (e.g. *ni-pai (二杯) ‘two cups of’ instead of the correct ni-hai). Thus, we took the count for the one-classifier combination to be the base count for the rest of the numerals (0–10 except 1). For example, we explicitly added ni-hai (二杯) ‘two cups of’, san-bai (三杯) ‘three cups of’, ... jyup-pai (十杯) ‘ten cups of’ for each occurrence of ip-pai (一杯) ‘one cup of’ in the data.

Some classifiers do not take zero as a numeral (*zero-choome (0丁目) ‘zero street address (?)’) or take it with extremely low probability (?zero-hai (0杯) ‘zero cups (?)’), so zero was discounted from the count of numeral-classifier combinations for these particular classifiers. By incorporating the explicit n-gram extrapolation, it was possible to directly model the pronunciation variants for both numerals and classifiers in the n-grams, thereby resolving the problem in (9b).
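The two count adjustments described in this subsection can be sketched as follows. This is only an illustration under simplifying assumptions: counts are plain integers per classifier or per numeral, the figure for hon is invented, zero is simply left out for classifiers that do not take it, and the function names are hypothetical.

```python
def level_within_tier(classifier_totals, tier):
    """Raise every classifier in the tier to the highest total in that tier."""
    target = max(classifier_totals[c] for c in tier)
    return {c: target for c in tier}

def extrapolate_numerals(count_for_one, numerals=range(0, 11), excluded=()):
    """Give every numeral the count observed for 'one', skipping excluded numerals."""
    return {n: count_for_one for n in numerals if n not in excluded}

# Step 1: equalize classifiers within a tier.
# The count for "en" follows the example in the text; "hon" is invented.
classifier_totals = {"en": 1_000_000, "hon": 400_000}
tier1 = ["en", "hon"]
leveled = level_within_tier(classifier_totals, tier1)

# Step 2: use the count of the one-classifier combination as the base
# count for the other numerals of the same classifier.
hai_counts = extrapolate_numerals(count_for_one=500)
# A classifier that does not combine with zero: leave zero out.
choome_counts = extrapolate_numerals(count_for_one=500, excluded={0})

print(leveled)
print(hai_counts[2], hai_counts[10], 0 in choome_counts)
```

In the actual training data each numeral carries its own surface form (ni-hai, san-bai, jyup-pai, and so on), so equalizing the counts also equalizes the evidence for each context-dependent variant.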

3.3. Explicit n-gram extrapolation for free variation

One final issue to resolve was (9c), the modeling of free variants. Here again, we used explicit n-gram extrapolation. The assumption here is that the relative probability of the free variants is unpredictable. Based on this assumption, we assigned an equal frequency count to all the free variants of a particular numeral for a given classifier. For example, we gave an equal distribution to each of the free variants ichi-kabu, ik-kabu, and hito-kabu for (一株) ‘one stock’. This makes it possible to model the free variation directly in the n-grams, making these variants of a numeral equally probable for a given classifier.

After the n-gram counts were manually adjusted, we applied smoothing and adjusted the backoff weighting to minimize the side effects (see Huang, Acero, and Hon 2001 for more details on smoothing and backoff).

Having rebuilt our LM using the explicit n-gram extrapolation, we tested the performance of our Japanese LVCSR to measure the improvement. In the next section, I discuss the test procedure and the results.
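The equal treatment of free variants can be sketched in a few lines. The count value and the function names are assumptions for illustration; whether the original count is copied to each variant, as here, or divided among them does not affect the resulting relative frequencies.

```python
def equalize_free_variants(count, variants):
    """Assign the same frequency count to every free variant."""
    return {variant: count for variant in variants}

def relative_frequencies(counts):
    """Convert counts to relative frequencies over the variant set."""
    total = sum(counts.values())
    return {variant: c / total for variant, c in counts.items()}

# Free variants of 'one stock' from the text; the count 300 is invented.
kabu_counts = equalize_free_variants(300, ["ichi-kabu", "ik-kabu", "hito-kabu"])
kabu_probs = relative_frequencies(kabu_counts)
print(kabu_probs)
```

Each of the three variants ends up with a relative frequency of one third, which is the "equally probable" outcome described above, before any smoothing or backoff adjustment.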

4. The Test

4.1. Test Procedure

The test was conducted with data consisting of sentences containing the 65 representative classifiers and was run against our Japanese LVCSR engine. The 65 representative classifiers were divided into two groups: 12 h-initial classifiers and 53 non h-initial ones. This was because h-initial classifiers show the most variability, and the performance on the h-initial classifiers was identified as particularly critical to the overall performance. For example, the h-initial set contained sentences like ‘I had three cups of coffee this morning’, where the otherwise h-initial classifier hai appears as bai in san-bai ‘three cups of’.

The test sets contained a total of 65 base sentences, and the numerals 1 through 10 were substituted to produce a total of 650 (65*10) sentences: 120 for the h-initial test set (12*10) and 530 for the non h-initial test set (53*10). Then, 16 kHz recordings of all 650 sentences were collected from 6 speakers (3 male and 3 female). Two wave files were created for each speaker: one for the h-initial set, the other for the non h-initial set. These wave files were fed to our automated accuracy test tool, which outputs the recognition accuracy results for a given version of the engine. Both the h-initial and the non h-initial sets were tested in SI (Speaker Independent) mode.


4.2. Results

We recorded the results of the accuracy tests for Versions 1 through 4, where Version 4 was the newest incarnation of our speech recognition engine. As the versions progressed, we incrementally added fixes to improve the accuracy of numeral-classifier combinations. The test result consists of the following two numbers per system: Word Accuracy Rate (WAR) and Numeral-Classifier combination Accuracy Rate (NCAR).

(11) Accuracy Results Numbers
a. WAR (Word Accuracy Rate): 100 - WER
b. WER (Word Error Rate): 100*(#Insertion Errors+#Deletion Errors+#Substitution Errors)/(#Words)
c. NCAR (Numeral-Classifier combination Accuracy Rate): 100*(#correct numeral-classifier combinations)/(#numeral-classifier combinations)

WAR is the rate obtained by subtracting the Word Error Rate (WER) from 100 (%). WER is based on how much the output string returned by the recognizer differs from the correct string for a given test set, and is calculated as in (11b). For a given test set, WAR indicates how likely the recognizer is to produce the correct recognition results. For example, for the correct string “I had three cups of coffee this morning”, consisting of 8 words, if the hypothetical output of the recognizer was “I hid three cups of cold feet morning” (“hid” is substituted for “had”, “cold” is inserted, “feet” is substituted for “coffee”, “this” is deleted), then the WER is 100*(1+1+2)/8 = 50%, and the WAR is 50% (100-50).

NCAR represents the accuracy specific to the numeral-classifier combinations in a given test set. For each numeral-classifier combination, I gave the value “1” if the output contained the correct numeral-classifier combination; otherwise, I gave “0”. Note that insertion errors were not counted in the calculation of NCAR. As long as the output contained the correct string, it was counted as correct. Thus, NCAR is calculated as in (11c). For example, taking the previous hypothetical output, which contains one instance of the numeral-classifier combination “three cups of”, the result string “I hid three cups of cold feet morning” would yield an NCAR of 100*1/1 = 100%. The test sets contain a total of 65 (12 for h-initial, 53 for non h-initial), so the maximum number of correct numeral-classifier combinations is 65 for the two sets.

The table of the overall test results is shown below in (12). The accuracy rates for each test set (h-initial and non h-initial) against each system (1–4) are included (rounded for readability). The averages of the h-initial and non h-initial numbers are given in the bottom two rows. The Version 1 engine did not have any fix for the numeral-classifier problem. The Version 2 and 3 engines incorporated partial fixes for the problem by extending the coverage of numeral-classifier combinations. The Version 4 engine implemented the LM that had full coverage of the targeted 65 classifiers and their pronunciation variants, along with the pronunciation variants for the numerals. The baseline data were obtained by running identical test sets with a commercial third-party Japanese LVCSR engine.

(12) Overall test results

              Version 1     Version 2     Version 3     Version 4     Baseline
              h-ini  non    h-ini  non    h-ini  non    h-ini  non    h-ini  rest
WAR            77    77      80    81      82    90      88    92      80    84
NCAR           23    24      59    38      67    76      76    82      47    64
WAR (Avr.)       77.0          80.4          85.8          90.2          81.8
NCAR (Avr.)      23.3          48.7          71.3          78.9          55.2
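The measures defined in (11) can be computed as in the following minimal sketch, which reproduces the worked example from the text. The error counts are supplied by hand, following the alignment described for that example; the function names are assumptions, and this is not the automated accuracy test tool used in the study.

```python
def word_error_rate(insertions, deletions, substitutions, n_reference_words):
    """WER as in (11b): 100 * (I + D + S) / number of words in the correct string."""
    errors = insertions + deletions + substitutions
    return 100.0 * errors / n_reference_words

def word_accuracy_rate(insertions, deletions, substitutions, n_reference_words):
    """WAR as in (11a): 100 - WER."""
    return 100.0 - word_error_rate(
        insertions, deletions, substitutions, n_reference_words
    )

def ncar(expected_combinations, recognizer_outputs):
    """NCAR as in (11c): score 1 if the output contains the expected string, else 0."""
    hits = [
        1 if expected in output else 0
        for expected, output in zip(expected_combinations, recognizer_outputs)
    ]
    return 100.0 * sum(hits) / len(hits)

# Worked example: 1 insertion, 1 deletion, 2 substitutions, 8 reference words.
wer = word_error_rate(1, 1, 2, 8)
war = word_accuracy_rate(1, 1, 2, 8)
ncar_value = ncar(["three cups of"], ["I hid three cups of cold feet morning"])
print(wer, war, ncar_value)
```

The result depends on how the two strings are aligned. A scorer that searches for the alignment with the fewest edits would pair this reference and output word by word with three substitutions (had/hid, coffee/cold, this/feet), which gives a WER of 37.5% rather than 50%.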

The graph in (13) below shows the improvement more clearly. As is obvious from the graph, our Version 1 engine was performing very poorly, getting lower accuracy rates for both WAR and NCAR than the baseline. As we incorporated the fix progressively, version by version, gradually completing the adjustment of the frequency counts of numeral-classifier combinations, the performance of our engine improved dramatically for both WAR and NCAR. By Version 3, our engine outperformed the baseline engine for both WAR and NCAR. At Version 4, when our implementation of the probability adjustment for the numeral-classifier combinations was completed, we obtained the best results. Note that the improvement of NCAR did not hinder the WAR but contributed to its improvement.


(13) Accuracy rate progression against the baseline

4.3. Summary

The test results show that probability adjustments made through explicit n-gram extrapolation successfully improved the performance of the Japanese LVCSR engine with regard to the pronunciation variability of numeral-classifier combinations. The three problem areas identified earlier in (9) were resolved by 1) exhaustively listing the pronunciation variants of numerals as well as of classifiers in the lexicon, and 2) manually adjusting the counts of numeral-classifier combinations so that they are modeled in the n-grams. Not only did the explicit n-gram extrapolation resolve the context-dependent pronunciation variation, but it resolved the issue with free variation as well.

One thing we did not cover in this study was the testing of zero-classifier instances. As I mentioned earlier, not all classifiers may be used with the numeral zero. Future research will need to test whether these exceptional cases are handled appropriately. Another remaining issue is that of numerals larger than 10. Increasing the frequency counts of numeral-classifier combinations for the numerals 0–10 may have adverse effects on instances where the numeral is larger than 10. If so, we will need to make the appropriate modifications to handle larger numerals. Further testing is required before the engine is used in commercial products. The goals of my future research are to expand the test cases as well as to seek other ways of improving the performance of our Japanese LVCSR engine.

5. Conclusion

In this paper, I have presented the results of a study designed to improve the performance of our Japanese LVCSR engine regarding numeral-classifier combinations. I have demonstrated that the context-dependent pronunciation variation (8a,b) and the context-independent pronunciation variation (8c) are handled by the same mechanism, explicit n-gram extrapolation. These two types of variation are not different species from the perspective of an LVCSR engine. This is very different from a linguistic point of view, where morphophonemic alternations are considered linguistically significant, while free variation is considered insignificant.

I have also introduced, at least minimally, the domain of ASR to a primarily linguistic audience. Research on modeling pronunciation variation for ASR systems has increased lately, and some of the new ideas have actually been inspired by theoretical phonology (see Strik and Cucchiarini 1999). I believe that there are opportunities for phonologists to make significant contributions to the field of ASR research. Just as ASR research can be informed by linguistic analyses, I also believe that phonologists can benefit from the study of ASR systems. Recently some stochastic models of phonology have been proposed (Anttila 1995; Frisch 1996; Boersma 1997; Coleman and Pierrehumbert 1997; etc.). The fully stochastic nature of the techniques used in current ASR systems may be a worthy area for phonologists to explore.

Corpus-based analysis of vowel devoicing in spontaneous Japanese: an interim report

Kikuo Maekawa and Hideaki Kikuchi

1. Introduction

Introductory textbooks of phonetics or pronunciation dictionaries of Japanese often state that close vowels (/i/ and /u/) are devoiced when they are both preceded and followed by voiceless consonants. This description quickly turns out to be incorrect when we look at real data. For one thing, close vowels are not always devoiced, even in the above-mentioned environment, and in addition, close vowels followed by voiced consonants can be devoiced to some extent when they are preceded by voiceless consonants. Moreover, non-close vowels like /a/ are also devoiced occasionally. These facts, which we will examine more closely in this paper, indicate that vowel devoicing is a probabilistic event: an event whose occurrence cannot be predicted with 100% accuracy. Vowel devoicing, accordingly, should be analyzed from a statistical perspective.

From this perspective, phoneticians, including the first author of this paper, have in the past conducted statistical analyses of vowel devoicing in order to find out which factors determine the probability of vowel devoicing in a given phonological context. The reported results, however, have not always coincided. For example, there is disagreement regarding the influence of the manner of articulation of the following consonant. Han (1962) claimed that close vowels followed by an affricate or fricative were more likely to be devoiced than those followed by a plosive, but Takeda and Kuwabara (1987) obtained exactly the opposite result. The latter study also reported that one of the devoicing rules proposed in NHK (1985), namely a “low-pitched mora in pre-pause position is likely to be devoiced”, was almost useless in interpreting the devoicing patterns observed in a read-speech corpus.

There may be several possible reasons for such disagreements. First, some descriptions of devoicing were based upon introspection. 
Generally speaking, introspection alone is not an appropriate analysis method for a probabilistic event like devoicing. Second, the experimental data examined in at least some previous studies were too small to yield stable conclusions. This problem is likely to arise when the occurrence probability of an event is inherently very low, and/or when multiple factors and their complex interactions are involved. Third, the data analyzed in different studies were not homogeneous with respect to the data collection method. At least three different methods were used in the previous studies: reading of isolated words, reading of words in a carrier sentence, and reading of prose. It is important to note, at this point, that no previous study examined devoicing in spontaneous speech. Observation of spontaneous speech is necessary because vowel devoicing may be influenced by differences in speaking style, as is the case with many other linguistic variations.

Theoretically, it is not impossible to conceive an experiment designed to solve all three problems mentioned above, but from a practical point of view, it is virtually impossible to conduct such an experiment. The cost of the experiment would be too high to be justified if its only aim were the analysis of devoicing. Recent development of speech corpora, however, has opened up a new vista for the study of vowel devoicing and other phonetic variations. Since the size and coverage of speech corpora are growing rapidly, we can use them for the study of phonetic variation. In fact, Takeda and Kuwabara (1987) and Yoshida and Sagisaka (1990) have analyzed the ATR speech database developed for speech synthesis and recognition, and have shown that the use of large-scale corpora provides a solution to the first two of the problems mentioned above. The problem of speaking style, however, has so far remained unsolved since most existing corpora contain only read speech. This last problem might be solved by a large corpus of spontaneous speech. 
In the rest of this paper, we will examine the distribution of devoiced vowels in a corpus of spontaneous Japanese.

2. The data

2.1. The Corpus of Spontaneous Japanese (CSJ)

The data we analyzed is an excerpt from the Corpus of Spontaneous Japanese (henceforth ‘CSJ’), which we have been developing since 1999, aiming for public release in the spring of 2004. CSJ is a large-scale speech database designed mainly for the study of speech recognition and phonetics and linguistics (see Maekawa, Koiso, Furui and Isahara 2000 for the blueprint of the CSJ). The whole body of the CSJ contains about 7.5 million words spoken by native speakers of so-called Standard, or Common, Japanese. This corresponds to roughly 660 hours of speech. The main body of the corpus consists of monologues taken from two sources: academic presentation speech (APS) and simulated public speaking (SPS). The APS is live recordings of academic presentations given at meetings of nine different academic societies covering the humanities, natural sciences, and engineering. The SPS, on the other hand, is public speech on everyday topics, performed by recruited lay subjects in front of small audiences. The sex and age of the SPS speakers are roughly balanced.

The speech data was recorded using a head-worn directional microphone and a DAT recorder with a sampling frequency of 48 kHz and 16-bit precision. The speech data was then down-sampled to 16 kHz and stored on a computer. All recorded speech was transcribed and morphologically analyzed in terms of word boundary and part-of-speech information. In addition to this tagging of the entire corpus, we have extensively annotated a number of linguistic features in a subset of the corpus; we call this subset ‘the Core’. The Core contains about 500,000 words or about 45 hours of speech, all of which have been (sub-)phonemically segmented and labeled for intonation.1 The tag set used in the segmental labeling of the Core is shown in Table 1. The tag set is a mixture of phonemic and sub-phonemic labels. This inconsistency was a deliberate choice of ours to enrich the value of the Core as a resource for the study of phonetic variation. 
When this segment label information is coupled with the X-JToBI intonation labels that we developed for the CSJ (Maekawa, Kikuchi, Igarashi and Venditti 2002), the Core can be an excellent resource for the phonetic study of spontaneous speech. The segment labeling of the Core was performed in three steps. First, the initial labels were generated from the transcription text and aligned automatically to the speech signal using a Hidden Markov Model based speech recognition toolkit (Young et al., 1999). The error of automatic alignment in terms of phoneme boundary location, averaged over all phonemes, currently has a mean of –3.84 ms and a standard deviation of 21 ms (Kikuchi and Maekawa 2002).

Table 1. Label set used for the segmental labeling of the CSJ

Vowels:
  a, i, u, e, o (voiced)
  A, I, U, E, O (devoiced)
Plain consonants:
  k, g, G[ɣ], @[ŋ], s, z, t, c[ts], d, n, h, F, b, p, m, r[ɾ], w, y
Phonetically palatalized consonants:
  kj, gj, Gj, @j, sj[ʃ], zj[ʒ], cj[tʃ], nj[ɲ], hj[ç]
Phonologically palatalized consonants (‘youon’):
  ky, gy, Gy, @y, sy, zy, cy, ny, hy, by, py, my, ry
Moraic phonemes:
  Long vowel: H
  Geminate (‘sokuon’): Q
  Moraic nasal (‘hatsuon’): N

Then, human labelers checked the appropriateness of the generated labels and their location on the time axis. Finally, trained phoneticians checked inter-labeler inconsistencies before fixing the final labels. During the course of manual corrections, each vowel segment was judged to be either voiced or voiceless. Information from the wide-band spectrogram, speech waveform, extracted speech fundamental frequency, and peak value of the autocorrelation function, in addition to audio playback, was available for these judgments, but the most important criteria were the audio playback and the presence versus absence of the speech fundamental frequency. In our speech-analysis environment, fundamental frequency was judged to be present if the probability of voicing of an analysis frame was higher than 0.5, and this probability was determined according to a two-dimensional normal distribution of speech intensity and periodicity.
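The 0.5 decision rule described above can be illustrated with a small sketch. The text does not state how the voicing probability is derived from the two-dimensional normal distribution, so the sketch assumes one bivariate normal per class (voiced and unvoiced), equal priors, and uncorrelated axes. All means and standard deviations are invented for illustration and are not CSJ parameters; the function names are hypothetical as well.

```python
import math

def gaussian_2d(x, y, mean, std):
    """Density of a bivariate normal with uncorrelated axes (an assumption)."""
    zx = (x - mean[0]) / std[0]
    zy = (y - mean[1]) / std[1]
    norm = 2.0 * math.pi * std[0] * std[1]
    return math.exp(-0.5 * (zx * zx + zy * zy)) / norm

# Hypothetical class models over (intensity in dB, periodicity in 0..1).
VOICED = {"mean": (65.0, 0.8), "std": (8.0, 0.15)}
UNVOICED = {"mean": (50.0, 0.3), "std": (8.0, 0.15)}

def voicing_probability(intensity_db, periodicity):
    """Posterior probability that a frame is voiced, assuming equal priors."""
    pv = gaussian_2d(intensity_db, periodicity, VOICED["mean"], VOICED["std"])
    pu = gaussian_2d(intensity_db, periodicity, UNVOICED["mean"], UNVOICED["std"])
    total = pv + pu
    if total == 0.0:  # both densities underflowed; treat the frame as unvoiced
        return 0.0
    return pv / total

def f0_present(intensity_db, periodicity, threshold=0.5):
    """Apply the rule from the text: F0 is present if P(voiced) exceeds 0.5."""
    return voicing_probability(intensity_db, periodicity) > threshold
```

A frame at the hypothetical voiced mean, `f0_present(65.0, 0.8)`, is judged to carry fundamental frequency, while a frame at the unvoiced mean is not.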

2.2. The current data set

Because compilation of the CSJ is currently underway (as of February 2003), we are not able to use the whole body of the Core. The data set used for the analyses reported below consists of about 23 hours of segment-labeled speech containing 427,973 vowel segments. This data set contains 29 female and 56 male speakers, whose mean ages (± standard deviation) were 32.2±5.5 and 32.3±6.6 years, respectively. Sixty-five subjects were born in Tokyo, and all others were born in three prefectures surrounding Tokyo, namely, Saitama, Kanagawa, and Chiba. From a dialectological point of view, all subjects spoke so-called Standard Japanese. As for the type of monologue, 41 APS and 44 SPS monologues are present in our data set. Six APS and 23 SPS monologues are by female speakers and 35 APS and 21 SPS monologues are by male speakers. Most of these monologues lasted from 10 to 15 minutes.

During the course of transcription work, the speech signal was divided into chunks delimited by pauses longer than 200 ms. We will call such a chunk an ‘utterance’, but an utterance in this sense may or may not correspond to a syntactically meaningful construction.

Lastly, the following notation is adopted in the rest of this paper. Symbols ‘C’ and ‘V’ stand for consonants and (short) vowels. ‘Co’ and ‘Cv’ stand respectively for voiceless and voiced consonants. ‘Vc’ and ‘Vnc’ stand respectively for close and non-close vowels. The combination of these symbols placed within forward slashes represents the phonological environment; for example, /CoVcCo/ stands for the phonological environment in which close vowels are both preceded and followed by voiceless consonants, while /CoVcCv/ stands for the environment in which close vowels are preceded by a voiceless consonant and followed by a voiced consonant. When it is necessary to make a distinction between the preceding and following consonant, integers 1 and 2 are used as an index: ‘C1’ and ‘C2’ stand for preceding and following consonant, respectively.
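The notation can be made concrete with a short sketch that maps a C1-V-C2 triple of Table 1 labels to an environment string such as /CoVcCo/. The grouping of labels into voiceless and voiced sets follows their ordinary phonetic values; the function names are hypothetical, and the moraic labels (H, Q, N) are deliberately left out, so this is an illustration of the notation rather than the procedure actually used for the CSJ.

```python
# Plain consonant labels from Table 1, grouped by voicing.
VOICELESS = {"k", "s", "t", "c", "h", "F", "p"}
VOICED = {"g", "G", "@", "z", "d", "n", "b", "m", "r", "w", "y"}
CLOSE_VOWELS = {"i", "u"}
NON_CLOSE_VOWELS = {"a", "e", "o"}

def base_consonant(label):
    """Strip the palatalization suffix ('j' or 'y') from labels like 'kj' or 'sy'."""
    if len(label) > 1 and label[-1] in ("j", "y"):
        return label[:-1]
    return label

def consonant_class(label):
    """Return 'Co' for voiceless and 'Cv' for voiced consonant labels."""
    base = base_consonant(label)
    if base in VOICELESS:
        return "Co"
    if base in VOICED:
        return "Cv"
    raise ValueError(f"not a plain or palatalized consonant label: {label!r}")

def vowel_class(label):
    """Return 'Vc' for close and 'Vnc' for non-close short vowels.

    Devoiced vowels are written in upper case in Table 1, so case is ignored.
    """
    vowel = label.lower()
    if vowel in CLOSE_VOWELS:
        return "Vc"
    if vowel in NON_CLOSE_VOWELS:
        return "Vnc"
    raise ValueError(f"not a short vowel label: {label!r}")

def environment(c1, vowel, c2):
    """Build the environment string used in the text, e.g. '/CoVcCo/'."""
    return "/" + consonant_class(c1) + vowel_class(vowel) + consonant_class(c2) + "/"
```

For instance, `environment("k", "i", "t")` gives `/CoVcCo/`, and a devoiced /u/ between /s/ and /n/, `environment("s", "U", "n")`, gives `/CoVcCv/`.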

3. Overview of vowel voicing

We start our analysis by giving an overview of the vowel voicing in the current data set. Table 2 tabulates the number of vowel samples and the average devoicing rate represented as a percentage. Devoicing rates of long vowels (/aH/, /eH/, /iH/, /oH/, and /uH/) remained consistently the lowest. Among short vowels, close vowels showed a distinctly higher devoicing rate than non-close vowels, as expected.

Table 2. Number of samples and averaged devoicing rate of all vowel segments

VOWEL         N    VOICED  DEVOICED  % DEVOICED
a       109,624   108,432     1,192        1.09
aH        3,956     3,954         2        0.05
e        58,154    57,401       753        1.29
eH       12,363    12,361         2        0.02
i        75,581    60,675    14,906       19.72
iH        2,650     2,646         4        0.15
o        88,412    87,282     1,130        1.28
oH       19,445    19,437         8        0.04
u        49,448    33,917    15,531       31.41
uH        8,340     8,307        33        0.40
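As a worked check of how the percentage column relates to the counts, the following sketch recomputes the devoicing rates from the VOICED and DEVOICED columns of Table 2. The variable and function names are ours; the counts are copied from the table.

```python
# vowel -> (voiced, devoiced), copied from Table 2.
TABLE_2 = {
    "a": (108432, 1192), "aH": (3954, 2),
    "e": (57401, 753), "eH": (12361, 2),
    "i": (60675, 14906), "iH": (2646, 4),
    "o": (87282, 1130), "oH": (19437, 8),
    "u": (33917, 15531), "uH": (8307, 33),
}

def devoicing_rate(voiced, devoiced):
    """Devoicing rate in percent: devoiced tokens over all tokens."""
    return 100.0 * devoiced / (voiced + devoiced)

rates = {v: round(devoicing_rate(*counts), 2) for v, counts in TABLE_2.items()}

# The row totals add up to the size of the data set given in Section 2.2.
total_segments = sum(voiced + devoiced for voiced, devoiced in TABLE_2.values())
```

The recomputed rates agree with the printed column (for example 19.72 for /i/ and 31.41 for /u/), and the row totals sum to the 427,973 vowel segments of the data set.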

Table 3 shows the distribution of devoicing rate as a function of the voicing of C1 and C2 in the /C1VC2/ environment (tabulated over 300,018 vowels). In addition to the expected fact that the devoicing rate is by far the highest in the /CoVcCo/ environment, this table reveals interesting findings about the nature of vowel devoicing. First, the devoicing rate of close vowels in the ‘typical’ /CoVcCo/ environment was not 100%. Second, close vowels were also devoiced with modest probability in the /CoVcCv/ environment (17.37% and 20.91% for /i/ and /u/, respectively). Third, non-close vowels also could be devoiced in the /CoVncCo/ environment (2.10%, 3.31%, and 3.45% for /a/, /e/, and /o/, respectively). Moreover, there was no environment in which devoicing was completely blocked. Vowels could be devoiced even in the /CvVncCv/ environment (i.e., non-close vowels preceded and followed by voiced consonants), which is regarded as the most atypical environment for vowel devoicing. Similar findings were reported earlier in Venditti and van Santen (1998). Whether the devoicing occurring in environments other than /CoVcCo/ is phonetically the same as the devoicing in /CoVcCo/ is an interesting research question. In the next section, we will examine devoicing in three different environments, i.e., /CoVcCo/, /CoVcCv/, and /CoVncCo/.

Table 3. Devoicing in the /C1VC2/ environment as a function of the voicing of C1 and C2

VOWEL  C1  C2   VOICED  DEVOICED  % DEVOICED
a      Co  Co   12,214       262        2.10
       Co  Cv   18,570        92        0.49
       Cv  Co   24,943       481        1.89
       Cv  Cv   19,867        29        0.15
e      Co  Co    5,550       190        3.31
       Co  Cv   10,890       116        1.05
       Cv  Co   11,552       323        2.72
       Cv  Cv   11,388        29        0.25
i      Co  Co    1,475    12,124       89.15
       Co  Cv   10,556     2,219       17.37
       Cv  Co    9,200       126        1.35
       Cv  Cv   12,072       133        1.09
o      Co  Co   12,247       437        3.45
       Co  Cv   19,752       365        1.81
       Cv  Co   14,650        13        0.09
       Cv  Cv   16,802        14        0.08
u      Co  Co    1,732     9,267       84.25
       Co  Cv   11,851     3,133       20.91
       Cv  Co    5,562       127        2.23
       Cv  Cv    7,748        61        0.78

4. Analysis of vowel devoicing

4.1. The /CoVcCo/ environment

We will first analyze devoicing in the /CoVcCo/ environment. As we saw already in Table 3, the devoicing rates in this ‘typical’ environment were less than 90%. So, the essential task here is to identify the conditions that decrease the probability of vowel devoicing in this context. Tables 4 and 5 summarize the voicing status of /i/ and /u/ according to the phonemic classification of C1 and C2. These tables, as well as all the following tables, need some introduction. First, because C1 and C2 were phonemically classified, allophones shown in Table 1 were merged into phonemes. Also, we presuppose a voiceless (dental) affricate phoneme /c/, adopting the phonemic analysis of Hattori (1950). Second, the combinations of C1 and C2 where the total number of samples was less than 10 were omitted from the tables. Third, all phonemically palatalized consonants were omitted altogether, because in most of the C1C2 combinations involving the palatalized consonants, the number of samples was less than 10.

Table 4. Cross-tabulation of the voicing of /i/ in the /CoVcCo/ environment by C1 and C2

VOWEL  C1  C2  VOICED  DEVOICED  % DEVOICED
i      c   c       16        73       82.02
           h       35         7       16.67
           k       31       358       92.03
           p        7        44       86.27
           Q       16        16       50.00
           s       64        41       39.05
           t       32       181       84.98
       h   c        5        80       94.12
           h       22         9       29.03
           k       15       342       95.80
           Q       21        39       65.00
           s       11         3       21.43
           t       21       883       97.68
       k   c       19        62       76.54
           h      167        65       28.02
           k       73       476       86.70
           Q       32        51       61.45
           s      144       262       64.53
           t       53       791       93.72
       p   Q      118         9        7.09
       s   c        7       259       97.37
           h       47        14       22.95
           k       50     1,102       95.66
           Q       25        92       78.63
           s      259       178       40.73
           t       49     6,507       99.25
       t   k       11         0        0.00
           Q       13         0        0.00

Table 5. Cross-tabulation of the voicing of /u/ in the /CoVcCo/ environment by C1 and C2

VOWEL  C1  C2  VOICED  DEVOICED  % DEVOICED
u      c   c       16        57       78.08
           h       24        10       29.41
           k       44       872       95.20
           Q       13        32       71.11
           s      137       140       50.54
           t       19       207       91.59
       h   c        4        86       95.56
           h       17        16       48.48
           k       15       227       93.80
           Q       25         7       21.88
           s        6        46       88.46
           t       10       248       96.12
       k   c       48       123       71.93
           h      132        56       29.79
           k      151       246       61.96
           p        3        21       87.50
           Q      114        26       18.57
           s      380     1,202       75.98
           t      148     1,021       87.34
       p   k        8         7       46.67
           s       12        18       60.00
           t        6        12       66.67
       s   c        3         8       72.73
           h        4         8       66.67
           k       31     2,207       98.61
           p        2       154       98.72
           Q       23        31       57.41
           s       60       195       76.47
           t       37     1,210       97.03

4.1.1. Interaction of consonant manners

Tables 4 and 5 show the importance of the manner of articulation of C1 and C2 as factors in vowel devoicing, as suggested by many previous studies (see the introduction and discussion for references). Tables 6 and 7 summarize Tables 4 and 5 from this point of view.

Table 6. Devoicing rate [%] of /i/ in the /CoVcCo/ environment classified by the manner of C1 and C2

                         C2
C1          Affricate  Fricative  Stop  (marginal)
Affricate        81.1       33.3  89.4        78.3
Fricative        96.3       38.1  98.4        94.6
Stop             80.2       51.5  89.3        77.3

Remaining marginal values, in the order printed: 91.0, 47.7, 43.8

Table 7. Devoicing rate [%] of /u/ in the /CoVcCo/ environment classified by the manner of C1 and C2

                         C2
C1          Affricate  Fricative  Stop  (marginal)
Affricate        77.2       48.1  94.5        83.6
Fricative        95.1       61.2  97.5        93.5
Stop             80.8       74.0  80.1        77.1

Remaining marginal values, in the order printed: 84.4, 68.8, 35.9
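The step from the phoneme-level counts to the manner-level summaries can be sketched as follows for /i/. The sketch pools the cells printed in Table 4 (without the /Q/ rows) by manner class. Because Table 4 omits low-count and palatalized combinations, the pooled values only approximate Table 6 and are not expected to reproduce it exactly; the names `MANNER`, `TABLE_4` and `pool_by_manner` are ours.

```python
from collections import defaultdict

# Manner class of each C1/C2 phoneme (/c/ is the affricate of Table 4).
MANNER = {"c": "Affricate", "h": "Fricative", "s": "Fricative",
          "k": "Stop", "p": "Stop", "t": "Stop"}

# (C1, C2) -> (voiced, devoiced) for /i/, copied from Table 4 without the /Q/ rows.
TABLE_4 = {
    ("c", "c"): (16, 73), ("c", "h"): (35, 7), ("c", "k"): (31, 358),
    ("c", "p"): (7, 44), ("c", "s"): (64, 41), ("c", "t"): (32, 181),
    ("h", "c"): (5, 80), ("h", "h"): (22, 9), ("h", "k"): (15, 342),
    ("h", "s"): (11, 3), ("h", "t"): (21, 883),
    ("k", "c"): (19, 62), ("k", "h"): (167, 65), ("k", "k"): (73, 476),
    ("k", "s"): (144, 262), ("k", "t"): (53, 791),
    ("s", "c"): (7, 259), ("s", "h"): (47, 14), ("s", "k"): (50, 1102),
    ("s", "s"): (259, 178), ("s", "t"): (49, 6507),
    ("t", "k"): (11, 0),
}

def pool_by_manner(table):
    """Sum counts over phonemes of the same manner and return rates in percent."""
    counts = defaultdict(lambda: [0, 0])
    for (c1, c2), (voiced, devoiced) in table.items():
        cell = counts[(MANNER[c1], MANNER[c2])]
        cell[0] += voiced
        cell[1] += devoiced
    return {pair: 100.0 * d / (v + d) for pair, (v, d) in counts.items()}

rates = pool_by_manner(TABLE_4)
highest = max(rates, key=rates.get)  # (C1 manner, C2 manner) with the top rate
lowest = min(rates, key=rates.get)   # (C1 manner, C2 manner) with the bottom rate
```

With these counts the fricative-stop combination has the highest pooled rate and the affricate-fricative combination the lowest, the same ordering that Table 6 shows for /i/.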

These tables show several interesting tendencies. First, the devoicing rate was the highest when fricative C1 was followed by stop C2 in both tables, and the second highest devoicing rate was observed when fricative C1 was followed by affricate C2 in both tables. In contrast, the devoicing rate was the lowest when affricate C1 was followed by fricative C2, and the second lowest rate was observed when fricative C1 was followed by fricative C2 in both tables. Also, it is worth noting that, in terms of the marginal distribution, the highest devoicing rate was observed when C2 was a stop, and the lowest devoicing rate was observed when C2 was a fricative. These facts show clearly that there is an interaction between the manners of articulation of C1 and C2. A two-way ANOVA between the manners of C1 and C2 applied to data pooled over /i/ and /u/ showed that the main effects of C1 and C2 and their interaction were all significant (for C1, DF=2, F=44.38, P<0.0001; for C2, DF=2, F=1959.43, P<0.0001; for C1*C2, DF=4, F=263.24, P<0.0001). Phonetic interpretation of the manner interaction will be discussed in Section 5.1 below.

In the calculation of Tables 6 and 7, samples in which C2 was a geminate /Q/ were omitted, because the manner of /Q/ per se is not specified from a phonological point of view, and because a following geminate seemed to constitute a special environment for devoicing, as shown below. Table 8 compares devoicing rates of close vowels (pooled over /i/ and /u/) in cases where C2 was and was not a geminate. This table shows that the devoicing rate was lower when C2 was a geminate, regardless of the manner of C1 (DF=758, t=24.84, P<0.0001, unequal variance). Further analysis revealed that the devoicing rate was the highest for the combination of fricative C1 and a stop geminate (namely a geminate followed by a stop), and was the lowest for the combination of fricative C1 and a fricative geminate (namely a geminate followed by a fricative). These show the same tendency as observed in Tables 6 and 7.

Table 8. Effect of the following geminate on devoicing rate: Pooled data of /i/ and /u/

                     C2 non-/Q/                        C2 /Q/
C1          Voiced  Devoiced  % Devoiced    Voiced  Devoiced  % Devoiced
Affricate      454     2,021        81.7        29        49        62.8
Fricative      860    14,099        94.3       112       181        61.8
Stop         1,464     4,954        77.2       282        87        23.6

4.1.2. Consecutive devoicing

Because the initial and final consonants in the /CoVcCo/ environment are both voiceless (and due to the common CV syllable structure of Japanese), it happens that two or more consecutive vowels can belong to this environment (e.g. …(CoVc)CoVcCo(VcCo)…). When this happens, it is called consecutive, or sequential, devoicing. Experimental studies have shown that more than two consecutive close vowels can be devoiced in this environment (Maekawa 1990 a,b).

At the same time, however, it is widely believed that there is a tendency to avoid consecutive devoicing (see Sakuma 1929 and Maekawa 1989, among many others). If this tendency does exist in spontaneous speech, it may help us to understand why the devoicing rate in the canonical /CoVcCo/ environment was not 100% in our data. Although the environment of consecutive devoicing can be formed both word-internally and across a word boundary, we examine only the word-internal environment in order to exclude the potential influence of a word boundary (cf. Kondo, 1997). The current data set contains 318 samples where consecutive devoicing could happen word-internally. Table 9 shows the distribution of voicing status with respect to the first two vowels in the environment of consecutive devoicing. For example, if /niNsiki/ (‘recognition’) is followed by the verb-forming suffix (i.e., sahen verb) /suru/, the last two vowels of /niNsiki/ are in the consecutive devoicing environment. According to this table, 84 samples out of the total of 318 showed consecutive devoicing (26.4%), while in all other samples in this environment consecutive devoicing was avoided. The table also shows that the most frequent pattern of vowel voicing in this environment was a devoiced first vowel followed by a voiced second vowel.

Table 9. Voicing of the first two vowels in the environment of consecutive devoicing

                  SECOND VOWEL
FIRST VOWEL    VOICED  DEVOICED
VOICED             17        44
DEVOICED          171        84
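The avoidance of consecutive devoicing can be read off Table 9 as conditional rates: how often the second vowel is devoiced given the voicing of the first. The sketch below computes these from the four printed cells. Note that the cells as printed sum to 316 rather than the 318 samples cited in the text, so the figures are illustrative; the names are ours.

```python
# (first vowel, second vowel) -> number of samples, as printed in Table 9.
TABLE_9 = {
    ("voiced", "voiced"): 17,
    ("voiced", "devoiced"): 44,
    ("devoiced", "voiced"): 171,
    ("devoiced", "devoiced"): 84,
}

def second_devoiced_rate(first_state):
    """Percentage of samples with a devoiced second vowel, given the first vowel."""
    devoiced = TABLE_9[(first_state, "devoiced")]
    voiced = TABLE_9[(first_state, "voiced")]
    return 100.0 * devoiced / (voiced + devoiced)

after_voiced = second_devoiced_rate("voiced")      # first vowel kept its voicing
after_devoiced = second_devoiced_rate("devoiced")  # first vowel was devoiced
total_cells = sum(TABLE_9.values())
```

In these counts the second vowel is devoiced in roughly 72% of the samples whose first vowel is voiced, but in only about 33% of the samples whose first vowel is devoiced, which is the ‘one or the other’ pattern in numerical form.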

Figure 1 compares devoicing rates of the first and second vowels in the consecutive devoicing environment. Its abscissa represents the combination of the manner of C1 and C2, and is sorted in the descending order of the observed devoicing rate of the first vowel. Letters, ‘A’, ‘F’, and ‘S’ stand respectively for affricate, fricative, and stop; and are combined in the order of C1/C2. This figure shows that the two devoicing rates were, by and large, inversely proportional, reflecting a ‘one or the other’ relationship between the two vowels.2 The graph also shows that when a fricative was combined with an affricate or stop, it was always the vowel associated with (i.e., in the same mora as) the fricative that showed the higher devoicing rate, and, when both consonants were fricatives, it was the second vowel that showed a high devoicing rate.


Figure 1. Devoicing rate of two vowels in the environment of consecutive devoicing

4.2. The /CoVcCv/ environment

From this point on, we will examine vowel devoicing in ‘atypical’ environments. This section deals with the /CoVcCv/ environment. Tables 10 and 11 show the devoicing rate of /i/ and /u/ as a function of the manner of C1 and C2. A two-way ANOVA between the manners of C1 and C2 applied to data pooled over /i/ and /u/ showed that the main effects of C1 and C2 and their interaction were all significant (for C1, DF=2, F=440.24, P<.0001; for C2, DF=4, F=344.15, P<.0001; for C1*C2, DF=8, F=155.35, P<.0001).

Table 10. Devoicing rate [%] of /i/ in the /CoVcCv/ environment classified by the manner of C1 and C2

                                   C2
C1          Approximant  Fricative  Liquid  Nasal  Stop  (marginal)
Affricate          12.8        9.7     8.4   18.2   9.4        12.5
Fricative          20.2        8.7    10.4   38.4  10.2        28.3
Stop                5.8        5.7     2.0    9.6   5.9         7.8

Remaining marginal values, in the order printed: 14.0, 7.1, 24.4, 8.6, 8.2

Table 11. Devoicing rate [%] of /u/ in the /CoVcCv/ environment classified by the manner of C1 and C2

                                   C2
C1          Approximant  Fricative  Liquid  Nasal  Stop  (marginal)
Affricate          13.8       20.6     6.9   28.3  12.9        19.8
Fricative          46.5       16.7     5.5   65.6  22.1        36.8
Stop                4.7        2.6     3.1    3.8   5.0         3.9
(marginal)         18.4       12.6     4.6   35.8  15.9

As far as C1 is concerned, the effect of consonant manner was similar to that observed in the /CoVcCo/ environment in that fricatives and stops showed the highest and lowest devoicing rates, respectively. As for C2, the effect of consonant manner was drastically different from that observed in the /CoVcCo/ samples. The manner of articulation that showed the highest devoicing rate here was nasal. This is congruent with the results of Maekawa (1989 and 1990a). Also, approximants, i.e., /w/ and /y/, enhanced devoicing more than stops did. The highest devoicing rate of all was observed for vowel /u/ preceded by a fricative and followed by an approximant. A closer look at the data, however, revealed that this enhancing effect of an approximant was the result of a high devoicing rate in only a few lexical items, namely, /desu/ (polite form of copula /da/) and /masu/ (an auxiliary verb of politeness). In the /CoVcCv/ samples, /desu/ was followed by sentence-ending particle /yo/ 138 times and devoiced 107 times (the devoicing rate was 77.54%). Also, /masu/ was followed by particles /yo/ or /wa/ 28 times and devoiced 14 times (50% devoicing). If we remove these two lexical items from the data set, the resulting devoicing rate is only 18%, lower than the 46.5% reported in the C1 fricative/C2 approximant cell of Table 11.

Figure 2 shows the relation between word frequency and devoicing rate of words in the /CoVcCv/ environment. Note that individual symbols in the figure represent the averaged devoicing rate of a given word. Note also that both axes are plotted on a logarithmic scale, and words whose frequency was lower than 10 or whose devoicing rate was 0 were excluded from the analysis. The data points for /desu/ and /masu/ in this figure are likely to be outliers from the overall trend of a slight negative correlation (N=293, r=–0.146)3. The effect of the following approximant should be regarded, at least partly, as a consequence of the idiosyncrasy of high-frequency function words.
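The kind of correlation reported for Figure 2 can be sketched as a Pearson coefficient computed on log-transformed word frequency and devoicing rate, after applying the two exclusion criteria stated in the text (frequency below 10, devoicing rate of 0). This is only one plausible reading of the analysis: the text does not say whether r was computed on logarithmic values, and the word list below is invented toy data, not CSJ data.

```python
import math

def log_log_pearson(words, min_freq=10):
    """Pearson r between log10(frequency) and log10(devoicing rate).

    `words` is an iterable of (frequency, devoicing_rate_percent) pairs.
    Words with frequency below `min_freq` or a devoicing rate of 0 are dropped.
    """
    points = [(math.log10(freq), math.log10(rate))
              for freq, rate in words
              if freq >= min_freq and rate > 0]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two words after filtering")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data: (word frequency, devoicing rate in percent).
toy_words = [(10, 40.0), (100, 20.0), (1000, 10.0), (5, 90.0), (50, 0.0)]
r = log_log_pearson(toy_words)
```

In the toy data the last two words are filtered out (frequency 5, and rate 0), and the remaining three lie on a straight line in log-log space, so the coefficient is -1. Real data such as that of Figure 2 would of course scatter far more widely.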


Figure 2. Word-frequency and devoicing rate in the /CoVcCv/ environment

4.3. The /CoVncCo/ environment

The last environment we will examine is /CoVncCo/, namely, non-close vowels both preceded and followed by voiceless consonants. Tables 12–14 show the devoicing rate of three non-close vowels as a function of the manner of C1 and C2. It is difficult to extract any phonetically meaningful generalizations from these tables. Fricative C1 and stop C2 seem to enhance devoicing more than other manners, but the difference was not salient. Indeed, a three-way ANOVA of vowels (/a/, /e/, /o/), C1 manner, and C2 manner revealed that none of the main effects were significant (for vowels, DF=2, F=2.57, P>0.0766; for C1 manner, DF=2, F=1.82, P>0.1616; for C2, DF=3, F=0.64, P>0.5890). The C1-C2 manner interaction was not significant either (DF=6, F=0.98, P>0.4354).

Table 12. Devoicing rate [%] of /a/ in the /CoVncCo/ environment by the manner of C1 and C2

                                    C2
C1          Affricate    Fricative    Stop         Geminate     (marginal)
Affricate   –   (1)      –   (7)      0.0 (47)     1.3 (79)     0.7
Fricative   3.1 (389)    5.0 (714)    1.5 (1,206)  1.3 (316)    2.7
Stop        0.8 (880)    1.1 (2,537)  2.8 (5,162)  1.1 (1,134)  2.0
(marginal)  1.5          1.9          2.5          1.1

Numbers in parentheses show the number of samples for each combination.

Table 13. Devoicing rate [%] of /e/ in the /CoVncCo/ environment by the manner of C1 and C2

                                    C2
C1          Affricate    Fricative    Stop         Geminate     (marginal)
Affricate   –   (1)      –   (0)      –   (0)      –   (7)      0.0
Fricative   0.9 (333)    2.2 (45)     7.5 (388)    0.0 (132)    3.7
Stop        0.6 (176)    4.4 (1,083)  3.4 (2,925)  1.2 (650)    3.2
(marginal)  0.8          4.3          3.9          1.0

Numbers in parentheses show the number of samples for each combination.

Table 14. Devoicing rate [%] of /o/ in the /CoVncCo/ environment by the manner of C1 and C2

                                    C2
C1          Affricate    Fricative    Stop         Geminate     (marginal)
Affricate   –   (3)      –   (3)      1.7 (59)     4.0 (455)    3.8
Fricative   2.6 (76)     4.1 (410)    4.3 (1,205)  1.7 (119)    4.0
Stop        2.6 (721)    2.1 (2,070)  3.8 (7,208)  1.1 (355)    3.3
(marginal)  2.8          2.5          3.9          2.6

Numbers in parentheses show the number of samples for each combination.

In Tables 12–14, the devoicing rate stayed nearly the same regardless of the combination of consonant manners, and it is this very fact that characterizes the devoicing of non-close vowels. Devoicing in the /CoVncCo/ environment is special in that the manners of adjacent consonants do not play a crucial role in the prediction of devoicing rates. But this does not mean that devoicing of non-close vowels was completely free from phonological conditioning. There is at least one phonological factor that influences the devoicing rate of /CoVncCo/ vowels: consecutive identical morae, or, the repetition of the same mora. Sakuma (1929) noted that in words like /kokoro/ (‘mind’) and /haha/ (‘mother’), the vowel in the first mora could be devoiced. Table 15 summarizes the devoicing rate of the first vowels of 1,260 samples that contain consecutive identical morae in the /CoVncCo/ environment. Devoicing rates of /a/ and /o/ shown in the table were higher than the overall devoicing rates shown in Tables 12 and 14.

Table 15. Devoicing of the first vowel of two identical morae in the /CoVncCo/ environment

VOWEL  VOICED  DEVOICED  % DEVOICED
a         458        54        10.5
e         112         5         4.3
o         490       141        22.3

In addition to this phonological conditioning, extra-linguistic factors played an important role in the devoicing of /CoVncCo/ samples. First, Figure 3 shows the effect of speaking rate on the devoicing of non-close vowels. The speaking rate of a given speaker’s utterance was taken to be the number of morae per second, averaged over the entire utterance. A histogram of speaking rates was plotted for each speaker, and was divided into 4 intervals for purposes of the current analysis. In the figure, speaking rate 1 means that the average speaking rate of the utterance containing the vowel in question is within the lowest 25% of the speaker’s histogram, and speaking rate 4 means the top 25%. With the exception of /o/, the devoicing rate of non-close vowels increased monotonically as a function of the speaking rate.
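The per-speaker binning of speaking rate can be sketched as follows. Speaking rate is computed as morae per second for each utterance, and each utterance is then assigned to one of four classes using the quartiles of that speaker's own rates. The quartile method (Python's default ‘exclusive’ method) and the treatment of values that fall exactly on a boundary are our assumptions; the text only states that the lowest and highest 25% define classes 1 and 4.

```python
from statistics import quantiles

def speaking_rate(n_morae, duration_s):
    """Speaking rate of one utterance in morae per second."""
    return n_morae / duration_s

def rate_classes(rates):
    """Assign each utterance rate of ONE speaker to a class from 1 to 4.

    Class 1 is the slowest quarter of that speaker's utterances and class 4
    the fastest. Rates equal to a cut point go to the lower class (assumption).
    """
    q1, q2, q3 = quantiles(rates, n=4)
    classes = []
    for rate in rates:
        if rate <= q1:
            classes.append(1)
        elif rate <= q2:
            classes.append(2)
        elif rate <= q3:
            classes.append(3)
        else:
            classes.append(4)
    return classes

# Hypothetical utterance rates (morae per second) for a single speaker.
example_rates = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
example_classes = rate_classes(example_rates)
```

Because the cut points are computed separately for each speaker, class 4 means fast relative to that speaker's habitual rate rather than fast in absolute terms, which matches the per-speaker histogram described above.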

Figure 3. Effect of speaking rate on devoicing rate in the /CoVncCo/ environment

Lastly, Table 16 shows the effect of ‘laughter’ on non-close vowel devoicing. In the transcription of CSJ, a tag was given if the speaker was speaking while laughing. Although this difference was not statistically significant (DF=27, t=–0.86, P<0.3967, unequal variance), the devoicing rate of non-close vowels in utterances containing the laughter-tag was consistently higher than in utterances without the tag.

Table 16. Devoicing rate in the /CoVncCo/ environment as a function of laughter

VOWEL  LAUGHTER       N  VOICED  DEVOICED  % DEVOICED
a      0         12,184  11,940       244        2.00
       1            292     274        18        6.16
e      0          5,616   5,434       182        3.24
       1            124     116         8        6.45
o      0         12,396  11,977       419        3.38
       1            288     270        18        6.25

5. Discussion

5.1. Interpretation of manner interaction The results of our analysis about the manner of C1 and C2 are congruent with most past studies. For example, Takeda and Kuwabara (1987) reported that the devoicing rate of vowels in general was higher when C1 was a fricative, and the devoicing rate of the vowel in the /si/ mora was highest when the mora was followed by a stop. Similarly, Yoshida and Sagisaka (1990) reported that the devoicing rate of close vowels preceded by voiceless consonants became the highest when they were followed by stops. However, these studies examined the effects of C1 and C2 independently, and did not pay attention to their interaction. Recently, N. Yoshida (2002) and Fujimoto (2003) examined the interaction of adjacent consonants and arrived at conclusions similar to ours. However, their experiments examined only a subset of all possible manner combinations. Yoshida’s experiment examined /k/ and /s/ only, and Fujimoto’s examined /k, t, s/ and /h/. Our results reveal the validity of the manner interaction in much wider phonetic context, and in a more naturalistic setting, namely, in spontaneous speech. This is probably the most valuable finding of the current study. In our analysis of the /CoVcCo/ environment, we found that the interaction between the manners of C1 and C2 was statistically significant. The fact that the combinations of fricative-fricative and affricate-fricative resulted in a low devoicing rate is interpreted naturally if we think about the ease of mora boundary perception. In a CV mora whose consonant is a fricative or affricate, the devoiced vowel is phonetically realized as the extension of the frication noise. So, devoicing of vowels in the abovementioned phonetic context (that is, Co[fric/affric]-Vc-Co[fric]) results in the succession of frication noise, of which the first and last halves belong to different morae. 
Devoicing of this sort is likely to be avoided because it is difficult to perceive the mora boundary within this extended frication. Similar perceptual difficulty is also likely to arise when a devoiced vowel is preceded by a stop and followed by a fricative. In this combination, the mora boundary occurs between the aspiration noise of the stop and the frication noise of the fricative. Perception of a mora boundary in this context, however, is not as difficult as in the combination of a fricative/affricate followed by a fricative, because the presence of a stop can easily be perceived from its burst, and the aspiration noise of a stop is phonetically different from frication noise with respect to its quality and quantity.

On the other hand, in the manner combinations having a stop as C2, it is relatively easy to perceive a mora boundary, because the boundary is formed by an acoustically salient feature, i.e., the burst of the stop. This salience is also preserved when C2 is an affricate, since the first half of an affricate is phonetically nothing but a stop.

Lastly, the negative effect on devoicing of a following geminate can also be interpreted from a perceptual point of view. Devoicing of a vowel before a geminate requires, on the part of the listener, perception of two mora boundaries embedded within a stretch of voiceless sounds. For example, if the first vowel of /hiQsori/ (‘quietly’) is devoiced, the listener is required to perceive the first mora boundary at the point where the palatal fricative (the conditional variant of /h/ before /i/) changes its color into an alveolar fricative, and the second mora boundary somewhere within the long stretch of the alveolar fricative. It is not surprising that the language has a tendency to avoid such a difficult perceptual combination.

5.2. Consecutive devoicing

The second valuable finding of the current study is the quantitative confirmation of the tendency to avoid consecutive devoicing and the role played by the combination of consecutive consonants. In Section 4.1.2, we noted that it was vowels associated with (i.e., in the same mora as) fricatives that showed higher devoicing rates. It is interesting, in this respect, to see that the observed devoicing rates of the first vowel in a consecutive devoicing environment were, by and large, close to those observed in the /CoVcCo/ environment, as summarized in Table 17. This similarity suggests that consecutive devoicing is basically a simple process. No special forward-looking processing is needed to determine the devoicing rate of the first vowel. The devoicing rate of the second vowel, on the other hand, involves backward reference to the voicing status of the preceding (i.e. the first) vowel.

At this point, it is important to note that the combination ‘S/S’ is an exception in both Figure 1 and Table 17. The devoicing rate for this combination in the consecutive devoicing environment is low, yet the rate in the canonical /CoVcCo/ environment is high. Currently, we are unable to explain this exception, but it is noteworthy that the number of samples used in the analyses of consecutive devoicing is small for many of the manner combinations (see Figure 1). An increase in data will make it possible to decide if this case is really an exception.

Lastly, the finding that consecutive devoicing does play an important role in the devoicing of close vowels requires revision of the past analysis presented by the first author. Maekawa (1989 and 1990a) reported that the devoicing rate of close vowels could be higher when the following mora contained a non-close vowel. Although we do not present the data here, this tendency was clearly observed in the current data set. However, the tendency should be interpreted, at least partly, as a by-product of the avoidance of consecutive devoicing. That is, when a close vowel has a non-close vowel in the following mora, this automatically means that the vowel in question (i.e. the first close vowel) is not in the environment of consecutive devoicing, hence the devoicing rate of that vowel is expected to be higher than elsewhere.

Table 17. Comparison of the devoicing rate (%) of the first vowel (V1) in a consecutive devoicing environment with that of the vowel in the /CoVcCo/ environment, pooled over /i/ and /u/

MANNER    V1      /CoVcCo/
F/A       98.3    95.9
F/S       95.3    98.1
A/S       94.4    92.6
S/A       91.6    80.7
A/A       75.0    79.3
S/F       62.5    68.6
F/F       42.3    48.9
S/S       39.2    84.4
A/F       22.2    43.1
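The comparison in Table 17 can be checked with a few lines of code. The following Python sketch is illustrative only: the percentages are copied from Table 17, while the function names and the use of the largest absolute gap as the criterion for spotting an exceptional combination are assumptions made for this example, not a procedure used in the analysis.

```python
# Devoicing rates (%) copied from Table 17.
# First value: V1 in a consecutive devoicing environment.
# Second value: the vowel in the canonical /CoVcCo/ environment.
TABLE_17 = {
    "F/A": (98.3, 95.9),
    "F/S": (95.3, 98.1),
    "A/S": (94.4, 92.6),
    "S/A": (91.6, 80.7),
    "A/A": (75.0, 79.3),
    "S/F": (62.5, 68.6),
    "F/F": (42.3, 48.9),
    "S/S": (39.2, 84.4),
    "A/F": (22.2, 43.1),
}


def rate_gaps(table):
    """Return {manner: /CoVcCo/ rate minus V1 rate} in percentage points."""
    return {m: round(covcco - v1, 1) for m, (v1, covcco) in table.items()}


def largest_gap(table):
    """Return the manner combination with the largest absolute gap
    (an illustrative criterion, not the authors' method)."""
    gaps = rate_gaps(table)
    return max(gaps, key=lambda m: abs(gaps[m]))


if __name__ == "__main__":
    for manner, gap in rate_gaps(TABLE_17).items():
        print(f"{manner}: {gap:+.1f}")
    print("largest gap:", largest_gap(TABLE_17))
```

On these figures the combination ‘S/S’ stands out, which is in line with the exception noted in the text.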

5.3. Atypical environments

The third contribution of this study is the observation of devoicing in atypical environments, namely, in /CoVcCv/ and /CoVncCo/ environments. Our analyses suggest that the devoicing of /CoVcCv/ close vowels was similar to that of /CoVcCo/ close vowels in that both were deeply conditioned by the manner of articulation of adjacent consonants. Although the influence of C2 was quite different depending on the voicing of C2, it seems that these environments constitute one large class of vowel devoicing. Devoicing of non-close vowels, on the other hand, was a radically different phenomenon from close vowel devoicing in that the manners of adjacent consonants had almost no influence on devoicing rate.

With respect to the influence of extra-linguistic factors that we presented in the analysis of non-close vowels, it is worth noting that both speaking rate and laughter showed exactly the same influence upon the devoicing of close vowels. The devoicing rate of close vowels increased monotonically as a function of speaking rate without exception, and vowels uttered with laughter showed a higher devoicing rate than those uttered without laughter. The effect of speaking rate on the devoicing rate has been repeatedly confirmed in previous studies such as Maekawa (1990a) and Kondo (1997), and has been confirmed here for spontaneous speech data.

Recent studies of linguistic variations recorded in CSJ have revealed that the presence of laughter was an excellent indicator of the speaker’s relaxation, resulting in a casual speaking style. Perhaps vowels are more likely to be devoiced in a casual speaking style than in a more formal speaking style in which speakers pay more attention to their speech. This view is consistent with the finding of Imaizumi, Hayashi and Deguchi (1995) that close vowel devoicing was less prominent when school teachers spoke to hearing-impaired pupils than when they spoke to normal-hearing pupils. In the current data, as a matter of fact, the average devoicing rates in SPS (simulated public speaking) samples were significantly higher than those in APS (academic presentation speech) samples, as shown in Table 18. According to a two-way ANOVA between phonetic environment and speech type, both main effects were significant and the interaction was not significant (Environment: DF = 2, F = 536000.9, P < 0.0001; Speech type: DF = 1, F = 39.32, P < .0001; Environment*Speech type: DF = 2, F = 2.95, P < 0.0524).

Table 18. Difference of devoicing rate due to speech type

Environment    APS N     APS % DEV    SPS N     SPS % DEV
CoVcCo         11,028    85.9         13,570    87.8
CoVcCv         10,943    18.4         16,816    19.9
CoVncCo        12,215     2.5         18,685     3.1
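The figures in Table 18 can be summarized as follows. This Python sketch is only an illustration: the sample sizes and percentages are copied from Table 18, the names are hypothetical, and the devoiced-token counts it derives are approximations, because they are reconstructed from percentages rounded to one decimal place.

```python
# (N, % devoiced) for APS and SPS, copied from Table 18.
TABLE_18 = {
    "CoVcCo": ((11028, 85.9), (13570, 87.8)),
    "CoVcCv": ((10943, 18.4), (16816, 19.9)),
    "CoVncCo": ((12215, 2.5), (18685, 3.1)),
}


def summarize(table):
    """Return approximate devoiced counts and the SPS minus APS difference
    (in percentage points) for each environment."""
    rows = {}
    for env, ((n_aps, p_aps), (n_sps, p_sps)) in table.items():
        rows[env] = {
            # Counts are approximate: the published rates are rounded.
            "aps_devoiced": round(n_aps * p_aps / 100),
            "sps_devoiced": round(n_sps * p_sps / 100),
            "sps_minus_aps": round(p_sps - p_aps, 1),
        }
    return rows


if __name__ == "__main__":
    for env, row in summarize(TABLE_18).items():
        print(env, row)
```

In all three environments the SPS rate exceeds the APS rate, which is the direction of the speech-type effect reported above.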


6. Concluding remarks

The use of a spontaneous speech corpus has proved effective in the analysis of vowel devoicing. The data presented here is one of the most reliable resources for the study of vowel voicing, both in its quality and in its quantity. Full coverage of the many C1–C2 manner combinations would have been impossible if the amount of data had been substantially smaller than the current data set. Needless to say, however, the current data set is still not large enough for a complete analysis of statistically complex phenomena such as the consecutive devoicing discussed in Section 4.1.2. More reliable conclusions will be achieved once we have access to the entire CSJ-Core, which is more than twice the size of the current data set.

Most of the analyses done in this paper are linguistic analyses in the sense that phonological environments were used as the factors conditioning vowel devoicing. Yet, as suggested in the analysis of non-close vowel devoicing, it is obvious that extra-linguistic factors also played a certain role. Extensive analyses of extra-linguistic factors and the integration of linguistic and extra-linguistic factors are an important step towards a full understanding of the vowel devoicing phenomenon. Lastly, intonation labeling of the CSJ-Core will make it possible to examine the effect of prosodic conditioning such as pitch accent. All of these analyses should be the focus of future study.

Acknowledgments

The authors are grateful to all speakers in the Corpus of Spontaneous Japanese. Our gratitude also goes to Professor Hisao Kuwabara of Teikyo Science University, who sent us his paper upon our request, and to Dr. Jennifer Venditti, whose comments on an earlier version of this paper helped us greatly.

Notes

1. The Core is also labeled for other research information such as clause boundary, discourse segmentation and dependency structure, but this information is not relevant to the current paper. Visit the following URL for more information about CSJ: http://www2.kokken.go.jp/~csj/public/index.html

2. It seems that ‘S/S’ is an exception to the general tendency of inverse proportion. See Section 5.2 for discussion.

3. The sample located in between /desu/ and /masu/ in Figure 1 is /si/, a suffix that turns a noun or adjectival into a verb (i.e. a sahen verb).

Syllable structure and its acoustic effects on vowels in devoicing environments

Mariko Kondo

1. Introduction

Vowel devoicing is a common phonological process in many languages and typically involves high vowels and schwa. High vowels and schwa are inherently short (Bell 1978; Dauer 1980) and the process usually occurs when the vowels are either adjacent to, or surrounded by, voiceless consonants, during which the glottis is fully open. It is thought that vowel devoicing is a consequence of articulatory undershoot of glottal movements. It has also been suggested that vowel devoicing processes are the results of glottal gestural overlap between voiceless consonants and short vowels. The movements of glottal muscles for the short high vowels /i/ and /u/ blend with those of the adjacent voiceless sounds or a pause (Jun 1993; Jun and Beckman 1994). In many languages, the process is also considered to be part of the vowel neutralization and reduction processes in which vowels are first reduced in duration and centralized in quality, typically in the unaccented position, and then eventually devoiced and/or deleted in fast or casual speech (Hyman 1975; Wheeler 1979; Dauer 1980; Kohler 1990).

The Japanese high vowels /i/ and /u/ also become voiceless when surrounded by voiceless consonants, or when preceded by a voiceless consonant and followed by a pause: i.e. /C̥VC̥/ or /C̥V#/ (where the Vs are [+high]). However, in Japanese the vowel devoicing processes do not involve apparent centralization of vowels. There is no obvious durational reduction of vowels in the unaccented positions in Japanese, nor does vowel quality depend on accentuation. Nevertheless, the vowel devoicing process is very common in many Japanese dialects, especially in eastern dialects including Standard Japanese. The process occurs even in slow or formal speech (Kondo 1997). This suggests that Japanese high vowel devoicing is not merely an optional process in fast or casual speech, but is also a phonologically controlled process.

Vowel devoicing means a lack of vocal fold vibration during the production of the high vowels /i/ and /u/ between voiceless sounds. This seems to be a natural process because it is not very economical for the vocal folds to vibrate during vowel production when the glottis is open for the preceding and following voiceless sounds. Studies have suggested that Japanese vowel devoicing can be affected by various phonetic and phonological factors, such as the type and combination of preceding and following consonants (Kuwabara and Takeda 1988; Yoshida and Sagisaka 1990), the presence of an accent on the vowel (Takeda and Kuwabara 1987), position in a word or utterance (Maekawa 1989; Takeda and Kuwabara 1987) and a following word boundary (Sakurai 1985). However, Kondo (1997) found that Japanese vowel devoicing was an almost obligatory process even when the vowel was accented and followed by an internal word boundary, so long as there were no devoiceable vowels in adjacent syllables (the single devoicing environment). Examples are ashita /asiˈta/ ‘tomorrow’, kikai /ˈkikai/ ‘machine’ and kusa /kuˈsa/ ‘grass’ (in these examples and future examples vowels that can be devoiced are shown in italics and underlined). Devoicing in two consecutive syllables can also occur to a certain extent in spontaneous speech (Maekawa and Kikuchi, this volume). However, when vowels in adjacent syllables are all devoiceable (the consecutive devoicing environment), as in kashitsuchishi /kasituˈtisi/ ‘accidental death’ and fukushikikokyuu /fukusikiˈkokjuu/ ‘abdominal breathing’, some vowels remain voiced. All studies agree that in the consecutive devoicing environment only some devoiceable vowels undergo the devoicing process. Therefore, the devoicing factors suggested above do not always result in devoiced vowels. When speaking tempo is altered, the effects of most of the suggested devoicing factors are minimal in the single environment.
If vowel devoicing is merely a consequence of articulatory undershoot or glottal gestural overlap between short high vowels and voiceless consonants, then the devoicing should occur more when speech rate increases. In Japanese, devoicing of high vowels seems to be almost compulsory at all tempi, as long as there are no devoiceable vowels in neighboring syllables. Speaking tempo has very little effect on devoicing rates. On the other hand, devoicing rates in consecutive devoicing environments vary according to the tempo. Devoicing rates of high vowels in prose texts are not necessarily high at a comfortable speaking tempo for all types of preceding consonants. But when consecutive devoicing environment data are excluded, the devoicing rates at all tempi significantly rise, and vowel devoicing seems to be almost compulsory (Kondo 1997).
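The distinction between single and consecutive devoicing environments can be made concrete with a small classifier. The Python sketch below is a simplified illustration, not an implementation used in the study. It assumes that a word is given as a list of (onset, nucleus) pairs, that the word is spoken in isolation (so that its final mora counts as pre-pausal), that adjacency is computed over morae rather than syllables, and that the set of voiceless onsets listed in the code is adequate; the moraic phonemes /N/ and /Q/ receive no special treatment. All names are hypothetical.

```python
# Assumed inventory for this sketch (not taken from the chapter).
VOICELESS_ONSETS = {"k", "s", "t", "h", "f", "p", "kj", "sj", "tj", "hj", "pj"}
HIGH_VOWELS = {"i", "u"}


def devoiceable_sites(morae):
    """Return indices of morae whose vowel is in a devoicing environment.

    A high vowel qualifies if its onset is voiceless and it is followed
    either by a voiceless onset or by the end of the word (a pause).
    """
    sites = []
    for idx, (onset, nucleus) in enumerate(morae):
        if nucleus not in HIGH_VOWELS or onset not in VOICELESS_ONSETS:
            continue
        is_last = idx == len(morae) - 1
        if is_last or morae[idx + 1][0] in VOICELESS_ONSETS:
            sites.append(idx)
    return sites


def classify_sites(morae):
    """Label each devoicing site 'single' or 'consecutive'."""
    sites = set(devoiceable_sites(morae))
    return {
        idx: "consecutive" if (idx - 1 in sites or idx + 1 in sites) else "single"
        for idx in sorted(sites)
    }


if __name__ == "__main__":
    asita = [("", "a"), ("s", "i"), ("t", "a")]
    kasitutisi = [("k", "a"), ("s", "i"), ("t", "u"), ("t", "i"), ("s", "i")]
    print(classify_sites(asita))
    print(classify_sites(kasitutisi))
```

Under these assumptions, ashita has one single site, whereas every high vowel of kashitsuchishi falls in a consecutive site.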


Devoicing rates indicate that vowels do not always become voiceless in phonetically ideal environments. The most important factor affecting vowel devoicing is whether there is a devoiceable vowel in adjacent syllables: i.e. single or consecutive environments. Almost all high vowels in single devoicing environments are devoiced, whereas in consecutive environments high vowels sometimes remain voiced. It has also been found that voiced vowels in devoicing environments are often acoustically different from the same vowels in non-devoicing environments. Acoustic analyses of high vowels in devoicing environments revealed that vowel devoicing is not always a clear-cut distinction of either voiced or voiceless; there are also many partially voiced/devoiced vowels. In fact, phonetic realizations of vowels in devoicing environments vary from fully voiced to completely voiceless. Despite being phonetically in the same condition, high vowels show different acoustic characteristics. This means that vowel devoicing does not simply indicate the presence or absence of vocal fold vibration, but involves fundamental acoustic changes of the vowels in the processes, and the degree of change is dependent on the phonological environment. Devoicing occurs mainly in the single devoicing environment. It sometimes occurs in two consecutive syllables, but never in three at a normal speaking tempo. Devoicing conditions in single and consecutive devoicing environments are phonetically identical. This implies that phonetic conditions are not the only conditions that affect devoicing. When phonetic and phonological conditions are in favor of vowel devoicing and the high vowels become voiceless, it is important to determine whether the acoustic changes of the vowels are simply a change of phonation or whether the processes involve other acoustic changes.
In this paper, the acoustic changes of vowels in devoicing environments will be examined with respect to vowel duration and intensity. Since devoicing rates differ significantly in single and consecutive environments, vowel quality will be examined under both environments. Based on the results of the acoustic study, the devoicing processes will then be analyzed in terms of syllable structure. The aim is to determine how Japanese vowel devoicing changes syllable structures, why some vowels in devoicing environments do not become voiceless and also why devoicing does not occur in consecutive syllables, especially when there are more than three consecutive syllables. From the results there will be a discussion as to whether Japanese vowel devoicing processes are part of the vowel weakening processes that produce devoicing in other languages.

2. Acoustic characteristics of vowels in the devoicing environment

2.1. Durations of Devoiced Morae

It is well known that Japanese speech rhythm is based on the mora. There is a tendency towards equalizing durations of morae (Campbell and Sagisaka 1991; Y. Sato 1993; Han 1994, etc.), and the duration of whole words or phrases is proportional to the number of morae in those words or phrases (Port et al. 1987). However, M. Beckman (1982) found that the durations of morae with devoiced vowels were significantly shorter than those of their voiced counterparts. If vowel devoicing simply means a change of vowel phonation from voiced to voiceless, then voiceless vowels should retain their duration. However, if devoicing is part of the vowel weakening process, durational reduction may occur. Therefore, the durations of devoiced morae were compared with the durations of their voiced counterparts in the same phonetic environment to examine whether devoiced vowels retain their durations.

In the experiment, six subjects pronounced 41 test words (containing 74 devoicing sites) three times each in random order. Their individual pronunciation of devoiceable vowels in the same words was not always consistent. For example, both /i/ vowels in /hooˈtiki/ ‘fire alarm’ are in the devoicing environment. The same speaker may devoice the first /i/ in one utterance and voice it in another utterance. When voicing of the same vowel varied, the duration of the mora with a voiceless vowel was compared with the duration of its counterpart with a voiced vowel. In order to minimize the effects of the various factors that control segmental duration such as (a) type of phonemes, (b) neighboring phonemes, (c) mora position in a breath group, and (d) speaking rate, durational comparisons were made only of the same mora in the same word uttered by the same speaker in utterance-internal positions. Data were collected from 738 recorded words (41 test words × 6 subjects × 3 pronunciations) containing 1332 devoiceable vowels (74 devoicing sites × 6 subjects × 3 pronunciations).
There were 45 devoicing sites that had voicing variations. All words with voicing variation (45 sites × 3 times = 135 high vowels) were segmented using Waves+ speech analysis software on a SUN workstation. The results showed that morae with voiceless vowels were significantly shorter in duration than those with voiced vowels [t(44)=8.49, p<.001] (Figure 1). The average duration ratio of devoiced morae to their /CV/ counterparts was 83.93% (SD 12.97). However, the durations of devoiced morae were significantly longer than the consonant portion of the corresponding /CV/ morae [t(44)=13.62, p<.001].
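The paired comparison reported here can be sketched as follows. This Python example is an illustration under stated assumptions: it computes a standard paired t statistic from matched duration measurements, and the durations in the usage section are invented for demonstration and are not data from the experiment. With 45 matched devoicing sites such a test has 44 degrees of freedom, which agrees with the t(44) reported above. The function name is hypothetical.

```python
import math


def paired_t(voiced, devoiced):
    """Return (t, df) for a paired t-test on matched measurements.

    `voiced` and `devoiced` hold durations of the same mora in the same
    word by the same speaker, one pair per devoicing site.
    """
    if len(voiced) != len(devoiced) or len(voiced) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [v - d for v, d in zip(voiced, devoiced)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((x - mean) ** 2 for x in diffs) / (n - 1)
    if variance == 0:
        raise ValueError("differences have zero variance; t is undefined")
    return mean / math.sqrt(variance / n), n - 1


if __name__ == "__main__":
    # Token counts reported in the text.
    assert 41 * 6 * 3 == 738
    assert 74 * 6 * 3 == 1332
    assert 45 * 3 == 135

    # Hypothetical mora durations in ms, for illustration only.
    voiced_ms = [120, 131, 118, 140]
    devoiced_ms = [101, 109, 100, 115]
    t, df = paired_t(voiced_ms, devoiced_ms)
    print(f"t({df}) = {t:.2f}")
```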

Figure 1. Average durational difference between devoiced morae and consonants and vowels in CV morae of all types of consonants

Figure 2. Average closure duration and the duration after release of stops and stop part of affricates in CV morae and devoiced morae

The closure durations of stops and the stop part of affricates in devoiced morae were compared with the closure durations of prevocalic stops and the stop part of prevocalic affricates in /CV/ morae. The average closure duration of stops and affricates in devoiced morae was not significantly different from that in /CV/ morae, as shown in Figure 2. However, the average duration of stops and affricates in devoiced morae excluding closure duration (i.e. after release of stop closure) was significantly shorter than that of /CV/ morae [t(31)=7.12, p<.005]. This means that vowel devoicing reduces the duration of the devoiced vowel but does not affect the duration of the preceding consonant. It is technically impossible to measure the duration of devoiced vowels after voiceless fricatives, since there is no way to tell the boundary between the voiceless vowel and the preceding voiceless fricative. Therefore, the durations of devoiced morae with fricatives were excluded from the data. The results indicated that when a vowel was devoiced, the vowel became shorter than its fully voiced counterpart, and as a result, the whole duration of the mora was reduced.1

2.2. Intensities of Vowels in the Devoicing Environments

The previous section demonstrated that there was durational reduction when vowels were devoiced. Devoicing rates showed that vowels in single devoicing sites were almost always devoiced, whereas only some vowels in consecutive devoicing sites became voiceless. If devoicing in single sites is a natural process, voiced high vowels in single sites are unnatural and therefore they may be acoustically different from the same vowels in non-devoicing environments. On the other hand, if only some high vowels undergo the devoicing process in consecutive sites, then it must be natural for voiced vowels in consecutive devoicing sites to retain the same acoustic qualities as in non-devoicing environments. Moreover, if devoicing is part of the vowel weakening process, voiced vowels in devoicing environments may show weakening of their intensities as well as durational reduction.

An experiment was conducted to measure intensities of voiced vowels in devoicing environments at three speaking tempi and to compare them with the intensities of voiced vowels in non-devoicing environments in the same words. Three subjects pronounced the six test words listed in (1a) and (1b), with single and consecutive devoicing sites, at slow, comfortable and fast tempi (devoiceable vowels are underlined in italic).

(1) a. /ta,i.sjo.ku.ˈte.a.te/ [taiɕokɯteate] ‘retirement allowance’
       /ka.mo.tu.ˈse,N.pa.ku/ [kamotsɯsempakɯ] ‘cargo boats’
       /ta.ka.sa.ki.ˈsi.mi,N/ [takasakiɕimiĩ] ‘the Takasaki citizens’
    b. /hu.ku.sjo.ku.ˈke,N.sa/ [ɸɯkɯɕokɯkensa] ‘dress inspection’
       /sjo.ku.hi.se.tu.ja.ku/ [ɕokɯçisetsɯjakɯ] ‘a cut in food expenses’
       /ha,i.sju.tu.ki.dju,N/ [haiɕɯtsɯkidʑɯɯ̃] ‘exhaust limit’

(Here dots /./ denote syllable boundaries, and commas /,/ denote mora boundaries.)


The total number of devoiceable vowels in single devoicing sites in the test words was 6 (as some of the test words contain more than 1 single site), and the number in consecutive devoicing sites was also 6, yielding 324 devoiceable vowels ([6 vowels + 6 vowels] × 3 rates × 3 repetitions × 3 speakers = 324 devoiceable vowels). The average intensities of voiced vowels in devoicing and non-devoicing environments, excluding word-initial and word-final morae, were calculated for the three tempi and individual subjects, and were compared using a t-test. Since speaking tempo was effective only in the single devoicing sites and not in the consecutive sites, the vowel intensities were compared by their devoicing environments using a t-test.

When the devoiceable vowels remained voiced in the single devoicing condition, their intensities were significantly lower than those of non-devoiceable vowels at all speaking tempi for all subjects (Table 1). One of the subjects (A) devoiced all devoiceable vowels at the normal tempo and only once voiced the underlined devoiceable vowel /u/ in /hukusjokuˈkeNsa/ at the slow rate. Therefore, no comparison was made of subject A’s data for the two tempi. This result was expected because when a vowel was voiced in a single devoicing environment, it was often partially voiced, i.e. the duration of the vowel tended to be shorter and its intensity was lower, which sometimes made it difficult to judge whether a vowel was actually voiced or voiceless.

Table 1. T-test results of intensity differences between voiced vowels in single devoicing and non-devoicing environments (one-tailed)

Subject  Tempo        Devoiceable (dB)  Non-devoiceable (dB)  df   p-value
A        fast         67.07             77.34                 3    p < 0.005
A        comfortable  N/A               N/A                   N/A  N/A
A        slow         (59.57)           (76.56)               N/A  N/A
B        fast         68.69             75.14                 2    p < 0.025
B        comfortable  72.63             77.64                 2    p < 0.05
B        slow         69.10             72.45                 11   p < 0.001
C        fast         69.69             76.27                 4    p < 0.025
C        comfortable  71.10             74.85                 5    p < 0.05
C        slow         70.32             72.93                 7    p < 0.001

(Devoiceable / Non-devoiceable: average intensity of devoiceable / non-devoiceable vowels.)

However, in the consecutive devoicing environments, when devoiceable vowels were voiced their intensities were not necessarily lower (Table 2). The intensity of voiced devoiceable vowels and the intensity of non-devoiceable vowels were significantly different only for Subject A at all tempi. There was also a significant difference at the normal tempo for Subject B, but not at fast or slow tempi, and not at any tempo for Subject C.

Table 2. T-test results of intensity differences between voiced vowels in consecutive devoicing and non-devoicing environments (one-tailed)

Subject  Tempo        Devoiceable (dB)  Non-devoiceable (dB)  df   p-value
A        fast         70.20             78.25                 5    p < 0.025
A        comfortable  72.37             75.60                 6    p < 0.005
A        slow         70.15             76.12                 9    p < 0.005
B        fast         71.57             73.83                 3    n.s.
B        comfortable  76.12             78.47                 13   p < 0.001
B        slow         72.03             72.63                 11   n.s.
C        fast         74.71             74.63                 10   n.s.
C        comfortable  74.46             74.79                 11   n.s.
C        slow         71.76             72.26                 12   n.s.

(Devoiceable / Non-devoiceable: average intensity of devoiceable / non-devoiceable vowels.)

The average intensity ratios of voiced devoiceable vowels of three speakers at three tempi in single and consecutive environments are presented in Figure 3. Subject A’s intensity data at normal and slow tempi in the single environment were excluded from the analysis as their intensities were not statistically compared (see Table 1).

Figure 3. The average intensity ratios of three speakers between voiced devoiceable vowels and non-devoiceable vowels in single and consecutive devoicing environments at three tempi

Intensity of sound is very sensitive and is influenced by various factors, such as neighboring sounds, pitch, stress and accentuation. Under the same conditions, the same vowel has higher intensity at a higher pitch than at a lower pitch, and also higher intensity in a stressed position than in an unstressed position. Moreover, different vowels have their own intrinsic intensities even when spoken with equal effort. Vowels made with a wider vocal tract have a higher intensity level than close vowels. For the same degree of opening, vowels with closer F1 and F2 have a higher intensity than vowels with F1 and F2 far apart; i.e. back vowels are a little more intense than front vowels. In Japanese, the F1 and F2 of the vowels [i], [e] and [ɯ] are relatively far apart while [a] and [o] have relatively close F1 and F2. In other words, the intensities of [i], [e] and [ɯ] are generally less than those of [a] and [o]. In this experiment, all devoiceable vowels were either [i] or [ɯ] with an inherently weak intensity. This may have lowered the average intensity ratios of devoiceable vowels against non-devoiceable vowels that are inherently greater in intensity.

It was extremely difficult to find ideal test words for comparing intensities in both devoicing and non-devoicing environments, and therefore the type of vowel tested was not always identical. Although there were differences between the intensities of voiced vowels in the devoicing and non-devoicing environments, this might simply have been due to the different types of vowels in the two environments. Under equal conditions, high vowels have intrinsically lower intensities than non-high vowels, and all vowels in the devoicing environment are high vowels. However, the following patterns were noted: (a) more intensity weakening at all tempi in the single devoicing environment than in the consecutive environment, (b) greatest intensity weakening at the fast tempo and least intensity weakening at the slow tempo in the single devoicing environment, and (c) no tempo effect on intensity in the consecutive devoicing environment.
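Patterns (a) and (b) can be illustrated from the mean intensities listed in Tables 1 and 2. The Python sketch below is a rough check only, under two assumptions that should be kept in mind: it uses simple differences in dB between non-devoiceable and devoiceable vowels rather than the intensity ratios plotted in Figure 3, and it averages the subject means without weighting them by the number of tokens. Subject A’s comfortable and slow tempi in the single environment are left out, as in the text. The names are hypothetical.

```python
# (devoiceable dB, non-devoiceable dB), copied from Table 1 (single sites).
SINGLE = {
    ("A", "fast"): (67.07, 77.34),
    ("B", "fast"): (68.69, 75.14),
    ("B", "comfortable"): (72.63, 77.64),
    ("B", "slow"): (69.10, 72.45),
    ("C", "fast"): (69.69, 76.27),
    ("C", "comfortable"): (71.10, 74.85),
    ("C", "slow"): (70.32, 72.93),
}

# Same layout, copied from Table 2 (consecutive sites).
CONSECUTIVE = {
    ("A", "fast"): (70.20, 78.25),
    ("A", "comfortable"): (72.37, 75.60),
    ("A", "slow"): (70.15, 76.12),
    ("B", "fast"): (71.57, 73.83),
    ("B", "comfortable"): (76.12, 78.47),
    ("B", "slow"): (72.03, 72.63),
    ("C", "fast"): (74.71, 74.63),
    ("C", "comfortable"): (74.46, 74.79),
    ("C", "slow"): (71.76, 72.26),
}


def mean_deficit_by_tempo(table):
    """Return {tempo: unweighted mean of (non-devoiceable - devoiceable) dB}."""
    deficits = {}
    for (_subject, tempo), (dev, non_dev) in table.items():
        deficits.setdefault(tempo, []).append(non_dev - dev)
    return {tempo: sum(v) / len(v) for tempo, v in deficits.items()}


if __name__ == "__main__":
    single = mean_deficit_by_tempo(SINGLE)
    consecutive = mean_deficit_by_tempo(CONSECUTIVE)
    for tempo in ("fast", "comfortable", "slow"):
        print(f"{tempo}: single {single[tempo]:.2f} dB, "
              f"consecutive {consecutive[tempo]:.2f} dB")
```

On this crude measure the deficit is larger in the single environment at each tempo, and within the single environment it is largest at the fast tempo and smallest at the slow tempo.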

Vowels in the devoicing environments are not only shorter but also have less intensity than non-devoiceable vowels. In other words, vowels in the devoicing environments are first reduced in duration and intensity, and then further devoiced. In extreme cases, the vowels are deleted.

3. Syllable constraints on vowel devoicing

High vowels are almost always devoiced in the single devoicing environment, whereas devoicing is not a compulsory process in the consecutive devoicing environment. Also, the results presented in the previous section indicate that voiced vowels in single devoicing environments are acoustically short and weak whereas in consecutive devoicing environments they have full duration and full voicing, despite being in phonetically identical conditions. In other words, the devoicing process must be controlled by more than just phonetic factors. Moreover, there are no definite voicing-devoicing patterns of preceding and following consonant types, although there is a tendency for preceding fricatives to trigger devoicing more than stops or affricates, in both single and consecutive devoicing environments (Maekawa and Kikuchi, this volume). Physiological studies found that movements of laryngeal muscles during the production of a voiceless consonant and a following high vowel blend better when the preceding consonant is a fricative than when it is a stop or affricate (Yoshioka 1981). Devoicing rates of vowels preceded by fricatives are high in single devoicing sites, but they are not necessarily high in consecutive devoicing environments. This means that devoicing is controlled by the presence of devoiceable vowels in adjacent syllables in addition to phonetic factors. The fundamental difference between the two environments is that only in the single devoicing environment is it possible for a preceding consonant to be resyllabified to an adjacent syllable after a vowel becomes voiceless, i.e. a change of syllable structure.
When a vowel becomes voiceless, it is acoustically manifested either as the continuation of a preceding fricative or as a preceding stop released into a fricative, thus creating sequences of voiceless consonants. Japanese syllables are predominantly light open syllables /(C)V/. Consonant clusters do not occur within a syllable except for the rare occurrence of /–NC/ sequences in syllable coda position, e.g. /hoNtte/ ‘Books are...’ and /abadi:Nkko/ ‘Aberdonian’. They can also occur in a /Cj-/ sequence in syllable onset position if we consider the glide /j/ as a consonant, as in /tja/ ‘tea’ and /gjoo/ ‘line’. When the voiceless vowel loses its sonority, the preceding consonant in the devoiced syllable cannot constitute a syllable on its own. The syllable structure of a word is altered as a result of vowel devoicing. In addition to the mora, the syllable is important in Japanese as an accent-bearing unit, and constrains various phonological processes, such as the formation of loan words (Itô 1990; Shinohara 1997). Vowel devoicing removes the syllabicity and moraic status of devoiced morae and changes the syllable structure of a word by creating consonant sequences.

The phonological structure of the word akikan /a.ki.kaN/ [akikaN] (unaccented) ‘empty can’ with devoiced [i̥] is represented as (2a)2. When the high vowel /i/ becomes voiceless, the devoiced mora /kç/ cannot be attached to the syllable node because the syllable has lost its core element. Therefore, the second syllable cannot sustain its status as a syllable (2b). Then the remaining /kç/ becomes non-moraic because devoiced morae are not long enough to be considered as a mora (see Section 2.1). The quality of the devoiced vowel /i/ is reflected in the palatalization of the preceding consonant as /kç/, but duration is one of the important characteristics of the mora in Japanese. Hence, the non-moraic /kç/ is syllabified to the coda of the preceding syllable /a/ (2c). The process creates a bimoraic heavy syllable /VC/ (which is permitted in Japanese), and reduces the number of syllables and morae of the word. (“σ” denotes a syllable, “μ” denotes a mora.)

(2a) akikan /akikaN/ [aki̥kaã]3

(2a)–(2c) [Tree diagrams showing the syllable (σ), mora (μ), CV and segmental tiers of akikan: (2a) with devoiced [i̥], (2b) with the devoiced mora /kç/ detached from the syllable node, and (2c) with the non-moraic /kç/ syllabified to the coda of the preceding syllable.]

When devoicing occurs at the beginning of a word, the consonant in the devoiced syllable is syllabified to the onset of the following syllable because this is the only possible place it can move to, e.g. kita /ki.ta/ [ki̥ta] (unaccented) ‘north’ and hikari /hikari/ [çi̥kaˈɾi] ‘light’. As shown in (3a–c), when the vowel /i/ becomes voiceless in the word kita /ki.ta/ (3a), the sequence [ki̥] loses its syllabic status and becomes /kç/ (3b). Then the /kç/ becomes non-moraic because devoiced morae are not long enough to be considered fully moraic. Hence, the non-moraic /kç/ is syllabified to the onset of the following syllable /ta/ (3c). Syllable-onset consonant clusters are not very common in Japanese, but they do occur in /Cj-/ sequences.

[Diagrams (3a)–(3c): mora (μ) and CV-tier representations of kita. (3a) kita /kita/ [ki̥ta]; (3b) [kçta], with /kç/ having lost its syllabic status; (3c) [kçta], with the non-moraic /kç/ syllabified to the onset of /ta/.]

Devoicing occurs even when a devoiceable vowel is in an accented syllable. Devoicing of an accented vowel can be blocked or avoided by shifting the accent to another syllable (McCawley 1977; Sakurai 1985; Vance 1987). However, vowel devoicing in accented syllables can also occur in normal speech (N. Hattori 1989). As mentioned earlier, in Japanese, syllables carry the lexical accent. When the vowel in an accented syllable is devoiced, and its preceding consonant cannot sustain its syllabic status, it can no longer carry an accent. Even if the preceding consonant remains moraic, the mora is not an accent-bearing unit. For instance, the /i/ of the second syllable in the word shokikan /sjo.ˈki.kaN/ ‘cabinet secretary’ is devoiceable. When it is devoiced and loses its syllabicity, the preceding /k/ cannot simply be syllabified to the preceding syllable /sjo/ as in (4a). This is because it was the syllable /ki/ that carried the accent in the word, and the accent-bearing syllable has now been lost. The preceding /k/ has to be syllabified to the following syllable /kaN/. This analysis seems appropriate because the accented syllable then matches the acoustic manifestation of the accent. The acoustic cue for the lexical accent is the fall of the fundamental frequency (F0) from an accented vowel to the following syllable. The perceptual cue for accent on a devoiced vowel is the unusually high starting F0 of the following vowel, which then falls very sharply (Sugito and Hirose 1988). The first /k/ is desyllabified, becomes non-moraic as in (4b), and then is resyllabified to the onset of the following syllable /kaN/, creating the superheavy syllable /kçkaN/ (/CCVC/) as in (4c). The acoustic cue of the lexical accent is manifested in that syllable.

[Diagrams (4a)–(4c): mora (μ) and CV-tier representations of shokikan. (4a) */sjoˈkikaN/ [ɕoˈki̥kaN]; (4b) demoraification of /kç/; (4c) resyllabification.]

When there is only one devoiced vowel in a non-word-initial syllable, the preceding consonant in the same mora can be syllabified to its preceding syllable. In the case of word-initial position or in an accented syllable, it can be syllabified to the following syllable. Therefore, vowel devoicing in the single devoicing environment is always possible.⁴ However, in the consecutive devoicing environment, not all devoiceable vowels can become voiceless. For example, both /u/ vowels are devoiceable in the word dookutsu /dookutu/ (unaccented) ‘cave’, and common pronunciations for the word are [do:kɯ̥tsɯ] with the first /u/ devoiced and [do:kɯtsɯ] with both /u/ vowels voiced. In an earlier study the pronunciation [do:kɯ̥tsɯ] occurred in 16 out of 18 samples (Kondo 1997). The first pronunciation is possible because /k/ in /ku/ can be syllabified to the following syllable as shown in (5a). The process in (5b) may also be plausible. However, the first syllable there is a superheavy /CVVC/, which is less favored in Japanese. The morphological structure must also be considered because, strictly speaking, the word dookutsu is a compound word (‘doo’ + ‘kutsu’), and very few native speakers would analyze the word as ‘dooku’ + ‘tsu’. Therefore, it would be more sensible to analyze the process as (5a). The vowel /u/ of /ku/ becomes voiceless, then is desyllabified, becoming non-moraic, and finally the remaining consonant is resyllabified to the following syllable /tu/.

On the other hand, in normal speech it is not possible to pronounce *[do:kɯ̥tsɯ̥] with two consecutively devoiced vowels. As shown in (5c), /k/ in /ku/ can be syllabified to the preceding syllable, creating /CVVC/, but /t/ in /tu/ cannot, because it would create the sequence */CVVCC/. This is not considered to be an acceptable superheavy syllable, as the second-last consonant is not a moraic nasal. Therefore, the pronunciation [do:kɯtsɯ] with both vowels voiced is more favorable than syllable-final obstruent clusters.

[Diagrams (5a)–(5c): mora (μ) and CV-tier representations of dookutsu. (5a) /doo#kutu/ [do: kɯ̥tsɯ]; (5b) */dooku#tu/ [do: kɯ̥tsɯ]; (5c) *[do: kɯ̥tsɯ̥].]

[Diagrams (6a)–(6b): mora (μ) and CV-tier representations of kutsushita. (6a) the first and third vowels devoiced; (6b) *[kɯ̥ˈtsɯ̥ɕi̥ta], with all three vowels devoiced.]

In the word kutsushita /kuˈtusita/ ‘sock(s)’, where all three high vowels are devoiceable, the consonants preceding the devoiceable vowels are the stop [k], the affricate [ts] and the fricative [ɕ]. The pronunciation [kɯ̥ˈtsɯɕi̥ta], with the first and third vowels devoiced and the second vowel voiced, is most common. This process can also be explained in relation to the syllable structure. As shown in (6a) and (6b), when the first vowel /u/ in /ku/ becomes voiceless, the preceding /kx/ is desyllabified and becomes non-moraic, and then is syllabified to the onset of the following syllable. The third vowel /i/ also becomes voiceless, and the preceding /s/ [ɕ] is also syllabified to the onset of the following syllable. The process creates sequences of less common syllables /CCV/+/CCV/, but it is still better than devoicing in three consecutive morae. Triple devoicing is not acceptable, as shown in (6b). The first two consonants /kx/ and /ts/ cannot be syllabified to the following syllable, because it would create quadruple-consonant clusters */CCCCV/. Moreover, in the word /kuˈtusita/ ‘sock(s)’, the syllable /tu/ carries the lexical accent. It is most logical to leave the vowel of /tu/ voiced rather than creating an awkward heavy syllable */tsta/ [tsɕta] or */ktsta/ [kxtsɕta]. Alternatively, it is also acceptable to pronounce this word with all vowels voiced [kɯˈtsɯɕita], but never with all three vowels devoiced.

The devoicing processes can be explained as (1) change of phonation, (2) durational reduction, (3) loss of syllabicity, (4) demoraification, and (5) resyllabification. For analyses of other syllable structures, such as cases involving devoicing before geminate consonants and the acceptability of bimoraic syllable onsets, refer to Kondo (2001).
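The cluster restrictions discussed in this section can be illustrated with a small sketch. It is not the analysis itself, only a simplified check under stated assumptions: each mora is a hypothetical (consonant, vowel) pair in romanized form, affricates count as one consonant, at most two adjacent consonants are tolerated (cf. */CCCCV/), and at most one consonant may be stranded word-finally (cf. */CVVCC/). Accent, morphological structure and the preference against superheavy syllables are ignored, so the check also accepts patterns that the text does not discuss.

```python
from itertools import combinations

MAX_CLUSTER = 2  # assumption: no more than two adjacent consonants
MAX_FINAL = 1    # assumption: at most one stranded word-final consonant


def consonant_runs(morae, devoiced):
    """Return (runs before each voiced vowel, length of the word-final run).

    morae    -- list of (consonant, vowel) pairs; "" marks an empty slot
    devoiced -- set of mora indices whose vowel is treated as voiceless
    """
    runs, run = [], 0
    for idx, (consonant, vowel) in enumerate(morae):
        if consonant:
            run += 1
        if vowel and idx not in devoiced:
            # A voiced vowel gives the pending consonants a syllable nucleus.
            if run:
                runs.append(run)
            run = 0
    return runs, run


def licit_patterns(morae, devoiceable):
    """List every subset of devoiceable vowels that leaves only licit clusters."""
    result = []
    for size in range(len(devoiceable) + 1):
        for pattern in combinations(devoiceable, size):
            runs, final = consonant_runs(morae, set(pattern))
            if all(r <= MAX_CLUSTER for r in runs) and final <= MAX_FINAL:
                result.append(pattern)
    return result


# Hypothetical romanized inputs for two words discussed in the text.
kutsushita = [("k", "u"), ("ts", "u"), ("sh", "i"), ("t", "a")]
dookutsu = [("d", "o"), ("", "o"), ("k", "u"), ("ts", "u")]

print(licit_patterns(kutsushita, [0, 1, 2]))
print(licit_patterns(dookutsu, [2, 3]))
```

Under these assumptions the sketch rules out triple devoicing in kutsushita and double devoicing in dookutsu, while admitting devoicing of the first and third vowels of kutsushita, in line with the patterns described above.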

4. Conclusions

Vowel devoicing is fundamentally a phonetic process that economizes glottal movements of a short high vowel and its surrounding voiceless consonants. However, Japanese vowel devoicing processes are also affected by various phonological factors, especially the syllable structure. The experimental results showed that high vowels in the single devoicing sites were almost always devoiced, but not all devoiceable vowels became voiceless in the consecutive devoicing sites. Vowels in typical devoicing sites became voiceless only when the consonants preceding the devoiced vowels could be syllabified to their adjacent syllables. Moreover, voiced high vowels in typical devoicing environments were often not fully voiced and were reduced in duration. These voiced devoiceable vowels were not only shorter but also had less intensity. This means that it is more natural for high vowels to undergo the devoicing process between voiceless sounds. Therefore, when they did remain voiced, they were acoustically shorter and weaker than when they occurred in the non-devoicing environment.

The results also suggest that vowel devoicing is part of a vowel weakening process, whose final state is a completely voiceless vowel or, in an extreme case, deletion of the vowel. Vowel weakening in Japanese affects vowel intensity and duration, but the quality of the vowels remains relatively unchanged regardless of the intensity level of the vowel.

Two different mechanisms, namely phonetic and phonological processes, appear to control Japanese vowel devoicing. The vowel devoicing environment must satisfy certain phonetic conditions: namely, short vowels surrounded by voiceless consonants or by a voiceless consonant and a pause. Even when a phonetic environment favors devoicing, devoicing may be blocked if constrained by syllable structures. When syllable structures do not block vowel devoicing, other factors such as the type of adjacent consonants, the presence of lexical accent, and word boundaries become effective.
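The phonetic conditions stated above (a short high vowel between voiceless consonants, or between a voiceless consonant and a pause) can be sketched as a simple search for devoiceable sites. The romanized consonant inventory, the (consonant, vowel) mora representation and the treatment of the word end as a pause are all assumptions made for illustration; the labels "single" and "consecutive" follow the two environments distinguished in this chapter.

```python
HIGH_VOWELS = {"i", "u"}
# Assumption: a romanized inventory of voiceless consonants, for illustration only.
VOICELESS = {"p", "t", "k", "s", "sh", "h", "f", "ts", "ch"}


def devoiceable_sites(morae):
    """Return indices of morae whose vowel sits in a devoicing environment."""
    sites = []
    for idx, (consonant, vowel) in enumerate(morae):
        if vowel not in HIGH_VOWELS or consonant not in VOICELESS:
            continue
        is_last = idx == len(morae) - 1
        # Followed by a pause (here: the word end) or by a voiceless consonant.
        if is_last or morae[idx + 1][0] in VOICELESS:
            sites.append(idx)
    return sites


def classify_sites(morae):
    """Label each site 'single' or 'consecutive' by its neighbouring morae."""
    sites = devoiceable_sites(morae)
    labels = {}
    for idx in sites:
        adjacent = (idx - 1 in sites) or (idx + 1 in sites)
        labels[idx] = "consecutive" if adjacent else "single"
    return labels


# Hypothetical romanized inputs; "N" stands for the moraic nasal.
akikan = [("", "a"), ("k", "i"), ("k", "a"), ("N", "")]
kutsushita = [("k", "u"), ("ts", "u"), ("sh", "i"), ("t", "a")]

print(classify_sites(akikan))
print(classify_sites(kutsushita))
```

A long vowel is represented as two morae, the second with an empty consonant slot, so its first half is never followed by a voiceless consonant and is not reported as a site. Whether a reported site actually devoices depends on the syllable-structure constraints discussed in Section 3.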

Acknowledgement

The author would like to thank an anonymous reviewer for useful comments and suggestions.

Notes

1. The presence of one devoiced vowel in a word did not affect the duration of a whole word. Despite the shorter duration of devoiced morae, the whole durations of words did not show a significant difference from words of the same number of morae without devoiced vowels. However, when there is more than one devoiced vowel in a word, the whole word duration becomes significantly shorter. See Kondo (2003) for details.

2. The mora tier usually represents an alternative rather than an addition to the CV tier, and onset consonants are attached directly to the syllable node as they are non-moraic (Hayes 1989; Kenstowicz 1994). However, I use a separate CV tier in order to present clearly the formation of moraic consonants and the resulting syllable structures.

3. Pseudo-phonemic transcriptions are used to describe devoiced vowels and resulting moraic consonants for convenience. The examples are the devoiced vowels /i̥/ and /u̥/, the allophones of consonants /ɕ/ instead of /s/ and /sju/ (in /si/ and /sju/), /ç/ and /ɸ/ instead of /h/ (in /hi/ and /hu/), /tɕ/ instead of /t/ (in /ti/) and /ts/ instead of /t/ (in /tu/). /kç/ was also used to indicate palatalization of /k/ and its release into a palatal fricative [ç] in /ki/, and [kx] for /k/ in /ku/ to indicate backness of the /k/ and its release into a velar fricative [x].

4. For the arguments concerning vowel devoicing and the loss of syllabicity, see Kondo (2001).

The effect of speech rate on devoiced accented vowels in Osaka Japanese

Miyoko Sugito

1. Introduction

This paper discusses the results of acoustic and physiological experiments on the effects of speech rate on devoiced accented vowels in Osaka Japanese. The close vowels /u/ and /i/ are often devoiced between voiceless consonants in many dialects of Japanese. Word accent on moras with devoiced vowels is generally shifted to the following moras in the Tokyo dialect and other dialects, as shown in accent dictionaries (NHK 1998; Kindaichi and Akinaga 1997). However, in the Osaka, Kyoto and other dialects in the Kansai district, the close vowels /u/ and /i/ preceding open vowels such as /a e o/, as in /kusa/ ‘grass’, /sita/ ‘tongue’ or /sika/ ‘deer’, are often both devoiced and accented in natural or fast speech. Using the results of acoustic and physiological experiments, this paper explores how devoiced accented vowels are produced, and also how accent change occurs in those words when they are produced at different speech rates.

2. Previous studies on word accent in Osaka Japanese

2.1. Word accent in two-mora words in Osaka Japanese

Most dialects of Japanese have a moraic word pitch accent. Two-mora words in the Osaka dialect have four kinds of accent: HH, HL, L-HL, LH, where H and L represent high and low pitch. The accent-type -HL refers to a descending pitch from high to low within a single mora. Wada (1947) classified these pitch accent patterns into two types: high-starting and low-starting. These differences are correlated with physiological differences (Sugito and Hirose 1978), as reported in section 5.1 of this article.


2.2. Words with devoiced accented vowels

The devoiced, accented vowel has been a central topic in discussions on whether the Japanese language has pitch accent or not. S. Hattori (1960) and Kawakami (1969) reported that devoiced vowels might be heard as accented because of the greater intensity in that mora. Polivanov (1928) was the first to report a devoiced accented word on the basis of fieldwork carried out in 1914 and 1915, viz. /kita/ ‘north’ in the Mie dialect in Nagasaki. He explained that only /a/ had a falling F0 contour, while the /kit/ portion had no vocal cord vibration. S. Hattori (1928) reported that words with devoiced accented initial vowels in the Kansai dialect, such as /sita/ ‘tongue’ and /sika/ ‘deer’, supposedly had final vowels with falling tones, where “initial” refers to “in the initial mora or syllable” and “final” refers to “in the final mora or syllable”. Sakuma (1931) extracted F0 contours of words such as these and compared them with words with the accent type L-HL in the Kansai dialect. However, he failed to find falling F0 contours on the final vowels, and concluded that the accent type in these words was not HL or L-HL but LH, insisting, incidentally, that “experiments on Japanese accent were useless”.

Sugito (1969) extracted fundamental frequencies of 556 words of 1 to 6 moras produced by both native Tokyo and Osaka dialect speakers. The results showed that the falling F0 contours of the following vowels influenced perception of the preceding vowels as accented (Sugito 1969). Vowels following devoiced accented vowels had more sharply falling F0 contours than those following accented voiced vowels (Sugito 1969/1970). The abrupt falling F0 contour on the second vowel plays an important role in listeners perceiving an accent on the first devoiced vowel (Sugito 1969, 1982). Maekawa (1990), Matsui (1993) and M. Kitahara (1998) ran follow-up acoustic and perceptual experiments which showed similar results.
In the production and perception of devoiced accented vowels, speech rate may also play an important role. This paper reports on experiments that examine the relationship between speech rate and the production of devoiced accented vowels.

3. Experimental procedures

3.1. Acoustic experiments

Two-mora words were uttered in two frame sentences in randomized order seven times each, at slow, natural, fast and very fast rates, respectively. The words were /kusa/ ‘grass’, /kuse/ ‘habit’, /sita/ ‘tongue’ and /sika/ ‘deer’ with a close-open vowel sequence, /kasa/ ‘bulk’ with an open-open vowel sequence, and /kusi/ ‘comb’ with a close-close vowel sequence. (Note: The terms ‘close’ and ‘open’ vowels are used instead of ‘high’ and ‘low’ to avoid confusion with the H or L notation for word accent and also with high or low fundamental frequencies.) All of these words have the accent type HL. The word /huta/ (close-open) ‘lid’ with accent HH was also included. The two sentence frames were (A) “Kore-wa….” (“This is….”) with HHH accent, and (B) “Tsugi-wa….” (“The next is….”) with HLL accent.

The speakers were one male (MN), born in 1930, and two females (KK and TA), born in 1963 and 1979, respectively. All three speakers were born and raised in Osaka Prefecture. The accent of each word in a sentence frame was presented auditorily five times in randomized order. The listeners were three female Osaka dialect speakers who had studied Japanese accent. An accent type was assigned when there was 90% agreement among the three listeners.

Acoustic analysis was done using the SUGI SpeechAnalyzer (Sugito 2000). Presence or absence of voicing in the first mora of tokens of /kusa/ was decided on the basis of speech waves and spectrograms. To determine speech rate, durations of the words and the second vowel /a/ were measured. In addition, F0 contours were extracted.
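The 90% agreement criterion can be sketched as follows. How the judgments were pooled is not specified in the text; the sketch assumes, purely for illustration, that the judgments of the three listeners over the five presentations (15 in all) are pooled and that the most frequent label is accepted only if it reaches the threshold. The function name and the pooling are hypothetical.

```python
from collections import Counter


def assign_accent(judgments, threshold=0.9):
    """Return the agreed accent label, or None if agreement is below threshold.

    judgments -- pooled accent labels, e.g. 3 listeners x 5 presentations
    threshold -- minimum share of judgments the winning label must reach
    """
    if not judgments:
        return None
    label, count = Counter(judgments).most_common(1)[0]
    return label if count / len(judgments) >= threshold else None


# Synthetic judgments, not data from the experiment.
print(assign_accent(["HL"] * 14 + ["HH"]))      # 14 of 15 agree
print(assign_accent(["HL"] * 13 + ["HH"] * 2))  # 13 of 15 agree
```

With 15 pooled judgments, a 90% threshold tolerates at most one dissenting judgment.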

3.2. Physiological experiments

The material for the physiological experiments used in this paper was made available at the University of Tokyo Research Institute of Logopedics and Phoniatrics. The subject was a female Osaka dialect speaker, YI, born in 1950. Electromyographic recordings were made for twelve randomizations of the words /imi/ (four accent types) and /kusa/, /kusi/, etc. (accent HL), using hooked wire electrodes inserted into the cricothyroid (CT) and sternohyoid (SH) muscles (Sugito and Hirose 1978, 1988).

4. Results of the acoustic experiments

4.1. Effect of speech rate on voicing and perception of pitch accent

The results of acoustic analysis of the words /kusa/, /kuse/, /sita/, and /sika/ (close-open vowel sequences) produced at natural or fast rates were similar, their initial vowels often being devoiced and accented, although in slow speech all of the words were produced with the first vowels voiced. In this section, the results of /kusa/ uttered by speakers MN, KK, and TA are examined.

Table 1 shows the results for /kusa/ produced in the sentence frame (A) “kore-wa (HHH) kusa (HL)” by three speakers at three different rates: natural, fast and very fast. The table shows the voicing status of the initial vowel /u/, the perceived type of accent, the number of times that accent was perceived as such out of seven utterances, the averaged durations and standard deviations of the words, and the averaged durations and standard deviations of the second vowel /a/. The following is a description of the results for each of the three speakers.

Table 1. Analyzed results of utterances /kusa/ in carrier sentence (A), spoken by (1) MN, (2) KK, and (3) TA, at natural, fast, and very fast rates.

Speaker | Rate | 1st vowel devoiced or not | Type of accent | Times | Duration of /kusa/, mean (SD) | Duration of /a/, mean (SD)
(1) MN (1930) | natural | +V | HL | 7 | 305.0 (34.3) | 65.9 (11.1)
(1) MN (1930) | fast | +V | HL | 5 | 275.0 (24.2) | 79.0 (6.6)
(1) MN (1930) | fast | -V | HL | 2 | 266.0 (36.8) | 78.0 (2.8)
(1) MN (1930) | very fast | +V | HH* | 1 | 202.0 (-) | 71.0 (-)
(1) MN (1930) | very fast | -V | HH* | 6 | 165.5 (19.7) | 65.0 (10.7)
(2) KK (1963) | natural | -V | HL | 7 | 405.6 (42.2) | 118.7 (22.0)
(2) KK (1963) | fast | -V | HL | 7 | 334.7 (15.3) | 104.0 (14.2)
(2) KK (1963) | very fast | -V | HL | 6 | 275.5 (36.8) | 98.0 (18.2)
(2) KK (1963) | very fast | -V | HH* | 1 | 224.0 (-) | 47.0 (-)
(3) TA (1979) | natural | +V | HL | 3 | 338.0 (40.1) | 142.3 (35.8)
(3) TA (1979) | natural | -V | HL | 4 | 334.0 (44.2) | 160.8 (35.4)
(3) TA (1979) | fast | -V | HL | 7 | 280.9 (32.7) | 128.4 (29.2)
(3) TA (1979) | very fast | -V | HL | 5 | 187.0 (14.6) | 65.0 (12.6)
(3) TA (1979) | very fast | -V | HH* | 2 | 175.0 (7.1) | 54.5 (2.1)
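As a reading aid for Table 1, the token counts can be tallied by speech rate. The sketch below only re-adds the counts as read from the table (sentence frame (A), seven tokens per speaker and rate); the tuple layout and the function name are assumptions made for illustration, and HH stands for the tokens marked HH* in the table.

```python
from collections import defaultdict

# (speaker, rate, first vowel voiced?, perceived accent, number of tokens),
# as read from Table 1.
TABLE_1 = [
    ("MN", "natural", True, "HL", 7),
    ("MN", "fast", True, "HL", 5),
    ("MN", "fast", False, "HL", 2),
    ("MN", "very fast", True, "HH", 1),
    ("MN", "very fast", False, "HH", 6),
    ("KK", "natural", False, "HL", 7),
    ("KK", "fast", False, "HL", 7),
    ("KK", "very fast", False, "HL", 6),
    ("KK", "very fast", False, "HH", 1),
    ("TA", "natural", True, "HL", 3),
    ("TA", "natural", False, "HL", 4),
    ("TA", "fast", False, "HL", 7),
    ("TA", "very fast", False, "HL", 5),
    ("TA", "very fast", False, "HH", 2),
]


def summarize(rows):
    """Return {rate: (devoiced tokens, tokens perceived as HH, all tokens)}."""
    totals = defaultdict(int)
    devoiced = defaultdict(int)
    shifted = defaultdict(int)
    for _speaker, rate, voiced, accent, count in rows:
        totals[rate] += count
        if not voiced:
            devoiced[rate] += count
        if accent == "HH":
            shifted[rate] += count
    return {rate: (devoiced[rate], shifted[rate], totals[rate]) for rate in totals}


for rate, (dev, hh, total) in summarize(TABLE_1).items():
    print(f"{rate:>9}: devoiced {dev}/{total}, perceived HH {hh}/{total}")
```

Pooled over the three speakers, the share of devoiced first vowels rises with speech rate, and tokens perceived as HH appear only at the very fast rate.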


(1) MN results: At a natural rate of speech, MN uttered the first mora of /kusa/ voiced (+V) with accent HL. At a fast rate, five tokens of the initial vowel /u/ were accented and voiced, while two were devoiced (-V). However, when he was urged to speak faster, six of the /u/ vowels were devoiced, and all of the words were heard as High-High accented (HH* in the table): the final /a/ vowels were all heard by the Osaka speakers to be as high as the first vowels /u/.

(2) KK results: KK produced /u/ as devoiced and accented in natural, fast, and very fast speech. One exception occurred in very fast speech, in which /u/ was voiceless and the accent was perceived as HH.

(3) TA results: TA produced the /u/ as devoiced and accented 4 out of 7 times in natural speech, and all of the time in fast speech and very fast speech. Accent shift occurred twice when spoken at a very fast rate.

Notice that the durations of /a/ of all the tokens that were perceived as accent HH are very short.

Table 2. Analyzed results of utterances /kusa/ in carrier sentence (B), spoken by (1) MN, (2) KK, and (3) TA, at natural, fast, and very fast rates.

Speaker | Rate | 1st vowel devoiced or not | Type of accent | Times | Duration of /kusa/, mean (SD) | Duration of /a/, mean (SD)
(1) MN (1930) | natural | +V | HL | 7 | 287.1 (23.3) | 64.7 (10.3)
(1) MN (1930) | fast | +V | HL | 3 | 277.7 (13.9) | 75.3 (16.2)
(1) MN (1930) | fast | -V | HL | 4 | 272.8 (22.5) | 74.0 (4.6)
(1) MN (1930) | very fast | -V | HH* | 7 | 163.9 (19.9) | 60.3 (10.0)
(2) KK (1963) | natural | -V | HL | 7 | 388.1 (21.1) | 131.9 (14.9)
(2) KK (1963) | fast | -V | HL | 6 | 331.2 (11.7) | 102.2 (12.0)
(2) KK (1963) | fast | -V | HH* | 1 | 311.0 (-) | 77.0 (-)
(2) KK (1963) | very fast | -V | HL | 5 | 282.6 (15.2) | 103.8 (5.0)
(2) KK (1963) | very fast | -V | HH* | 2 | 261.0 (15.6) | 90.5 (2.1)
(3) TA (1979) | natural | +V | HL | 1 | 393.0 (-) | 176.0 (-)
(3) TA (1979) | natural | -V | HL | 6 | 341.8 (23.0) | 170.0 (24.9)
(3) TA (1979) | fast | -V | HL | 7 | 270.0 (33.7) | 122.4 (22.8)
(3) TA (1979) | very fast | -V | HH* | 7 | 169.6 (38.1) | 60.0 (21.8)


Table 2 shows the data for sentence frame (B) “Tsugi-wa (HLL) kusa (HL).” Looking at the table, we see that MN and TA devoiced and accented /u/ in /kusa/ more often in sentence frame (B) than in sentence frame (A). KK devoiced and accented all words. For TA, both voiced and devoiced vowels were found in both sentence frames; however, accent changes occurred more often in (B) than in (A).

The results of the acoustic analysis may be summarized as follows:

(1) Individual differences were observed. Speaker MN tended to produce the first mora voiced and accented. However, for the younger speakers, KK devoiced all the first mora vowels, while TA usually, but not always, devoiced and accented them.

(2) Speech rate affected vowel voicing and accentedness. In fast speech, devoiced, accented vowels were observed more often than in natural speech. At a very fast speech rate, not only was the first mora vowel devoiced, but also the word accent tended to change to HH.

(3) The sentence frame affected the accent patterns. Accent change was more often observed in frame (B) (accent HLL) than in (A) (accent HHH). A reason may be that when they spoke at a fast rate, it was more difficult for speakers to make the necessary laryngeal adjustments to raise the pitch for the accent HL immediately after the falling tones of “Tsugi-wa (HLL)”.
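Point (3) can be checked against the counts in Tables 1 and 2. The sketch below simply adds up, per sentence frame, the tokens perceived as HH (marked HH* in the tables); the dictionary layout and function name are assumptions made for illustration, and the total of 63 tokens per frame follows from three speakers, three rates and seven repetitions.

```python
TOKENS_PER_FRAME = 3 * 3 * 7  # speakers x rates x repetitions

# Tokens perceived as HH, summed over rates, as read from Tables 1 and 2.
HH_COUNTS = {
    "A": {"MN": 7, "KK": 1, "TA": 2},
    "B": {"MN": 7, "KK": 3, "TA": 7},
}


def shift_rate(frame):
    """Return (number of HH tokens, share of all tokens) for a sentence frame."""
    shifted = sum(HH_COUNTS[frame].values())
    return shifted, shifted / TOKENS_PER_FRAME


for frame in ("A", "B"):
    shifted, share = shift_rate(frame)
    print(f"frame ({frame}): {shifted}/{TOKENS_PER_FRAME} tokens heard as HH ({share:.0%})")
```

On these counts the accent change is more frequent in frame (B) than in frame (A), with the difference coming from speakers KK and TA.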

4.2. Speech rate and accent

Examples of the effect of speech rate on accent are illustrated in Figure 1, which shows examples for speaker TA of speech waves in the top panels and F0 contours in the bottom panels of /kusa/ with accent HL in the sentence frames (A) and (B). The examples of sentence frame (A) “Kore-wa kusa” (HHH HL) are on the left, and (B) “Tsugi-wa kusa” (HLL HL), on the right. The vertical broken lines show the end of the sentence frame. The tokens shown in (1)–(4) were produced at a natural rate, while (5)–(6) were produced at a very fast rate. The F0 contours of (1) and (2) show that the first vowels of /kusa/ are voiced, as indicated by the white arrows. The same vowels in (3) and (4) are devoiced; however, the first moras of the tokens are perceived to be accented because of the abrupt falling F0 contours on the second vowels (indicated by the black arrows). Although the F0 contours at the end of the carrier sentences in (5) and (6) are rising, suggesting that the following mora has high pitch, the tokens of /kusa/ were perceived as having a HH accent by three Osaka dialect speakers. The second vowels of /kusa/ in (5) and (6) are very short and their F0 contours have nearly level tones.

Figure 1. Speech waves and F0 contours of /kusa/ (HL) ‘grass’ in sentence frames, (A) “Kore-wa kusa”(HHH HL) ‘This is grass’ (1)(3)(5) and (B) “Tsugiwa kusa.”(HLL HL) ‘The next is grass’ (2)(4)(6). (1)(2): the first vowels voiced, accented. (3)(4): with devoiced accented vowels (natural speech). (5)(6): with accent perceived as HH (very fast speech). Broken lines: the beginning time points of the words “kusa” (speaker: TA).

An acoustic comparison of /kusa/ (HL) and /huta/ with accent HH also provides evidence to support the shift of /kusa/ from HL to HH in very fast speech.


Figure 2. Speech waves and F0 contours of /huta/ (HH) ‘lid’ in sentence frames, (A) “Kore-wa huta”(HHH HH) ‘This is a lid’ (1)(3)(5) and (B) “Tsugiwa huta.”(HLL HH) ‘The next is a lid’ (2)(4)(6). (1)(2): the first vowels voiced. (3)(4): the first vowels devoiced (natural speech). (5)(6): the first vowels devoiced (very fast speech). Broken lines: the beginning time points of the words “huta” (speaker: TA).

Figure 2 shows the F0 contours of the words /huta/ with HH accent in the sentence frames (A) and (B). The tokens in (1)–(4) were spoken at a natural rate, and those in (5)–(6) at a very fast rate by speaker TA. The F0 contours of (1) and (2) show that the first mora vowels are voiced, as indicated by white arrows. The first mora vowels of (3) and (4) are devoiced. The F0 contours of the second mora vowels are almost level. The second vowels of /huta/ in (5) and (6) of Figure 2 are similar to those of /kusa/ in (5) and (6) in Figure 1. All of them had level F0 contours, short durations, and were perceived as having HH accent.
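The contrast described here, an abruptly falling F0 on the second vowel for tokens heard as HL against a nearly level F0 for tokens heard as HH, can be illustrated with a least-squares slope. This is only a sketch: the threshold value, the function names and the sample contours are hypothetical and are not taken from the experiment, and actual perception also depends on factors such as vowel duration and F0 starting height.

```python
FALL_THRESHOLD = -0.2  # Hz per ms; hypothetical cut-off for an "abrupt" fall


def f0_slope(times_ms, f0_hz):
    """Least-squares slope of F0 over time, in Hz per millisecond."""
    n = len(times_ms)
    if n < 2 or n != len(f0_hz):
        raise ValueError("need at least two paired samples")
    mean_t = sum(times_ms) / n
    mean_f = sum(f0_hz) / n
    denom = sum((t - mean_t) ** 2 for t in times_ms)
    if denom == 0:
        raise ValueError("time points must not all be identical")
    numer = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_ms, f0_hz))
    return numer / denom


def perceived_pattern(times_ms, f0_hz):
    """Guess 'HL' for a steeply falling second vowel, otherwise 'HH'."""
    return "HL" if f0_slope(times_ms, f0_hz) <= FALL_THRESHOLD else "HH"


# Synthetic second-vowel contours, for illustration only.
times = [0, 20, 40, 60, 80]
falling = [260, 240, 220, 200, 180]
level = [200, 201, 200, 199, 200]

print(perceived_pattern(times, falling))
print(perceived_pattern(times, level))
```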

5. Results of the physiological experiments

This section uses the results of physiological experiments to investigate how Osaka accent patterns and devoiced accented vowels are produced in words like /kusa/, with a close-open vowel sequence, compared with words like /kusi/, with a close-close vowel sequence. We will also discuss the question of why accent change occurs in /kusa/ when it is spoken at a very fast rate.

5.1. Production of Osaka accent

Figure 3 shows the averaged EMG (electromyographic) activities of the CT (cricothyroid) (thick line) and SH (sternohyoid) muscles (thin line) in 12 repetitions of /imi/ spoken with four different accent patterns, HH, HL, L-HL, and LH, at a natural speech rate by a female Osaka dialect speaker, YI. The contours are superimposed on the same horizontal axis. Fundamental frequency contours were also averaged.

Activity of the CT, the muscle shown to be involved in F0-raising, begins before the starting time point (the vertical thick line) of words with High-starting accent (HH and HL). However, SH activity is seen to occur more than 200 msec preceding L-HL or LH (the Low-starting accents). SH activity has also been observed corresponding to the pitch fall on the second mora of the words with accent types HL and L-HL. The activity of SH is physiologically explained in that it is related not only to jaw opening and tongue back lowering, but also to voice lowering (Honda et al. 1999). Notice also that when H precedes L, as in the accent types HL or L-HL, the activity of CT is greater than in the accent types HH or LH. The EMG data show that the activity of CT is greater in HL, where H precedes L, than in HH, where the fundamental frequency is high throughout.
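Averaging over repetitions, as in Figure 3, presupposes that the recordings are aligned at a common reference point before the mean is taken. The sketch below shows one generic way to do this; it is not a description of the procedure actually used. The choice of line-up point (for instance the word onset), the window sizes, the function name and the sample values are all assumptions.

```python
def ensemble_average(trials, line_up, pre, post):
    """Average several recordings after aligning them at a line-up sample.

    trials  -- list of sample sequences (one per repetition)
    line_up -- index of the alignment point in each trial
    pre     -- number of samples to keep before the line-up point
    post    -- number of samples to keep from the line-up point onward
    """
    if not trials or len(trials) != len(line_up):
        raise ValueError("need one line-up index per trial")
    windows = []
    for samples, onset in zip(trials, line_up):
        start, stop = onset - pre, onset + post
        if start < 0 or stop > len(samples):
            raise ValueError("trial too short for the requested window")
        windows.append(samples[start:stop])
    # Point-by-point mean across the aligned windows.
    return [sum(column) / len(windows) for column in zip(*windows)]


# Two synthetic "repetitions" with different line-up points.
trials = [[0, 1, 2, 3, 4], [10, 11, 12, 13, 14, 15]]
print(ensemble_average(trials, line_up=[2, 3], pre=1, post=2))
```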


Figure 3. Averaged F0 contours, patterns of EMG (electromyographies), CT and SH, for twelve utterances of /imi/ with four accent types HH, HL, L-HL, and LH, and speech envelopes (subject: YI).

5.2. F0 contours of /kusi/ and /kusa/ with accent HL

When /kusi/ (HL) and /kusa/ (HL) are spoken at a very slow rate, the F0 contours of both words are not very different from each other. However, when they are spoken at a natural or fast rate, /kusi/ (with a close-close vowel sequence) and /kusa/ (with a close-open vowel sequence) have different F0 contours. Figure 4 shows twelve superimposed F0 contours of /kusi/ and /kusa/, respectively. Dotted lines were interpolated through the period of /s/ from the end of V1 to the start of V2. Speaker YI spoke at a natural speech rate during the physiological experiment. Here, the F0 contours of (1) /kusi/ and (2) /kusa/ are quite different from each other. In /kusi/ (1), the F0 contours begin to fall in the vicinity of the end of the first vowel, while in /kusa/ (2) they begin to fall at the beginning of the second vowel, as indicated by the black arrows. The initial vowel in /kusi/ is voiced, while that in /kusa/ is devoiced (except in one token whose second vowel starts a little lower compared with the other falling contours).

Figure 4. Superimposed F0 contours, (1) /kusi/ (HL) and (2) /kusa/ (HL), twelve utterances each. Dotted lines: interpolated through the period of /s/ from the end of V1 to the start of V2. Arrows: the starting time points of falling F0 contours (speaker: YI).

5.3. Production of different F0 contours in /kusi/ and /kusa/

Figure 5 shows averaged F0 contours, cricothyroid (CT) muscle activity, sternohyoid (SH) muscle activity, and the speech amplitude of (1) /kusi/ (close-close vowel sequences) and (2) /kusa/ (close-open vowel sequences) with devoiced, accented first vowels. The contours represent the average of 12 and 11 repetitions, respectively. The vertical thick lines mark the onset point of the second vowel.


Figure 5. Averaged F0 contours, EMG (electromyographies), CT and SH, and speech amplitudes of (1) /kusi/ for twelve utterances, and (2) /kusa/ for eleven utterances with devoiced accented vowels. Arrows: the starting time points of F0 fall (subject: YI).

(1) /kusi/ with accent HL: The F0 contour of the first vowel of /kusi/ is high while the second vowel starts with a relatively low frequency. CT activity begins prior to the onset of the first vowel, which presumably accounts for the high F0 of the first vowel. During the first vowel, only the CT is active and SH activity is almost absent. SH activity begins prior to the onset of the second vowel. The end of CT activity and the beginning of SH activity occur at the same time at the end of the first vowel, as indicated by the broken vertical line. Activity of SH is associated with the low F0 of the second vowel.

(2) /kusa/ with accent HL: This figure shows the F0 contour and the CT and SH patterns of the word with a devoiced accented vowel. The F0 contour of the vowel following the devoiced mora starts high and then drops sharply. It is notable that the CT peak (as indicated by the white arrow) is observed at the time it would occur if the first vowel were voiced, i.e. at the same time point as observed in /kusi/. This suggests that the command for raising F0 was input for the first vowel, even though it was devoiced. An additional peak of CT activity (where the small black arrow points) is also observed in /kusa/. Notice that the second CT peak begins before the high starting F0 of the second vowel /a/. In /kusi/, the onset of the second vowel /i/ has a low F0, and correspondingly, there is no second CT activity associated with the second vowel /i/. As for /kusa/, the onset of SH activity co-occurs with the second CT activity. The SH activity is associated with the F0 fall on the second vowel. All eleven utterances of the word /kusa/ showed a similar pattern.

5.4. Discussion of the results of the physiological experiments

Differences were found in the F0 contours of /kusa/ and /kusi/, which share the same HL accent. The second vowel /a/ of /kusa/ showed a much steeper F0 fall than the second vowel /i/ of /kusi/. The second peak of CT activity and the greater SH activity that followed it continued during /a/ of /kusa/. A possible explanation is that contraction of the SH is involved in hyoid-larynx lowering, jaw opening, and tongue backing, which are instrumental for F0 lowering (Honda et al. 1999). As for /kusa/ with the first vowel voiced in Figure 4 (2), F0 does not begin to fall at the end of the first vowel, but at the beginning of the second vowel. The F0 fall on the second vowel causes the first vowel to be perceived as accented. This result differs from what was discussed before, in which stress on the first vowel caused the first mora to be heard as accented. Moreover, the timing relationships between prosodic and segmental control differ according to vowel height (close-open vs. close-close) in these words (Sugito 2003). Speech rate also affects these differences.

CT activity was found for the accented first vowel even though it was devoiced. This observation suggests that the first vowel in /kusa/ is accented even though there is no voicing during the vowel. The second burst of CT activity and the SH activity that follows it may be related to the high starting F0 and the abrupt falling contour of the second vowel /a/ of /kusa/. However, when the words were pronounced very fast, the HL accent of /kusa/ changed to HH, as shown in Figure 1 (5) and (6). There may not have been enough time for the muscle commands to the SH to bring about an abrupt falling F0 contour in the second vowel. A natural or rather fast speech rate is necessary for the pronunciation of words with devoiced accented vowels. It may be the case that in very fast speech there is no SH activity associated with these second vowels. The vowels that follow devoiced accented vowels need to have adequate length to allow a falling F0 contour to occur.

6. Summary

This paper examined the acoustic and physiological characteristics of voicing and accent changes in Osaka dialect words at different speech rates. Individual differences were observed. When speakers spoke at a relatively fast rate, devoiced, accented vowels were produced more frequently. Moreover, at a very fast rate, the HL accent was often changed to an HH accent. Laryngeal activities for the devoiced, accented vowels in /kusa/ were compared with those for the voiced accented vowels in /kusi/. CT activity in devoiced accented /kusa/ was found to occur at the time it would have occurred if the vowels were voiced. This observation strongly suggests that the devoiced vowels were not only perceived as accented, but were also produced as accented. With regard to the laryngeal activity for vowels in the second mora of the words, the second peak of CT activity was associated with a high starting F0, and the following SH activity with a fall in F0. These joint activities may be involved in the resulting steep falling F0 contour following the devoiced accented vowels. The accent change found in very fast speech might be due to the short duration of the second vowels. That is, we might conjecture that since the vowels were short, there was no time for the SH to become active, and consequently, no F0 fall occurred on these short vowels spoken in very fast speech. We hope that additional physiological experiments with natural, fast, and very fast speech, using MRI, will provide further insight into how this accent change occurs.

Acknowledgements

The author would like to express her gratitude to Professors Donna Erickson, Raymond Weitzman, and Jeroen van de Weijer, who kindly provided comments on this paper.

Where voicing and accent meet: their function, interaction, and opacity problems in phonological prominence

Shin-ichi Tanaka

1. Introduction

This study rethinks the function and interaction of voicing and accent from the perspective of ‘prominence’ and tackles phonologically significant issues in their interaction. Our ultimate goal is to shed new light on their interaction within a general theory of prominence that involves the harmonic scale of accent, tone, sonority, and voicing, and to solve certain problems observed in the accentual phenomena on devoiced vowels of Japanese. Specifically, we are concerned with what happens when a vowel that should bear accent is in exactly the position that should be devoiced. This situation causes various problems because accent and devoicing are incompatible in principle but turn out to be sometimes compatible in the phonological grammar of Japanese.

Let us review the historical background and motivation of our study. There have been many phonetic studies on vowel devoicing and its relation to accent in Japanese, and some researchers in this field are contributing their recent findings to the “vowel voice” part of the present book (Sugito, this volume). Compared to the abundance of phonetic literature on this topic, little attention has been paid to a phonological account of what happens when an accented vowel is devoiced. Yet we can find some theoretical work in the metrical framework, such as Yamada (1990), Haraguchi (1991), Tanaka (1992), and Yokotani (1997), which agree, on the basis of the descriptive literature (NHK, ed. 1998; Akinaga, ed. 2001), that accent can either remain on a devoiced vowel or shift to an adjacent vowel. However, the optionality and directionality of accent shift are so complicated that derivational analyses such as those above are problematic in their empirical coverage and do not explain the cases of accent shift beyond metrical constituents, as Yokotani (1997) and Tanaka (2002a) point out.

Derivational accounts also pose the fundamental question as to how accent shift from a devoiced vowel is represented phonologically in the first place. Their usual assumption is that a devoiced vowel loses its capacity to act as an accent-bearer and that the loss of the capacity is expressed phonologically by deleting the syllable node of the devoiced vowel, which triggers accent shift to an adjacent landing site. In section 3.1, it will turn out that an accented devoiced vowel still has a syllable node, while an accent-losing devoiced vowel does not, even though they are equally devoiced. Even if it is true that a devoiced vowel loses its original syllable node in the case of accent shift, it is still unclear what happens to the floating devoiced vowel with its voiceless onset that has lost a syllable node: floating segments are erased by convention, but the devoiced vowel may not be deleted, because vowel devoicing is distinct from vowel deletion (e.g., sentakúki → sentáku̥ki̥ / sentákki̥ ‘washing machine’ and suizokúkan → suizóku̥kan / suizókkan ‘aquarium’). That is, the syllable node should be deleted to cause accent shift, but it should be preserved to make the distinction between vowel devoicing and vowel deletion, which creates a dilemma.

All of these problems follow from the fact that previous phonological accounts neither uncover the essential nature of accent and voicing nor make clear the exact phonological mechanism behind the optionality of devoiced accent and accent shift. A key to solving such problems lies in reconsidering accent and voicing from a much wider perspective before focusing on the specific issues of devoiced accent and accent shift. This is because, as we will show, accent and voicing equally count as phonological prominence, and a general theory of prominence that involves tone, length, and sonority as well as accent and voicing allows us to understand their fundamental nature.
Such a theory also serves to account for why accent so often interacts with tone, length, sonority, and voicing. There have been systematic studies on the relation between accent and one of these four even in the recent OT literature; for example, accent and tone in de Lacy (1999), accent and sonority in Kenstowicz (1994b) and de Lacy (2001), and accent and length (syllable quantity) in Prince & Smolensky (1993) and other work on quantity sensitivity. However, these studies do not aim at elucidating the issues in the general context of prominence. Hayes (1995), by contrast, is the first to attempt an integrated theory of syllable prominence in which accent placement is sensitive to tone, length, and sonority. Unfortunately, however, his theory does not make clear the mutual relations between any two of the concepts of tone, length, sonority, and voicing, but only pays attention to the sensitivity of accent to the four elements of prominence.
Furthermore, it lacks restrictiveness in that the relation of accent to length can be represented either in metrical grids or in prominence grids. Thus, a comprehensive and coherent theory is necessary that can account for the whole system of the relations and interactions among the elements of prominence. To develop such a theory and tackle the specific problems with accent and voicing, we will take the following steps. First, in section 2.1, we will show the general schema of our theory and demonstrate how it works with the harmonic scale of phonological prominence, where the overall interrelations among the elements of prominence will be made clear. Section 2.2 will focus on our particular interest, i.e., the interaction between accent and voicing, and we will realize the exact nature of their interaction: the harmonically-complete relation (implicational markedness) between them. We will observe, however, that the harmonic completeness in this implicational relation does not hold when we take the accentual phenomena of devoiced vowels in Japanese into consideration, which will be discussed in section 3.1. Then, we will show that they even raise an opacity problem in phonology, which is so serious that we cannot find any solution in the derivational framework. Instead, as we will argue in section 3.2, the notion of sympathy in Optimality Theory can give a very simple account of the phenomena, and the problems of harmonic incompleteness and opacity can be resolved within that framework in a fairly convincing way.

2. A general theory of prominence

2.1. Completeness in the harmonic scale of prominence

As Hayes (1995: 271) notes, “[h]eavy syllables, or syllables with high tone, or syllables with low vowels, and so on, tend to sound louder than other syllables.” We can also add voicing to this category, because vowels in the syllable nucleus are usually voiced, and syllables with voiced vowels do sound louder than syllables with voiceless vowels, as we see by comparing normal speech with whisper. Here, what sounds louder in production is at the same time perceptually more salient, so phonological elements such as tone, length, sonority, and voicing can be said to serve as ‘prominence’ just like accent. Distinct from these is another type of prominence that may be called ‘positional prominence’, found in syllable onsets, root-initial syllables, word-initial syllables, etc. (J. Beckman 1998, de Lacy 2001, among others), but our particular interest here is in phonetically-driven prominence or ‘inherent prominence’, which corresponds to a specific articulatory means to improve perceptual salience: for example, voicing corresponds to vibration of the vocal cords, sonority to aperture, tone to pitch, etc. Positional prominence does not involve its own specific articulatory resource. To make the discussion clearer, let us consider the phonetic factors or resources of the elements of prominence by using the chart in (1):

(1)

Articulatory resources of phonological prominence

voicing      sonority     tone, length       accent
---------    ---------    ---------------    ---------------
vibration    vibration    vibration          vibration
             aperture     aperture           aperture
                          pitch, duration    pitch, duration
                                             intensity

least prominent           loudness           most prominent

As is indicated in (1), voicing is the least prominent element, sonority is the next, tone and length are equally more prominent than the two, and the most prominent is accent. This is because more phonetic factors (or more articulatory effort) accumulate in a more prominent element. For example, voicing involves only vibration of the vocal cords, and sonority concerns both vibration and aperture. Furthermore, tone includes high pitch with vibration and aperture, and length refers to the duration of both. Finally, accent (including stress) involves intensity as well as the four articulatory resources, and all of these factors contribute to loudness (here, I use the term ‘accent’ to mean both stress accent and pitch accent by definition).

The idea of prominence has several advantages and elucidates the fundamental characteristics of the elements of prominence. First, it correctly captures the implicational relation among the prominence elements: sonorant segments are voiced segments but not vice versa (e.g., voiced obstruents); high-toned segments are always sonorants, i.e., vowels or sonorant consonants, but it is not always the case that sonorants bear high tone; and accented vowels have higher pitch (i.e., high tone) and longer duration, but high-toned vowels or long vowels do not always have accent. Second, this implicational relation also captures typological differences in prosody: the intensity of accent in a word does not matter in true tone languages such as Chinese, but pitch is important in both pitch-accent languages like Japanese and stress-accent languages like English (the difference between the two languages lies in whether pitch gets phonologized as tone or remains somewhat phonetic as tune in the realization of pitch contour). Third, the chart in (1) accounts for what Hayes (1995: 7) calls ‘the parasitic nature of stress,’ which refers to the fact that stress parasitically invokes phonetic resources that serve other phonological ends. This point is clear from (1), because in accent or stress, all the articulatory resources are put together to realize loudness in speech production.

The phonetic characterization of prominence elements in (1) can be represented phonologically as the Harmonic Scale of Prominence in (2), where A ⊃ B indicates that B is a proper subset of A, and A > B means A is more prominent than B:

(2)

Harmonic Scale of Prominence
a. voicing ⊃ sonority ⊃ tone ⊃ accent
b. accent > tone > sonority > voicing
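The implicational reading of the scale in (2) can be made concrete with a small sketch. It assumes, purely for illustration, that a segment is described by the set of prominence elements it carries; the names `SCALE`, `implied_by`, and `is_harmonically_complete` are hypothetical and not part of the original analysis.

```python
# Prominence elements ordered from least to most prominent, as in (2b).
SCALE = ["voicing", "sonority", "tone", "accent"]


def implied_by(element):
    """Return the elements that `element` presupposes under (2a)."""
    return SCALE[:SCALE.index(element)]


def is_harmonically_complete(properties):
    """True if every element present is accompanied by all lower elements."""
    present = set(properties)
    return all(lower in present
               for element in present
               for lower in implied_by(element))


# A voiced obstruent: voicing only (complete).
print(is_harmonically_complete({"voicing"}))
# An accented voiced vowel: all four elements (complete).
print(is_harmonically_complete({"voicing", "sonority", "tone", "accent"}))
# An accented devoiced vowel: accent without voicing (incomplete).
print(is_harmonically_complete({"sonority", "tone", "accent"}))
```

Under this encoding, an accented devoiced vowel is exactly the kind of configuration that the scale, read as a strict implication, does not allow.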

(2a) shows a harmonically-complete system of prominence where an element always implies the existence of any element(s) to the left. In particular, accent presupposes the existence of all the elements in the scale, so accent often interacts with tone, sonority, and voicing in realizing prominence, as will be discussed below. Note here that length is not incorporated into this phonological system. This is because an accented vowel is indeed phonetically longer than an unaccented one but is not necessarily a long vowel in phonology, although a long vowel tends to attract accent in quantity-sensitive languages. The aspect of quantity sensitivity is captured by the constraint of WEIGHT-TO-STRESS or, more strictly speaking, PEAK-PROMINENCE (Prince & Smolensky 1993), and syllable quantity is a different concept from syllable prominence (Hayes 1995: 270–273). So, in what follows, we will just consider syllable prominence in line with the scale in (2) and exclude syllable quantity (i.e., length) from our discussion.1

Now let us look at the interplay of accent with the other prominence elements. As shown in (2), accent implies tone, sonority, and voicing, and there is a good possibility that accent has an effect on, or is influenced by, these elements in pitch-accent and stress-accent languages, where accent works together with the other prominence elements in order to enhance syllable prominence or the culminativity of a word. It is a kind of conspiracy effect of prominence elements. Such cases are classified into accent-conditioned prominence and prominence-conditioned accent (Tanaka 2005): in the former case, the behavior of tone, sonority, and voicing is sensitive to accent position, whereas in the latter, accent placement is sensitive to the other elements. Since accent is the most dominant element in the scale of prominence, the former case is generally easier to find than the latter. This dominance relation is related to the prosodic hierarchy: voicing is a segmental property, sonority and tone are properties in the domain of the syllable, and accent is characterized by the upper prosodic category, the foot.2 For example, in various pitch-accent languages, including Japanese, tone patterns are determined by connecting the accented mora to the high tone, as in (3a), but there are only a few languages where accent placement relies on the tone patterns, as in (3b):

(3)

Interaction of accent and tone
a. Accent-conditioned tone (Japanese)
   kokóro ‘heart’                 L H L
   kámakiri ‘mantis’              H L L L
   habúrasi ‘toothbrush’          L H L L
   otootó ‘little brother’        L H H H
   niwakaáme ‘sudden rainfall’    L H H H L
b. Tone-conditioned accent (Lithuanian, from Halle & Vergnaud 1987)
   víiras ‘man’                   HL L
   Víislas ‘Vistula’              HL L
   viínas ‘wine’                  LH L
   viíksmas ‘course’              LH L
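The Japanese patterns in (3a) can be generated mechanically from the position of the accent, following the procedure attributed to Haraguchi (1991): the accented mora is linked to H, preceding moras are H, following moras are L, and an unaccented initial mora is always L. The sketch below is only an illustration of that statement; the function name `tone_pattern` and the 0-based mora index are assumptions, and unaccented words are not covered.

```python
def tone_pattern(n_moras, accent):
    """Tones for a word of n_moras whose accent is on mora `accent` (0-based)."""
    if not 0 <= accent < n_moras:
        raise ValueError("accent must index a mora of the word")
    tones = []
    for i in range(n_moras):
        if i > accent:
            tones.append("L")   # moras after the accented mora
        else:
            tones.append("H")   # the accented mora and those before it
    if accent != 0:
        tones[0] = "L"          # an unaccented initial mora is always L
    return tones


# kokóro (3 moras, accent on the 2nd), niwakaáme (5 moras, accent on the 4th)
print(" ".join(tone_pattern(3, 1)))
print(" ".join(tone_pattern(5, 3)))
```

The five words in (3a) correspond to the calls `(3, 1)`, `(4, 0)`, `(4, 1)`, `(4, 3)`, and `(5, 3)`.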

The tone patterns in Japanese are derived by linking accent to H, and then the preceding and following moras are assigned H and L, respectively, with the proviso that the unaccented initial mora is always L (Haraguchi 1991). On the other hand, a long vowel in Lithuanian may either have acute (HL) or circumflex (LH) tone, and accent falls on the first mora linked to H (Hayes 1995). In both languages, accent and tone cooperate to highlight prominence in a word. This conspiracy effect is also seen in (4), where accent and sonority agree in prominent position:

(4) Interaction of accent and sonority
a. Accent-conditioned sonority (English)
   Japán [æ] / Japanése [ə]
   cónduct [ɑ] / condúctive [ə]
   Ítaly [ə] / Itálian [æ]
   áccident [ə] / accidéntal [e]
b. Sonority-conditioned accent (Winnebago, from Susman 1943)
   hiáira ‘more’            aaháia ‘a deerskin’
   hagoréia ‘sometimes’     gipiésge ‘enjoyable’
   heriánaga ‘he was and’   haguáhi ‘go to get’

(4a) shows that accent loss causes vowel reduction and, conversely, accent acquirement turns schwas to full vowels; that is, sonority is based on accent placement.3 (4b) is the opposite case, where accent placement is based on vowel sonority. Winnebago, a Siouan language, has the sonority hierarchy of a > o > u > e > i; when accent is assigned on a diphthong, it falls on the more sonorous vowel (Susman 1943). Although Hayes (1995: 15) reports a few other cases of sonority-conditioned accent, they are relatively restricted in number. The fundamental characteristics of such interactions as in (3) and (4) are also seen in the case of accent and voicing. In the next section, we will present our main concern, accent and voicing, on the basis of the harmonic scale in (2).
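The Winnebago generalization just stated (accent on the more sonorous member of a diphthong, given a > o > u > e > i) can be sketched as a simple lookup. The numeric ranks and the name `accented_member` are illustrative assumptions; only the ordering of the vowels comes from the text.

```python
# Winnebago sonority hierarchy reported in the text: a > o > u > e > i.
# The numeric values are arbitrary; only their order matters.
SONORITY = {"a": 5, "o": 4, "u": 3, "e": 2, "i": 1}


def accented_member(diphthong):
    """Return the index (0 or 1) of the more sonorous vowel of a diphthong."""
    first, second = diphthong
    if SONORITY[first] == SONORITY[second]:
        raise ValueError("identical vowels: the hierarchy does not decide")
    return 0 if SONORITY[first] > SONORITY[second] else 1


# 'ie' as in gipiésge, 'ei' as in hagoréia
print(accented_member("ie"))
print(accented_member("ei"))
```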

2.2. Interaction of accent and voicing

Let us briefly review the findings of the previous section. A crucial point is that the prominence scale in (2) is a harmonically-complete system in which implicational relations always hold among the elements of prominence in phonetic, phonological, or typological respects. As for accent and voicing, it is predicted from the prominence scale that i) accent implies voicing but not vice versa, ii) in pitch-accent or stress-accent languages, there may be a conspiracy effect in which accent and voicing cooperate to enhance syllable prominence, and iii) accent-conditioned voicing is more dominant than voicing-conditioned accent. The prediction in i) is quite natural, since there is no doubt that accent always falls on a vowel within a word and a vowel is a voiced and sonorant segment, while it is not necessarily the case that any vowel bears accent in a word. As for the predictions in ii) and iii), accent-related voicing phenomena are given in (5) and (6), which are accent-conditioned voicing and voicing-conditioned accent, respectively. Examples are taken from English and Japanese to show that both stress-accent and pitch-accent languages exhibit the phenomena in accordance with ii).

(5)

Accent-conditioned voicing
a. /ks/-Voicing (English)
   éxecute [ks] / exécutive [gz]     exhíbit [gz] / exhibítion [ks]
b. /s/-Voicing (English)
   tránsit [s] / transítion [z]      prósody [s] / prosódic [z]
c. Blocking of /l/-Devoicing (English, from Hayes 1995)
   Íceland [l̥] / Icelándic [l]       áthlete [l̥] / athlétic [l]
d. Blocking of high vowel devoicing (Japanese)
   hísu̥ ‘hysteria’   písu̥ ‘piss’    híhi̥ ‘baboon’   súsu̥ ‘soot’   syúsi̥ ‘seed’
   ku̥tú ‘shoes’      ku̥sí ‘comb’    ki̥kú ‘daisy’    tu̥tí ‘soil’   hu̥kú ‘clothes’

(6) Voicing-conditioned accent (Sino-Japanese, from Tanaka 2002a)4
   bú-ka ‘subordinate’   vs.   *hú-ka / hu̥-ká ‘failure’
   zí-ken ‘affair’       vs.   *sí-ken / si̥-kén ‘exam’
   kí-zoku̥ ‘noble’       vs.   *kí-soku̥ / ki̥-sóku̥ ‘rule’
   zí-ki̥ ‘period’        vs.   *sí-ki̥ / si̥-kí ‘four seasons’
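The devoicing environment behind (5d) can be sketched as follows: a high vowel /i, u/ devoices after a voiceless onset when the next onset is voiceless or the vowel is word-final, unless the vowel is accented or the preceding syllable is already devoiced. This is a deliberately simplified illustration. The syllable encoding, the set of voiceless onset letters, the left-to-right scan, and the categorical blocking by accent are all assumptions of the sketch, and codas such as the moraic nasal are ignored.

```python
from dataclasses import dataclass

VOICELESS = set("kstph")     # first letter of a voiceless onset (assumed romanization)
HIGH_VOWELS = {"i", "u"}


@dataclass
class Syllable:
    onset: str
    vowel: str
    accented: bool = False


def devoiced_positions(word):
    """Indices of syllables whose vowel is devoiced, scanning left to right."""
    result = []
    for i, syl in enumerate(word):
        voiceless_onset = syl.onset[:1] in VOICELESS
        if i + 1 < len(word):
            right_ok = word[i + 1].onset[:1] in VOICELESS
        else:
            right_ok = True                      # word-final position
        if (syl.vowel in HIGH_VOWELS and voiceless_onset and right_ok
                and not syl.accented             # accent blocks devoicing, as in (5d)
                and (i - 1) not in result):      # no devoicing on consecutive syllables
            result.append(i)
    return result


hisu = [Syllable("h", "i", accented=True), Syllable("s", "u")]   # hísu̥
kutu = [Syllable("k", "u"), Syllable("t", "u", accented=True)]   # ku̥tú
print(devoiced_positions(hisu), devoiced_positions(kutu))
```

Because accent blocks devoicing categorically here, the sketch models only the pattern in (5d), not cases where accent and devoicing fall on the same vowel.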

Voicing in (5a, b) applies when the syllable in question acquires accent, just like in Verner’s Law (Halle 2003 and references therein), so this is clearly a case of the conspiracy of accent and voicing. In (5c), /l/ is devoiced when it is located between /s, θ/ and a vowel, but devoicing is blocked when the following vowel is accented, which means that the accented vowel should be prominent together with the preceding voiced consonants just like (5a, b).5 Blocking of devoicing in the presence of accent is also seen in Japanese. As in ki̥setu̥ ‘season,’ ki̥soku̥ ‘rule,’ and si̥setu̥ ‘facilities,’ the high vowels /i, u/, which have the lowest sonority among vowels, usually undergo devoicing when flanked by voiceless consonants or preceded by a voiceless consonant in word-final position. However, the words in (5d), whose vowels are both devoiceable in principle, show that accent bans its application (devoicing does not occur on consecutive syllables, as we will argue in section 3.2). By contrast, the examples in (6) are cases where accent position is controlled by the application of devoicing: Sino-Japanese should have accent immediately before the final morpheme, as in bú-ka ‘subordinate,’ zí-ken ‘affair,’ etc., but for the words in the right-hand column, devoicing must apply and accent automatically shifts to the adjacent syllable. Note that the blocking effect of devoicing is also seen on the landing site of the final example si̥-kí ‘four seasons.’6 This is because adjacent syllables are usually not devoiceable, as stated above. What (5) and (6) have in common is that accent and voicing conspire to maximize syllable prominence. In particular, the presence/absence of accent and voicing must target the same vowel, because accent implies voicing in the harmonically-complete system of prominence.

Finally, the following are interesting cases of the interaction between accent and voicing, where compound accent and Rendaku voicing are in complementary distribution and the presence of one of them is necessary and sufficient for word prominence (Zamma, this volume, also discusses this point):

(7)

Names of islands with sima ‘island’, from Tanaka (2003a, 2005)7
a. Rendaku without accent
   sakura-zima   miyako-zima   isigaki-zima   iriomote-zima   iou-zima
b. Accent without Rendaku
   itukú-sima    syoudó-sima   awazí-sima     tanegá-sima     okinó-sima

(8) Personal names with saburou ‘the third son’, from Haraguchi (2002)8
a. Rendaku without accent
   nin-zaburou   ken-zaburou   dai-zaburou    tyou-zaburou
b. Accent without Rendaku
   yo-sáburou    ki-sáburou    tama-sáburou   tomi-sáburou

This complementary distribution of accent and voicing can be given a plausible account in our prominence theory. The domain of prominence in these cases is the whole word, not the syllable as in the previous cases, and one word prominence is necessary and sufficient as the basic nature of culminativity of a word. It may be the case that either of them functions as the prominence that marks the boundary of a compound.

3. The optionality of devoiced accent and accent shift

3.1. Incompleteness, opacity, and problems in derivational theory

We have seen in the previous section that accent shift in (6) applies so as to avoid the devoiced vowels, because accent must agree with voicing according to the implicational relation in the prominence scale. In that sense, accent shift is a kind of repair strategy for upholding harmonic completeness in prominence. Actually, however, it is a well-known fact that accent can remain on a devoiced vowel as well as shift to the adjacent vowels (NHK, ed. 1998; Akinaga, ed. 2001). Moreover, a careful investigation of various data makes us realize that the movement and directionality of accent are very complicated, as is clear from the examples in (9):

(9)

Optionality and directionality of accent shift
a. Rightward shift across feet
   (kí̥)-(sya) / (ki̥)-(syá) ‘reporter’
   (kí̥)-(so) / (ki̥)-(só) ‘basics’
   (sí̥)-(ken) / (si̥)-(kén) ‘exam’
   (kí̥)-(kai) / (ki̥)-(kái) ‘machine’
   (kí̥)-(soku̥) / (ki̥)-(sóku̥) ‘rule’
   (bou)(sí̥)-(kake) / (bou)(si̥)-(káke) ‘hat rack’
b. Leftward shift across feet
   (nana)-(hú̥)(sigi) / (naná)-(hu̥)(sigi) ‘seven wonders’
   (nana)-(hí̥)(kari) / (naná)-(hi̥)(kari) ‘seven lights (influence of parents)’
c. Leftward shift within a foot
   (bi)(zyutú̥)-(kan) / (bi)(zyútu̥)-(kan) ‘art museum’
   (dai)(gakú̥)-(sei) / (dai)(gáku̥)-(sei) ‘university student’
d. Rightward shift within a foot
   (sita)-(kú̥ti)(biru) / (sita)-(ku̥tí)(biru) ‘lower lip’
   (syoku)(mu)-(sí̥tu)(mon) / (syoku)(mu)-(si̥tú)(mon) ‘police check up’
e. Adjacent devoiceable syllables
   (sí̥)-(ki) / (si̥)-(kí) ‘four seasons’
   (kí̥)-(ti) / (ki̥)-(tí) ‘base’
   (hú̥)-(ki) / (hu̥)-(kí) ‘no return’
   (syú̥)-(ki) / (syu̥)-(kí) ‘alcoholic smell’

Here, feet are already assigned to each word for expository purposes, following the analysis in Tanaka (2001, 2002b): accent is placed by constructing bimoraic feet from right to left without crossing morpheme boundaries, and it basically falls on the penultimate foot of each word.
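The footing procedure just summarized (bimoraic feet built from right to left, never crossing a morpheme boundary, with accent basically on the penultimate foot) can be sketched as below. The mora lists and the morpheme segmentation are supplied by hand and are assumptions of this illustration, as are the names `build_feet` and `accented_foot`. The sketch is not claimed to reproduce every footing in (9), and it does not decide which mora inside the foot carries the accent.

```python
def build_feet(morphemes):
    """Group moras into bimoraic feet, right to left, within each morpheme."""
    feet = []
    for moras in morphemes:
        morpheme_feet = []
        end = len(moras)
        while end > 0:
            start = max(0, end - 2)              # take up to two moras from the right
            morpheme_feet.insert(0, moras[start:end])
            end = start
        feet.extend(morpheme_feet)               # feet never span two morphemes
    return feet


def accented_foot(morphemes):
    """Return the penultimate foot, where accent basically falls."""
    feet = build_feet(morphemes)
    if len(feet) < 2:
        raise ValueError("the sketch needs at least two feet")
    return feet[-2]


# ki-soku 'rule' and bi-zyutu-kan 'art museum' (segmentation assumed)
print(build_feet([["ki"], ["so", "ku"]]), accented_foot([["ki"], ["so", "ku"]]))
print(accented_foot([["bi"], ["zyu", "tu"], ["ka", "n"]]))
```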

What is crucial is the very fact that the non-shifted variants allow accent to fall on devoiced vowels, a harmonically-incomplete behavior of accent and voicing. For that matter, vowel devoicing itself is a very strange phenomenon in the first place, since sonorants, including vowels, are preferably voiced in phonology, which is another aspect of harmonic completeness in prominence (see also note 1). There might be a possibility that devoicing is a phonetic rule outside the grammar, but we do not adopt this idea because, as we will see below, devoicing and its correlation to accent can be phonologized in a constraint-based grammar. Phonetically, pitch cannot be implemented without the vibration of the vocal cords, and yet the fact that accent stands in the devoiced environment clearly shows that phonetics and phonology are different. In fact, accent should be an abstract entity.

This incompleteness suggests that the Harmonic Scale of Prominence in (2) and its related constraints may be outranked by the system of compound accent. In other words, compound accent may be respected at the cost of harmonic completeness, or otherwise accent shift applies by giving priority to harmonic prominence over compound accent. We will put forward such an account in the next section.

In addition to harmonic incompleteness, the devoiced accent also poses another problem for phonological analysis, viz., opacity. More exactly, harmonic incompleteness may stem from the opacity concerned. As illustrated in (10a), in derivational terms, accent-shifted forms are obtained in the feeding order of devoicing and accent shift, with compound accent preceding them. The resulting forms are transparent:

(10) Non-surface-true opacity
a. Feeding Order
   /kí-sóku/        /bízyutu-kán/         Rules
   (kí)(soku)       (bi)(zyutú)(kan)      Compound Accent
   (kí̥)(soku̥)       (bi)(zyutú̥)(kan)      Devoicing
   (ki̥)(sóku̥)       (bi)(zyútu̥)(kan)      Accent Shift
   transparent
b. Counter-Feeding Order
   /kí-sóku/        /bízyutu-kán/         Rules
   (kí)(soku)       (bi)(zyutú)(kan)      Compound Accent
   n/a              n/a                   Accent Shift
   (kí̥)(soku̥)       (bi)(zyutú̥)(kan)      Devoicing
   opaque
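The two rule orders in (10) can be simulated with a toy derivation for /ki-soku/. Everything about the encoding is an assumption made for illustration: devoiceable positions and the landing site of the shift are given by hand, compound accent is taken as already applied, and in the printed forms `'` marks accent and `*` marks a devoiced vowel.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Syl:
    text: str
    devoiceable: bool = False   # assumed given: high vowel in a devoicing environment
    devoiced: bool = False
    accented: bool = False


def devoicing(word):
    """Devoice every syllable marked as devoiceable."""
    return [replace(s, devoiced=s.devoiced or s.devoiceable) for s in word]


def accent_shift(word, landing=1):
    """Move accent to `landing` only if it currently sits on a devoiced vowel."""
    if not any(s.accented and s.devoiced for s in word):
        return list(word)                        # no trigger: the rule does not apply
    return [replace(s, accented=(i == landing)) for i, s in enumerate(word)]


def derive(word, rules):
    """Apply the rules one after another, in the given order."""
    for rule in rules:
        word = rule(word)
    return word


def show(word):
    return "".join(s.text + ("'" if s.accented else "") + ("*" if s.devoiced else "")
                   for s in word)


# Output of Compound Accent for 'rule': accent on ki; ki and ku are devoiceable.
KISOKU = [Syl("ki", devoiceable=True, accented=True), Syl("so"), Syl("ku", devoiceable=True)]

print(show(derive(KISOKU, [devoicing, accent_shift])))   # feeding order
print(show(derive(KISOKU, [accent_shift, devoicing])))   # counter-feeding order
```

In the second order the shift rule finds no devoiced accented vowel when it applies, so the later devoicing leaves accent on a devoiced vowel, which is the opaque outcome.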

On the other hand, non-accent-shifted forms are derived in the counter-feeding order of accent shift and devoicing. In this case, accent shift does not apply because there is no trigger or environment there. However, devoicing applies after that, so the resulting forms exhibit non-surface-true opacity, or underapplication of accent shift. We have seen in (9) that both transparent and opaque outputs are acceptable in the Japanese accentual grammar. So not only is some theory necessary to obtain opaque outputs, but the optionality of accent shift must also be accounted for within that theory.

Unfortunately, derivational theory does not serve to do the job. The opaque forms (kí̥)(soku̥) and (bi)(zyutú̥)(kan) in (10b) would be difficult to obtain, precisely because a condition that bans accent on devoiced vowels, which is a trigger of accent shift in (10a), should not be violated in derivational theory. Recall that in derivational theory, constraints are universal and inviolable in any case. More crucial cases involve accent shift across constituent boundaries. Consider the following examples, where the vowel in question loses its syllable as a consequence of devoicing, because accent-bearing elements are syllables in Japanese:9

(11) Accent shift under derivational theory
a. Leftward shift
   ( *) (*) (. *) <(*)>   bi zyutu̥ kan
   ( *) (*) (*) <(*)>     bi zyutu̥ kan
b. Rightward shift
   (*) (*) (*) <(. *)> (*) <(. *)>   ? ki̥ soku̥   ki̥ soku̥
c. Cancellation of extrametricality
   (*) ( *) (*)(. *) (. *)   ki̥ soku̥   *ki̥ soku̥
d. Leftward shift
   (. *) (. *) (. *) (*) <(. *)> (. *) (*)<(. *)>   ? nana hi̥ kari   nana hi̥ kari

(11a) shows how leftward shift occurs after devoicing. We can obtain the correct form by deleting the syllable on the devoiced vowel. But a problem occurs with the rightward shift in (11b): the head of the foot cannot go out of the domain after syllable deletion, and even worse, the final foot is invisible, so accent shift is not predicted. Even if extrametricality were canceled before the application of devoicing and accent shift, as in (11c), the landing site of accent shift would be wrong and an ungrammatical form would be derived. In the same way, leftward shift across constituent boundaries is not accounted for, as (11d) shows.

Another fundamental question arises when we take into account the fact that the accent-preserving vowel -tú̥- in the left column of (11a) is still dominated by a syllable node but its accent-losing counterpart in the right column of (11a) is not, even though they are equally devoiced. This situation is very puzzling. Moreover, it is unclear what happens to the floating devoiced vowel that has lost its syllable node: floating segments are erased by convention, but the devoiced vowel may not be deleted, because vowel devoicing is distinct from vowel deletion, as in sentakúki → sentáku̥ki̥ / sentákki̥ ‘washing machine’ and suizokúkan → suizóku̥kan / suizókkan ‘aquarium’ (cf. Kondo, this volume). In short, the syllable node should be deleted to cause accent shift, but it should be preserved to make the distinction between vowel devoicing and vowel deletion, which leads to a paradoxical situation.

3.2. Sympathy in Optimality Theory

Now let us give a constraint-based account of devoiced accent and optional accent shift in OT. Following Tanaka (2002a), we assume the following constraints, ranked in the given order:

(12) Constraints for devoicing and accentuation (see note 10)
a. OCP (devoice): Syllables with devoiced vowels are not adjacent.
b. *C̥VC̥ and *C̥V#: A high vowel is devoiced between voiceless consonants or when preceded by a voiceless consonant in word-final position.
c. IDENT (voice): Voicing values remain identical between input and output.
d. *V̥́: Accent is not permitted on devoiced vowels.
e. NON-FINALITY (μ́, σ́, F́): The accented mora, syllable, or foot must not be final in PrWd.
f. ALIGN-R (PrWd, σ́): The right edge of any PrWd is aligned with the right edge of an accented syllable (see note 11).

(12d) is a specific constraint that reflects the Harmonic Scale of Prominence discussed earlier (we will not enter into a strictly formal analysis of the whole scale; see note 1 on this topic). Tanaka (2002a) demonstrates that the constraints in (12) are sufficient to capture the complicated directionality of accent shift in (9). However, another mechanism is necessary to account for the optionality in (9) and to obtain the opaque forms with devoiced accent as well as the transparent ones with accent shift. (13) is such a mechanism, where we follow McCarthy (1999, 2003a) in assuming the notion of sympathy in grammar (see note 12):

(13) Sympathy-related constraints
a. Selector: IDENT (voice)
b. Sympathy constraint: MAX-O (accent)

There is only one faithfulness constraint in the ranking proposed in (12), namely IDENT (voice); thus, this is the selector, as in (13a). Moreover, since we are concerned with accent shift as a consequence of devoicing, the sympathy constraint is MAX-O (accent), as in (13b), which monitors the similarity in accent between the sympathetic candidate and other candidates. This constraint assigns one violation mark to accent shift but two violation marks to accent deletion, since the latter is more serious than the former. Given the constraints in (12) and (13), non-surface-true opacity can be accounted for straightforwardly, as shown in (14):


(14) Sympathy and free ranking
[Tableaux (14a–e): the cell-by-cell violation marks are not recoverable. Constraint columns, left to right: OCP; *C̥VC̥, *C̥V#; IDENT (voice); MAX-O (accent); *V̥́; NONFIN; ALIGN-R. Inputs and candidates are listed below.]
a. /kí+sóku/: (kí)+(soku), (kí)+(soku̥), (kí̥)+(soku̥), () (ki̥)+(sóku̥), (ki̥)+(soku̥)
b. /nána+husigi/: (nana)+(hú)(sigi), (nana)+(hú̥)(sigi), () (naná)+(hu̥)(sigi), (nana)+(hu̥)(sígi), (nana)+(hu̥)(sigi)
c. /bízyutu+kán/: (bi)(zyutú)+(kan), (bi)(zyutu̥)+(kán), (bi)(zyutú̥)+(kan), () (bi)(zyútu̥)+(kan), (bi)(zyutu̥)+(kan)
d. /sitá+kutibiru/: (sita)+(kúti)(biru), (si̥tá)+(ku̥ti)(biru), (si̥ta)+(ku̥ti)(biru), (si̥ta)+(kú̥ti)(biru), () (si̥ta)+(ku̥tí)(biru)
e. /sí+kí/: (sí)+(ki), (sí)+(ki̥), (sí̥)+(ki), () (si̥)+(kí), (si̥)+(ki), (sí̥)+(ki̥), (si̥)+(kí̥)

Here, all the non-shifted forms with devoicing are correctly evaluated as optimal. More crucially, note that if we rerank *V̥́ over MAX-O (accent) in (14), as shown by the dotted lines, then the accent-shifted (i.e. transparent) form of each example is uniformly selected as optimal. So the two constraints must be freely ranked (Anttila 1997, 2002; Anttila and Cho 1998). This is the reason why both transparency and non-surface-true opacity are observed in the accentual system of Japanese. However, in the younger generation, devoiced but non-shifted forms are becoming more and more common (Akinaga 1998: 220), which means that devoicing tends to apply obligatorily without any accent shift. This clearly shows the recent establishment of the ranking in (14), with MAX-O (accent) dominating *V̥́.
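The effect of freely ranking MAX-O (accent) and *V̥́ can be illustrated with a minimal evaluation sketch in Python. It is a simplification under stated assumptions rather than a full implementation of the analysis: only four candidates for the kisoku example are considered; OCP, NON-FINALITY, and ALIGN-R are left out on the assumption that they do not decide among these candidates; the label `DEVOICE` stands in for *C̥VC̥ and *C̥V#; and the function names are hypothetical. The MAX-O (accent) marks follow the text (one for accent shift, two for accent deletion), while the marks for the other constraints are inferred from the constraint definitions in (12).

```python
from itertools import permutations
from typing import Dict, Iterable, List, Set

# Assumed violation profiles (a reduced version of the kisoku tableau).
#   faithful:    no devoicing, accent in place
#   non-shifted: both high vowels devoiced, accent stays on the devoiced vowel
#   shifted:     both high vowels devoiced, accent moved to the next syllable
#   deleted:     both high vowels devoiced, accent removed
CANDIDATES: Dict[str, Dict[str, int]] = {
    "faithful":    {"DEVOICE": 2, "IDENT(voice)": 0, "MAX-O(accent)": 0, "*AccentOnDevoiced": 0},
    "non-shifted": {"DEVOICE": 0, "IDENT(voice)": 2, "MAX-O(accent)": 0, "*AccentOnDevoiced": 1},
    "shifted":     {"DEVOICE": 0, "IDENT(voice)": 2, "MAX-O(accent)": 1, "*AccentOnDevoiced": 0},
    "deleted":     {"DEVOICE": 0, "IDENT(voice)": 2, "MAX-O(accent)": 2, "*AccentOnDevoiced": 0},
}


def evaluate(candidates: Dict[str, Dict[str, int]], ranking: List[str]) -> str:
    """Return the optimal candidate under strict domination: violation
    vectors are compared lexicographically in ranking order."""
    return min(candidates, key=lambda c: tuple(candidates[c].get(k, 0) for k in ranking))


def free_ranking_winners(
    candidates: Dict[str, Dict[str, int]],
    fixed_top: Iterable[str],
    free: Iterable[str],
) -> Set[str]:
    """Collect the winners over every total order of the freely ranked
    constraints, which are placed below the fixed constraints."""
    winners = set()
    for order in permutations(list(free)):
        winners.add(evaluate(candidates, list(fixed_top) + list(order)))
    return winners


FIXED = ["DEVOICE", "IDENT(voice)"]
OPAQUE_RANKING = FIXED + ["MAX-O(accent)", "*AccentOnDevoiced"]
TRANSPARENT_RANKING = FIXED + ["*AccentOnDevoiced", "MAX-O(accent)"]

if __name__ == "__main__":
    print(evaluate(CANDIDATES, OPAQUE_RANKING))
    print(evaluate(CANDIDATES, TRANSPARENT_RANKING))
    print(sorted(free_ranking_winners(CANDIDATES, FIXED, ["MAX-O(accent)", "*AccentOnDevoiced"])))
```

With MAX-O (accent) on top, the non-shifted form wins; with the reverse order, the accent-shifted form wins; the accent-deleted and fully faithful candidates lose under both orders. Free ranking thus predicts exactly the two attested variants in this reduced setting.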

4. Conclusion

In this article, we have developed a theory of prominence by showing evidence for the Harmonic Scale of Prominence from phonetics, phonology, typology, and prosody. Specifically, we have argued for an implicational relation among voicing, sonority, tone, and accent by considering their various phonological interactions for enhancing syllable prominence or culminativity. We have then seen that the well-known phenomenon of devoiced accent poses such problems as i) incompleteness for prominence theory, ii) optionality and directionality of accent shift for derivational theory, and iii) non-surface-true opacity for general phonological theory. The apparent harmonic incompleteness in prominence has much to do with the opacity of accent and devoicing. Thus, we have proposed a solution in OT and have demonstrated that the notions of sympathy and reranking serve to solve all of these problems.

Acknowledgements

This paper is part of my talk delivered at the Phonology Forum 2001 of the Phonological Society of Japan, which was held on August 28 at Chiba University. I would like to thank the audience for their helpful comments. I am also very grateful to Stuart Davis, Jeroen van de Weijer, and an anonymous reviewer for helping me improve the content and style of this paper. Special thanks go to Jeroen van de Weijer, Tetsuo Nishihara, and Kensuke Nanjo, who gave me the opportunity to elaborate my idea in this project. Any remaining inadequacies or misconceptions are, of course, my responsibility alone. This study is supported by the Grant-in-Aid for Scientific Research (Basic Sciences (C)(2), grant number 15520306) of the Japan Society for the Promotion of Science.


Notes

1. We assume here with Hayes (1995) that syllable weight consists of syllable quantity and syllable prominence. But his theory differs from ours in that length can be reflected in both moraic structure (i.e., syllable quantity) and prominence structure. Instead, we assume that length is only a matter of syllable quantity. Incidentally, another interesting approach to prominence is seen in Anttila (1997), who proposes a constraint-hierarchy system that incorporates accent, length, and sonority by using Prince & Smolensky’s (1993) Harmonic Alignment, although it does not incorporate voicing or tone and does not capture the implicational relations among prominence elements. If we incorporate voicing and tone as well in this approach and assume such binary prominence scales as V́ > V (accent), V_H > V_L (tone, where V_H and V_L are vowels linked to a high and a low tone, respectively), VV > V (length), V > C (sonority), and V > V̥ (voicing), Harmonic Alignment can derive constraint hierarchies like *{V_H, V́_L} >> *{V́_H, V_L} (accent & tone), *{VV, V́} >> *{V́V, V} (accent & length; WEIGHT-TO-STRESS or, more strictly, PEAK-PROMINENCE), *{V, Ć} >> *{V́, C} (accent & sonority), *{V, V̥́} >> *{V́, V̥} (accent & voicing), *{C, V̥} >> *{C̥, V} (sonority & voicing), etc. See Tanaka (2003b) for the details of such a theory, where the overall scale in (2) and these specific binary scales are developed in a uniform fashion.

2. Conversely, length-conditioned accent (i.e., quantity-sensitivity) seems to be more dominant cross-linguistically than accent-conditioned length (e.g. vowel lengthening on an accented vowel, vowel shortening with accent loss, etc.). This is another reason we argue for the division of labor between syllable quantity and syllable prominence and exclude length from our discussion of syllable prominence. For the interaction of accent and syllable length, see McGarrity (2003).

3. We assume here that schwa is less sonorous than non-high vowels.

4. In what follows, morpheme boundaries are represented with –.

5. Of course, it is not true that the consonants that precede accent should always be voiced in the way shown in (5a, b). As is well known, stops in English may be voiceless and aspirated when they precede accent. Generally, less sonorous consonants are preferred to their voiced counterparts in onset position, because of the Dispersion Principle (Clements 1990, 1992). In Pirahã, CV and CVV are more prominent and attract accent more strongly than GV and GVV, respectively, which is a very rare case (C is a voiceless consonant and G is a voiced consonant). This is also related to the principle: a steep rise from onset to nucleus may sound more prominent than a gentle rise.

6. hísu̥ in (5d) and si̥kí in (6) are fairly contrastive in that, although they are originally accented on the first syllable, only the latter undergoes devoicing and accent movement to the right. The former example may pose a problem for the analysis that we will present in section 3.2, but it is true that hi̥sú may optionally be acceptable as well. We will leave this question for further study, since such examples as in (5d) are limited in occurrence and the patterns in (6) are productive. The difference between them is whether or not they have a word boundary in their domain.

7. Examples we are concerned with here are ones in which the modifier of sima is more than two moras. Exceptions such as nakano-sima and nakadoori-sima without any accent and Rendaku and hatizyóu-zima and kakará-zima with both are quite rare. As for (7b), syoudó-sima, awazí-sima, and tanegá-sima may lead us to believe that the voiced obstruents immediately before the boundary cause the blocking of Rendaku (due to Lyman’s Law in a wider domain); however, there are island names like isigaki-zima, ogi-zima, and megi-zima, which contradict such a hypothesis.

8. The distinction in (8) can be phonologized in such a way that the specifier in (8a) is CVC or CVV while the one in (8b) is CV or CVCV. It is surprising that the violation of Lyman’s Law is acceptable in the head of the names -zabu in (8a).

9. We assume here, following Poser (1990) and Tanaka (1992), that compound accent is derived by final-foot extrametricality.

10. (12a) may be a locally self-conjoined constraint banning a devoiced vowel, but we adopt an OCP-based version for expository purposes.

11. This constraint is virtually equivalent to Prince & Smolensky’s EDGEMOST (pk; R; Word), which states that a peak of prominence (i.e. accent) lies at the right edge of PrWd. In addition to (12e, f), there are other constraints for compound accentuation; see Tanaka (2001, 2002b) for the exact hierarchy and the ranking relation between (12e) and (12f).

12. McCarthy (2003b) compares various approaches to opacity using comparative markedness, local conjunction, stratal OT, sympathy, and targeted constraints, only to conclude that each of them has its own advantages and disadvantages. Empirically, sympathy seems to be the best candidate to account for the data concerned, so we adopt it here.

Bibliography

Abramson, Arthur S. and Leigh Lisker 1970 Discriminability along the voicing continuum: Cross-language tests. Proceedings of the 6th International Congress of Phonetic Sciences, 569–573. Prague: Academia, Czechoslovak Academy of Sciences. Akinaga, Kazue 1998 The accent of Standard Japanese. In A Dictionary of Pronunciation and Accent in Japanese, NHK Institute of Broadcasting Culture (ed.), 174–221. Tokyo: NHK Publication. Akinaga, Kazue (ed.) 2001 A New Concise Dictionary of Japanese Accent. Tokyo: Sanseido. Anderson, John M. and Charles Jones 1974 Three theses concerning phonological representations. Journal of Linguistics 10, 1–26. Anderson, John M. and Colin J. Ewen 1987 Principles of Dependency Phonology. Cambridge: Cambridge University Press. Anonymous 1963 Nihongo no Rekishi 2: Moji to no Meguriai [History of Japanese 2: Contact with Characters] Tokyo: Heibon-sha. Archangeli, Diana B. and Douglas G. Pulleyblank 1994 Grounded Phonology. Cambridge and London: MIT Press. Anttila, Arto T. 1997 Deriving variation from grammar. In Variation, Change, and Phonological Theory, Frans Hinskens, Roeland van Hout and W. Leo Wetzels (eds.), 35–68. Amsterdam: John Benjamins. 2002 Morphologically conditioned phonological alternations. Natural Language and Linguistic Theory 20: 1–42. Anttila, Arto T. and Young-mee Yu Cho 1998 Variation and change in Optimality Theory. Lingua 104: 31–56. Avery, J. Peter 1996 The Representation of Voicing Contrasts. Doctoral dissertation, University of Toronto. Avery, J. Peter and William J. Idsardi 2002 Laryngeal dimensions in Japanese phonology. Talk presented at the Montreal-Ottawa-Toronto Phonology Workshop. Montreal, February 2002.

280 Bibliography Backley, Phillip 1998 Tier geometry: An explanatory model of vowel structure. Doctoral dissertation, University College London. Backley, Phillip and Toyomi Takahashi 1998 Element activation. In Structure and Interpretation: Studies in Phonology (PASE Studies & Monographs 4), Eugeniusz Cyran (ed.), 13–40. Lublin: Wydawnictwo Folium. Bao, Zhiming 1990 On the nature of tone. Doctoral dissertation, MIT. Beckman, Jill N. 1997 Positional faithfulness, positional neutralization, and Shona vowel harmony. Phonology 14 (1): 1–46. 1998 Positional faithfulness. Doctoral Dissertation, University of Massachusetts, Amherst. Beckman, Mary E. 1982 Segmental duration and the ‘Mora’ in Japanese. Phonetica 39: 113– 135. Bell, Alan E. 1978 Syllabic consonants. In Universals of Human Language, Volume 2: Phonology, Joseph H. Greenberg (ed.), 153–201. Stanford, California: Stanford University Press. Benua, Laura H. 1998 Transderivational identity: Phonological relations between words. Doctoral dissertation, University of Massachusetts, Amherst. Bird, Steven G. 1995 Computational phonology: A constraint-based approach. Cambridge: Cambridge University Press. Bloch, Bernard 1946 Studies in colloquial Japanese I: Inflection. Journal of the American Oriental Society 66: 97–109 (References are to the version in Miller 1970: 1–24). Blumstein, Sheila E., William E. Cooper, Harold Goodglass, Sheila Statlender and Jonathan Gottlieb 1980 Production deficit in aphasia: a voice onset time analysis. Brain and Language 9: 153–170. Boersma, Paul P. G. 1997 How we learn variation, optionality, and probability. Proceedings of the Institute of Phonetic Sciences, Amsterdam 21: 43–58. (Available on the Rutgers Optimality Archive, ROA-221.) Cabrera-Abreu, Mercedes 2000 A phonological model for intonation without low tone. Bloomington, Indiana: Indiana University Linguistics Club Publication.


Calabrese, Andrea 1995 A constraint-based theory of phonological markedness and simplification procedures. Linguistic Inquiry 26: 373–463. Campbell, Nick and Yoshinori Sagisaka 1991 Moraic and syllable-level effects on speech timing. Journal of Electronic Information Communication Engineering SP 90–107: 35–40. Charette, Monik and Asli Göksel 1998 Licensing constraints and vowel harmony in Turkic languages. In Structure and Interpretation: Studies in Phonology (PASE Studies & Monographs 4), Eugeniusz Cyran (ed.), 65–88. Lublin: Wydawnictwo Folium. Cho, Young-mee Yu 1998 Language change as constraint reranking. Historical Linguistics 1995. Amsterdam: John Benjamins. Chomsky, A. Noam and Morris Halle 1968 The Sound Pattern of English. New York: Harper and Row Clements, George N. 1978 Tone and syntax in Ewe. In Elements of Tone, Stress and Intonation Donna J. Napoli (ed.), 21–99. Washington, D.C.: Georgetown University Press. 1990 The role of the sonority cycle in core syllabification. In Papers in Laboratory Phonology I: Between the Grammar and Physics of Speech, John C. Kingston and Mary E. Beckman (eds.), 283–333. Cambridge: Cambridge University Press. 1992 The sonority cycle and syllable organization. In Phonologica 1988, Wolfgang U. Dressler, Hans C. Luschützky, Oscar E. Pfeiffer and John R. Rennison (eds.), 63–76. Cambridge: Cambridge University Press. 2001 Representational economy in constraint-based phonology. In Distinctive Feature Theory, T. Alan Hall (ed.), 71–146. Berlin /New York: Mouton de Gruyter. Clements, George N. and Susan R. Hertz 1991 Nonlinear phonology and acoustic interpretation. In Actes du XIIème Congrès International des Sciences Phonétiques, Aix-en-Provence, 19–24 août 1991 [Proceedings of the XIIth International Congress of Phonetic Sciences, Aix-en-Provence, August 19–24, 1991], Vol. 1, 364–373. Aix-en-Provence: Université de Provence, Service des Publications. Cohn, Abigail C. 1993 Nasalisation in English: Phonology or phonetics. 
Phonology 10: 43–81.

282 Bibliography Coleman, John S. and Janet B. Pierrehumbert 1997 Stochastic phonological grammars and acceptability. In Computational Phonology: Third Meeting of the ACL Special Interest Group in Computational Phonology, 49–56. Association for Computational Linguistics, Somerset. Cremelie, Nick and Jean-Pierre Martens 1995 On the use of pronunciation rules for improved word recognition. In Proceedings Eurospeech ‘95: 1747–1750. 1997 Automatic rule-based generation of word pronunciation networks. In Proceedings Eurospeech ‘97: 2459–2462. 1999 In search of better pronunciation models for speech recognition. Speech Communication 29: 225–246. Dauer, Rebecca M. 1980 The reduction of unstressed high vowels in modern Greek. Journal of the International Phonetic Association 10: 17–27. de Lacy, Paul V. 1999 Tone and prominence. Ms., University of Massachusetts, Amherst (Available on the Rutgers Optimality Archive, ROA-333). 2001 Prosodic markedness in prominent positions. Ms., University of Massachusetts, Amherst (Available on the Rutgers Optimality Archive, ROA-432). 2002 The Formal Expression of Markedness. Ph.D. dissertation, University of Massachusetts, Amherst. 2004 Markedness conflation in Optimality Theory. Phonology 21: 154– 200. End, Kunimoto 1973 Kaiki to ruisui: Ma-gyou no dakuon-gana to sono haikei [Back formation and analogy: The kana of the [b]-column used for the [m]column, and its background]. Gifudaigaku Kyiku Gakubu Kynky Hkoku: Jinbun 21: 103–112. 1989 Kokugo Hyoogen to On’in Genshoo [Expression in Japanese and Phonological Phenomena]. Tokyo: Shinten-sha. Flemming, Edward S. 1995 Auditory representations in phonology. Doctoral dissertation, University of California, Los Angeles. Frisch, Stefan A. 1996 Similarity and Frequency in Phonology. Doctoral dissertation, Northwestern University, Evanston, Illinois. Fujimoto, Masako and Shigeru Kiritani 2003 Comparison of vowel devoicing for speakers of Tokyo and Kinki dialects. 
Journal of the Phonetic Society of Japan 7: 58–69.


Fukada, Toshiaki, Takayoshi Yoshimura and Yoshinori Sagisaka 1998 Automatic generation of multiple pronunciations based on nural networks and language statistics. In Proceedings of the ESCA Workshop on Modeling Pronunciation Variation for Automatic Speech Recognition, Helmer Strik, Judith M. Kessens and Mirjam Wester (eds.), 41–46. Rolduc, Kerkrade. 1999 Automatic generation of multiple pronunciations based on nural networks. Speech Communication 27: 63–73. Fukai, Ichiro (ed.) 1973 Zouhyou Monogatari Kenkyuu to Sou Sakuin [A Study of Zhy Monogatari: with facsimile and word index]. Tokyo: Musashino Shoin. Fukazawa, Haruka 1999 Theoretical implications of OCP effects on features in Optimality Theory. Doctoral dissertation, University of Maryland, College Park. Fukazawa, Haruka and Mafuyu Kitahara 2001 Domain-relative faithfulness and the OCP: Rendaku revisited. In Issues in Japanese Phonology and Morphology, Jeroen M. van de Weijer and Tetsuo Nishihara (eds.), 85–109. Berlin /New York: Mouton de Gruyter. Fukazawa, Haruka, Mafuyu Kitahara and Mitsuhiko Ota 1998 Lexical stratification and ranking invariance in constraint-based grammars. In Papers from the 34th Regional Meeting of the Chicago Linguistic Society, Part Two: The Panels, M. Catherine Gruber, Derrick Higgins, Kenneth S. Olson and Tamra Wysocki (ed.), 47–62. Chicago: Chicago Linguistic Society. 2002 Constraint-based modelling of split phonological systems. In On’in Kenkyuu [Phonological Studies] 5, the Phonological Society of Japan (ed.), 115–120. Tokyo: Kaitakusha. Fukuda, Suzy E. and Shinji Fukuda 1999 The operation of rendaku in the Japanese specifically languageimpaired: A preliminary investigation. Folia Phoniatrica et Logopaedica 51: 36–54. Fukui, Seiji 1990 Tonal features of numeral sequences in Kinki Dialect [in Japanese]. Studies in Phonetics and Speech Communication IV: 41–67. Kinki Society of Phonetics. Fukushima, Kunimichi 1959 Nihonkigo gokai [Commentary on the words in Nihonkigo [Ribenjiyu]]. 
Kokugogaku 36: 44–53. Gandour, Jackson T. and Rochana Dardarananda 1982 Voice onset time in aphasia: Thai I. perception. Brain and Language 17: 24–33.

284 Bibliography Gandour, Jackson T. and Rochana Dardarananda 1984 Voice onset time in aphasia: Thai II. production. Brain and Language 23: 177–205. Grootaers,Willem A. 1976 Nihon no Gengo Chiri Gaku no Tameni [For the Sake of Dialect Geography in Japan]. 48–77. Tokyo: Heibonsha. Halle, Morris 2003 Verner’s Law. In A New Century of Phonology and Phonological Theory: A Festschrift for Professor Shosuke Haraguchi on the Occasion of His Sixtieth Birthday, Takeru Honma, Masao Okazaki, Toshiyuki Tabata and Shin-ichi Tanaka (eds.). Tokyo: Kaitakusha. Halle, Morris and Kenneth N. Stevens 1971 A note on laryngeal features. MIT Quarterly Progress Report of the Research Laboratory of Electronics 101: 198–213. 1991 Knowledge of language and the sounds of speech. In Music, Language, Speech and Brain: Proceedings of an International Symposium at the Wenner-Gren Center, Stockholm, 5–8 September 1990, Johan Sundberg, Lennart Nord and Rolf Carlson (eds.), 1–19. Houndmills: MacMillan Press. Halle, Morris and Jean-Roger Vergnaud 1987 An Essay on Stress. Cambridge, Massachusetts: MIT Press. Hamada, Atsushi 1952a Hatsuon to dakuon to no sookansei no mondai [Issues in relativity between moraic nasals and voiced obstruents]. Kokugo-Kokubun 21(3), 18–32. 1952b Kouji gonen Chousen-ban Iroha Onmon taion kou [Thoughts on correspondence of Hangul [to Japanese kana] seen in Iroha (Iropa) of the fifth year of Kouji [1492] printed in Korea]. Kokugo-Kokubun 21 (10): 22–32. 1955 Haneru-on [Moraic nasals]. In Kokugogaku Jiten, Kokugo Gakkai (ed.), 750–751. Tokyo: Tokyo-do. 1960 Rendaku to renjou [Rendaku and sandhi]. Kokugo-Kokubun 29 (10): 1–16. 1971 Sei daku [Sei-daku: Clear-muddy]. Kokugo-Kokubun 40 (11): 40–51. Hamano, Shoko 2000 Voicing of obstruents in Old Japanese: Evidence from the soundsymbolic stratum. Journal of East Asian Linguistics 9: 207–25. Han, Mieko, S. 1962a Japanese phonology. Tokyo: Kenkyusha. 1962b Unvoicing of vowels in Japanese. Study of Sounds 10: 81–100. 
1994 Acoustic manifestations of mora timing in Japanese. Journal of the Acoustical Society of America 96: 73–82.


Haraguchi, Shosuke 1977 The Tone Pattern of Japanese: An Autosegmental Theory of Tonology. Tokyo: Kaitakusha. 1991 A Theory of Stress and Accent. Dordrecht: Foris. 2002 A theory of voicing. In A Comprehensive Study on the Phonological Structure of Languages and Phonological Theory, Shosuke Haraguchi (ed.), 1–22. Technical Report of Basic Sciences (A)(1), Grant-inAid for Scientific Research by the Japan Society for the Promotion of Science. Harris, John K. M. 1990 Segmental complexity and phonological government. Phonology 7: 255–301. 1994 English Sound Structure. Oxford: Blackwell. 1997 Licensing Inheritance: An integrated theory of neutralisation. Phonology 14: 315–370. 1998 Phonological universals and phonological disorder. In Linguistic Levels in Aphasia: Proceedings of the RuG-SAN-VKL Conference on Aphasiology, Evy Visch-Brink and Roelien Bastiaanse (eds.), 91–117. San Diego, CA: Singular Publishing Group. Harris, John K. M. and Geoffrey A. Lindsey 1995 The elements of phonological representation. In Frontiers of Phonology: Atoms, Structures, Derivations, Jacques Durand and Francis X. Katamba (eds.), 34–79. Harlow, Essex: Longman. 2000 Vowel patterns in mind and sound. In Phonological knowledge: Conceptual and empirical issues, Noel Burton-Roberts, Philip Carr and Gerry J. Docherty (eds.), 185–205. Oxford: Oxford University Press. Hasegawa, Kiyoshi, Katsuaki Horiuchi, Tsutomu Momozawa and Saburo Yamamura (eds.) 1986 Obunsha’s Comprehensive Japanese-English Dictionary. Tokyo: Obunsha. Hashimoto, Shinkichi 1917 Kokugo kanazukai kenkyshij no ichihakken. Teikoku Bungaku 23 [References are to the version in Hashimoto (1949: 123–163).]. 1932 Kokugo ni okeru biboin [Nasalized vowels in Japanese]. Kokugo On’in no Kenkyuu. Tokyo: Iwanami. 1949 Moji Oyobi Kanadzukai no Kenkyuu. Tokyo: Iwanami. Hattori, Noriko 1989 Mechanisms of word accent change: Innovations in Standard Japanese. Doctoral dissertation, University College, London. 
Hattori, Shiro 1928 On two-syllable words uttered in Kameyama-cho area in Mie Prefecture. Bulletin of The Phonetic Society of Japan 11.

286 Bibliography Hattori, Shiro 1950 Phoneme, phone, and compound phone. Gengo Kenkyu 16: 92–109 (Revised version appeared in Gengogaku no Hoohoo [Methods in Linguistics]. Tokyo; Iwanami, 1960). 1960 Gengogaku no Hoohoo [Methods in Linguistics]. Tokyo: Iwanami. Hayata, Teruhiro 1977a Nihongo no on’in to rizumu [Sounds and rhythm of Japanese]. Dentou to Gendai 45: 41–49. 1977b Seisei akusento ron [Generative accentuation] In no, Susumu and Takeshi Shibata (eds.), Nihongo 5: On’in [Japanese 5: Sounds], 323– 360. Hayes, Bruce P. 1989 Compensatory lengthening in moraic phonology. Linguistic Inquiry 20: 253–306. 1995 Metrical Stress Theory: Principles and Case Studies. Chicago: The University of Chicago Press. Hayes, Bruce P. and Donca Steriade 2004 Introduction: the phonetic basis of phonological markedness. In Phonetically-Based Phonology, Bruce Hayes, Robert Kirchner and Donca Steriade (eds.), 1–33. Cambridge: Cambridge University Press. Hepburn, James Curtis 1867 A Japanese and English Dictionary with an English and Japanese Index. Shanghai: American Presbyterian Mission Press [Reprinted in 1983. Tokyo: Charles E. Tuttle]. Hibiya, Junko 1999 Variationist Sociolinguistics. The Handbook of Japanese Linguistics, Natsuko Tsujimura (ed.), 101–120. Cambridge, Mass.: Blackwell. Hinskens, Frans and Jeroen M. van de Weijer 2003 Patterns of segmental modification in consonant inventories: A crosslinguistic study. Linguistics 41 (6): 1041–1084. Hirayama, Teruo, Ichiro Oshima, Makio Ono, Makoto Kuno, Mariko Kuno and Takao Sugimura 1992 Gendai Nihongo Hgen Daijiten [Dictionary of Japanese Dialects]. Tokyo: Meiji Shoin. Hirose, H. et al. 1994 Analysis and formulation of the prosodic features of Standard Mandarin Chinese. The Journal of the Acoustical Society of Japan 50 (3): 177–187. Honda, Kiyoshi, Hiroyuki Hirai, Shinobu Masaki and Yasuhiro Shimada 1999 Role of Vertical Larynx Movement and Cervical Lordosis in F0 Control. Language and Speech 42: 401–411.


Huang, Xuedong, Alex Acero and Hsian-Wuen Hon 2001 Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. Upper Saddle River, NJ: Prentice Hall PTR. Hulst, Harry G. van der 1989 Atoms of segmental structure: Components, gestures and dependency. Phonology 6: 253–284. 1995 Radical CV Phonology: The categorial gesture. In Frontiers of phonology: Atoms, structures, derivations, Jacques Durand and Francis X. Katamba (eds.), 80 –116. Harlow, Essex: Longman. Hume, Elizabeth V. and Georgios Tserdanelis 2002 Labial unmarkedness in Sri Lankan Portuguese creole. Phonology 9: 441–458. Hyman, Larry M. 1975 Phonology: Theory and Analysis. New York: Holt, Rinehart and Winston. Hwang, Mei-Yuh and Xuedong Huang 1993 Shared-Distribution Hidden Markov Models for Speech Recognition. IEEE Trans. on Speech and Audio Processing 1 (4): 414–420. Ide, Itaru 1989 Manyougana [Man’y-gana]. In Kanji-kza 4: Kanji to kana [Kanji and kana], Kiyoji Sato (ed.), 225–255. Tokyo: Meiji Shoin. Imaizumi, Satoshi, Akiko Hayashi and Toshisada Deguchi 1995 Listener adaptive characteristics of vowel devoicing in Japanese Dialogue. Journal of the Acoustical Society of America 98(2): 768–778. Inkelas, Sharon 1998 The theoretical status of morphologically conditioned phonology: a case study of dominance effects. In Yearbook of Morphology 1997, Geert E. Booij and Jaap van Marle (eds.), 121–155. Dordrecht: Kluwer. Inoue, Fumio 2000 Tohoku hogen no hensen: Shonai hogen rekishi gengogaku teki koken [History of Tohoku dialects: Contribution of historical linguistics of Shnai dialect]. Tokyo: Akiyama. Inoue, Michiyasu 1932 Manyoushuu zakkou [Thoughts on Man’ysh]. Tokyo: Meiji-shoin. Itô, Junko 1990 Prosodic minimality in Japanese. In CLS 26-II: Papers from the Parasession on the Syllable in Phonetics and Phonology, K. Deaton, Manuela Noske and M. Ziolkowski (eds.), 213–239. 
Itô, Junko, Yoshihisa Kitagawa and Ralf-Armin Mester 1996 Prosodic faithfulness and correspondence: Evidence from a Japanese argot. Journal of East Asian Linguistics 5: 217–94.

288 Bibliography Itô, Junko and Ralf-Armin Mester 1986 The phonology of voicing in Japanese. Linguistic Inquiry 17: 49–73. 1993 Licensed segments and safe paths. Canadian Journal of Linguistics 38: 197–213 1995a Japanese phonology. In Handbook of Phonological Theory, John A. Goldsmith (ed.), 817–838. Cambridge: Blackwell. 1995b The core-periphery structure of the lexicon and constraints on reranking. In University of Massachusetts Occasional Papers in Linguistics 18: Papers in Optimality Theory, Jill N. Beckman, Suzanne C. Urbanczyk and Laura Walsh Dickey (eds.), 181–210. Amherst: GLSA. 1996 Stem and word in Sino-Japanese. In Phonological Structure and Language Processing: Cross-linguistic Studies, Takeshi Otake and Anne Cutler (eds.), 13–44. Berlin /New York: Mouton de Gruyter. 1997 Correspondence and compositionality: The ga-gy variation in Japanese phonology. In Derivations and Constraints in Phonology, I. M. Roca (ed.), 419–462. New York: Oxford University Press. 1998 Markedness and word structure: OCP effects in Japanese. Ms. University of California, Santa Cruz (Available on the Rutgers Optimality Archive, ROA-255). 1999a The phonological lexicon. In The Handbook of Japanese Linguistics, N. Tsujimura (ed.), 62–100. Malden, Mass. and Oxford, U.K: Blackwell Publishers. 1999b The lexicon in Optimality Theory. Handout presented at University of Tsukuba, Special Research Project for the Typological Investigation of Languages and Cultures of the East and West. 2000 Weak parallelism and modularity: Evidence from Japanese. In Report of the Special Research Project for the Typological Investigation of Languages and Cultures of the East and West III, Part I, Shosuke Haraguchi (ed.), 89 –105. Ibaraki: University of Tsukuba. 2001 Covert generalizations in Optimality Theory: the role of stratal faithfulness constraints. In Proceedings of 2001 International Conference on Phonology and Morphology, 3–33, Yongin, Korea. 
2003 Japanese Morphophonemics: Markedness and Word Structure. Cambridge, Mass.: MIT Press. Itô, Junko, Ralf-Armin Mester and Jaye E. Padgett 1995 Licensing and underspecification in Optimality Theory. Linguistic Inquiry 26: 571–614. 1999 Lexical classes in Japanese: A reply to Rice. Phonology at Santa Cruz 6: 39–46. Itoh, Motonobu, Itaru F. Tatsumi and Sumiko Sasanuma 1986 Voice onset time perception in Japanese aphasic patients. Brain and Language 28: 71–85.


Iwabuchi, Etsutar 1934 Youkyoku no utai-kata ni okeru nisshou “tsu” ni tusite [On the entering tone [=coda] “-t” in the singing of ykyoku]. Kokugo to Kokubungaku 11: 5, 7 and 9. (98–117, 91–101 and 85–95, respectively) Iwanami Shoten Henshbu (ed.) 1992 Gyakubiki Kjien. Tokyo: Iwanami. Jakobson, Roman 1968 Child language, aphasia and phonological universals. The Hague: Mouton. Jakobson, Roman, C. Gunnar M. Fant and Morris Halle 1952 Preliminaries to speech analysis. Cambridge, Mass.: MIT Press. Jessen, Michael and Catherine O. Ringen 2001 On the status of [voice] in German. In WCCFL 20: 304–317. Jdaigo Jiten Hensh Iinkai (ed.) 1967 Jidaibetsu kokugo daijiten: jdaihen. Tokyo: Sanseid. Jun, Sun-Ah 1993 The phonetics and phonology of Korean prosody. Unpublished Ph.D. dissertation. The Ohio State University, Columbus, Ohio. Jun, Sun-Ah and Mary E. Beckman 1993 A gestural-overlap analysis of vowel devoicing in Japanese and Korean. Paper presented at the 1993 Annual Meeting of the LSA, Los Angeles, 7–10 January, 1993. Jurafsky, Daniel and James H. Martin 2000 Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Upper Saddle River, NJ: Prentice Hall. Kager, René W. J. 1999 Optimality Theory. Cambridge: Cambridge University Press. Kamei, Takashi 1970 Kana wa naze dakuon sen’you no jitai o motanakatta ka-o megutte kataru [Discussing why kana did not have letters only for daku-on]. Hitotsubashi Daigaku Kenkyuu Nenpou: Jinbun Kagaku Kenkyuu 12: 1–92. 1985 Dakuon [Voiced sounds]. Heibonsha Hyakka Jiten 9: 227–228. Tokyo: Heibonsha. Kamei, Takashi, Rokuro Kono and Eiichi Chino 1997 Gengogaku daijiten selection: Nihonrett no gengo [Linguistics encyclopedia selection: Language in the Japanese archipelago]. Tokyo: Sanseido. Kasuga, Kazuo 1941 Kojiki ni okeru seidaku kakiwake ni tsuite [On the sei-daku distinctions in Kojiki]. Kokugo-Kokubun 11(4): 39–78.

Kawakami, Shin 1969 Musee haku no tsuyosa to akusento kaku [The intensity of devoiced moras and the accent nucleus]. Kokugakuindai Kokugo Kenkyu 27. Kawasaki, Takako 1996 Sonority and voicing: a structural analysis. Ms., McGill University, Montreal, Quebec. Kaye, Jonathan D., Jean Lowenstamm and Jean-Roger Vergnaud 1985 The internal structure of phonological representations: A theory of charm and government. Phonology Yearbook 2: 305–328. 1990 Constituent structure and government in phonology. Phonology 7: 193–231. Kazama, Rikizō (ed.) 1979 Tsuzuriji gyakujun hairetsu gokōsei ni yoru Daigenkai bunrui goi. Tokyo: Fuzanbō. Kenstowicz, Michael J. 1994a Phonology in Generative Grammar. Oxford: Blackwell. 1994b Sonority-driven stress. Ms., Massachusetts Institute of Technology, Cambridge (Available on the Rutgers Optimality Archive, ROA-33). Kess, Joseph E. and Tadao Miyamoto 1999 The Japanese Mental Lexicon. Psycholinguistic Studies of Kana and Kanji Processing. Amsterdam and Philadelphia: John Benjamins. Kikuchi, Hideaki and Kikuo Maekawa 2002 Accuracy of automatic phoneme labeling on spontaneous speech. Proceedings of the 2002 Spring meeting of the Acoustical Society of Japan: 97–98. Kikuda, Norio 1971 Yôgen no rendaku no ichiyôin [xxx]. Kaishaku 17 (5): 24–29. Kindaichi, Haruhiko and Kazue Akinaga 1997 Shinmeikai Accent Dictionary of the Japanese Language. Tokyo: Sanseidoo. Kindaichi, Haruhiko, Ooki Hayashi and Takesi Sibata (eds.) 1988 An Encyclopaedia of the Japanese Language. Tokyo: Taishukan. Kindaichi, Kyosuke 1941 Kokugo no hensen [Transition of Japanese Language]. Tokyo: NHK. Kiparsky, R. Paul V. 1982 Lexical Phonology and Morphology. In Linguistics in the Morning Calm, the Linguistic Society of Korea (ed.), 3–91. Seoul: Hanshin. 1985 Some consequences of Lexical Phonology. Phonology Yearbook 2: 85–138. Kitahara, Mafuyu 1998 The interaction of pitch accent and vowel devoicing in Tokyo Japanese. In Japanese-Korean Linguistics 8, D. Silvia (ed.), 303–15. Stanford, CA: CSLI.


Kitahara, Yasuo (ed.) 1990 Nihongo gyakubiki jiten [Japanese Reverse Dictionary]. Tokyo: Taishkan. Kiyose, Gisabur N. 1985 Heianch hagy-shiin p-onron [The study of ha-gyo consonant psound in the Heian period]. Onsei no Kenky 21: 73–87. Kohler, Klaus J. 1984 Phonetic explanation in phonology: the feature fortis/lenis. Phonetica 41: 150–174. 1990 Segmental reduction in connected speech in German: phonological facts and phonetic explanations. In Speech Production and Speech Modelling, William J. Hardcastle and Alain Marchal (eds.), 69–92. Dordrecht: Kluwer. Komatsu, Hideo 1971 Nihon seishoushi ronkou [Study of the history of Japanese accent]. Tokyo: Kazama shob. Kondo, Mariko 1995 Temporal adjustment of devoiced morae in Japanese. Proceedings of the 13th International Congress of Phonetic Sciences 3: 238–241. 1997 Mechanisms of vowel devoicing in Japanese. Doctoral dissertation, University of Edinburgh. 2001 Vowel devoicing and syllable structure in Japanese. Japanese/Korean Linguistics 9. 2003 Speech rhythm and consonant sequence production in Japanese. The Proceedings of the 6th International Seminar on Speech Production, 138–143. Kubozono, Haruo 1988 The Organization of Japanese Prosody. Doctoral dissertation, Edinburgh University (Published by Kurosio Publishers, 1993). 1993 The Organization of Japanese Prosody. Tokyo: Kurosio Publishers. 1995 Gokeisei to On’in Koozoo [Word Formation and Phonological Structure]. Tokyo: Kurosio Publishers. 1996 Syllable and accent in Japanese. The Bulletin of the Phonetic Society of Japan 211: 71–82. 1998 Kintaroo-to Momotaroo-no akusento-koozoo [The structure of accent in Kintaroo and Momotaroo]. Kobe Gengogaku Ronsoo 1: 35–49. 2002a Prosodic structure of loanwords in Japanese: Syllable structure, accent and morphology. Journal of the Phonetic Society of Japan 6 (1): 79–97. 2002b The syllable as a unit of prosodic organization in Japanese. In The Syllable in Optimality Theory, Caroline Féry and Ruben van de Vijver (eds.), 129–56. 
Cambridge: Cambridge University Press.

Kula, Nancy Chongo and Lutz Marten 1998 Aspects of nasality in Bemba. SOAS Working Papers in Linguistics and Phonetics 8: 191–208. Kuno, Susumu 1973 The Structure of the Japanese Language. Cambridge, Mass.: MIT Press. Kuroda, Shige-Yuki 2002 Contrast in Japanese. A contribution to feature geometry. Paper presented at the Second International Conference on Contrast in Phonology. University of Toronto, Toronto, Ontario, Canada. May 3, 2002. Kuwabara, Hisao and Kazuya Takeda 1988 Analysis and prediction of vowel-devocalization in isolated Japanese words. ATR Technical Report TR-I-0033. Kyoto: ATR Interpreting Telephony Research Laboratories. Labrune, Laurence 1999 Variation intra et inter-langue: Morpho-phonologie du rendaku en japonais et du sai-sios en coréen. In Phonologie: théorie et variation, Cahiers de grammaire 24: 117–152. Lange, Roland A. 1973 The Phonology of Eighth-Century Japanese. Tokyo: Sophia University. Liberman, Mark Y. and Alan S. Prince 1977 On stress and linguistic rhythm. Linguistic Inquiry 8: 249–336. Lisker, Leigh and Arthur S. Abramson 1964 A cross-language study of voicing in initial stops: acoustical measurements. Word 20: 384–422. 1970 The voicing dimension: some experiments in comparative phonetics. Proceedings of the Sixth International Congress of Phonetic Sciences, Prague 1967, 563–567. Prague: Academia, Czechoslovak Academy of Sciences. Lombardi, Linda 1995 Laryngeal features and privativity. The Linguistic Review 12: 35–59. 2002 Why place and voice are different: Constraint-specific alternations in Optimality Theory. In Segmental Phonology in Optimality Theory, L. Lombardi (ed.), 13–45. Cambridge: Cambridge University Press. Lyman, Benjamin S. 1894 Change from surd to sonant in Japanese compounds. Oriental Club of Philadelphia. Mabuchi, Kazuo 1971 Kokugo on-in ron [Japanese Phonology]. Tokyo: Kasama Shoin. Maddieson, Ian 1984 Patterns of Sounds. Cambridge: Cambridge University Press.


Maddieson, Ian and Peter N. Ladefoged 1993 Phonetics of partially nasal consonants. In Phonetics and Phonology Vol. 5, Nasals, Nasalization, and the Velum, M. K. Huffman and R. A. Krakow (eds.), 251–301. San Diego: Academic Press. Maeda, Hiroyuki 2004 Seiyaku joretsu to nyuuryokukei no henka [Constraint ranking and shift of inputs]. In Nihongoshi no rironteki jisshouteki kiban no saikouchiku [Reconstructing the theoretical and empirical bases of the history of Japanese], Grant-in-Aid Scientific Research, Ministry of Education, Culture, Sports, Science and Technology. Maekawa, Kikuo 1989 Boin-no museika [Devoicing of vowels]. In Nihon-go no OnseiOn’in (1), M. Sugito (ed.), 135–153, Meiji Shoin. 1990a Effects of speaking rate on the voicing variation in Japanese. Technical Report of the Institute of Electronics, Information and Communication Engineers (SP89-148): 47–53. 1990b Production and perception of the accent in the consecutively devoiced syllables in Tokyo Japanese. Proceedings of International Conference on Spoken Language Processing (ICSLP) 2: 517–520. Kobe. 2002a Hanashikotobani okeru chouboinno tanko. Kokugogakkai 2002 nendo syunkitaikai youshisyuu: 43–50. 2002b Study of language variation using Corpus of Spontaneous Japanese. Journal of Phonetic Society of Japan, 6 (3): 48–59. Maekawa, Kikuo, Hanae Koiso, Sadaoki Furui and Hitoshi Isahara 2000 Spontaneous speech corpus of Japanese. Proceedings of the Second International Conference of Language Resources and Evaluation (LREC) 2: 947–952. Maekawa, Kikuo, Hideaki Kikuchi, Yosuke Igarashi and Jennifer J. Venditti 2002 X-JToBI: An extended J_ToBI for spontaneous speech. Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP2002) 3: 1545–1548. Denver. Marten, Lutz 1996 Swahili vowel harmony. SOAS Working Papers in Linguistics and Phonetics 6: 61–75. Martin, Samuel E. 1952 Morphophonemics of Standard Colloquial Japanese. Supplement to Language (Language Dissertation No. 47). 
Baltimore: Linguistic Society of America. 1975 A Reference Grammar of Japanese. New Haven: Yale University Press. 1987 The Japanese Language through Time. New Haven: Yale University Press.

Maruyama, Rinpei 1967 Joudaigo Jiten. (Dictionary of Jōdai [710–794] vocabulary). Tokyo: Meiji Shoin. Mathias, Gerald B. 1973 On the modification of certain Proto-Korean-Japanese reconstructions. Papers in Japanese Linguistics 2: 31–47. Matsui, F. Michinao 1993 Museihaku joo no akusento kaku no chikaku ni tsuite [Perceptual study of the accent on devoiced accented mora]. Paper presented at the 28th Kinki Onsei Gengo Kenkyuukai, Osaka, Japan. Matsumoto, Takashi 1965 Ma-gyou on ba-gyou on koutai genshou no keikou [Tendency of alternations between the [b]-column sounds and the [m]-column sounds]. Kokugogaku Kenkyuu 5: 52–65. Matsumura, Akira (ed.) 1988 Daijirin [Daijirin Japanese Dictionary]. Tokyo: Sanseidō. Matthews, Peter H. 1974 Morphology: An Introduction to the Theory of Word Structure. Cambridge: Cambridge University Press. McCarthy, John J. 1999 Sympathy and phonological opacity. Phonology 16: 331–399. 2003a Sympathy, cumulativity, and the Duke-of-York Gambit. In The Syllable in Optimality Theory, Caroline Féry and Ruben van de Vijver (eds.). Cambridge: Cambridge University Press. 2003b Comparative Markedness. Ms., University of Massachusetts, Amherst. [Available on Rutgers Optimality Archive, ROA-489]. McCarthy, John J. and Alan S. Prince 1995 Faithfulness and Reduplicative Identity. University of Massachusetts Occasional Papers in Linguistics 18: Papers in Optimality Theory, Jill N. Beckman, Suzanne C. Urbanczyk and Laura Walsh Dickey (eds.), 249–384. Amherst: GLSA. McCawley, James D. 1968 The Phonological Component of a Grammar of Japanese. The Hague: Mouton. 1977 Accent in Japanese. In Studies in Stress and Accent, Southern California Occasional Papers in Linguistics 4, Larry M. Hyman (ed.), 261–302. Los Angeles: University of Southern California Department of Linguistics. McGarrity, Laura W. 2003 Constraints on patterns of primary and secondary stress. Doctoral Dissertation, Indiana University.
Mielke, Jeffrey 2004 The emergence of distinctive features. Ph.D. dissertation, Ohio State University.


Miller, Roy A. 1967 The Japanese Language. Chicago: University of Chicago Press. 1986 Nihongo: In Defence of Japanese. London: Athlone. Miller, Roy A. (ed.) 1970 Bernard Bloch on Japanese. New Haven: Yale University Press. Miyake, Marc H. 2003 Old Japanese: A Phonetic Reconstruction. London: Routledge Curzon. Mori, Hiromichi 1991 Kodai no on’in to Nihonshoki no seiritsu [Sounds of Old Japanese and completion of Nihonshoki]. Tokyo: Taishukan. Murayama, Tadashige 2001 Nihon-no Myooji Besuto 10,000 [Top 10,000 Surnames in Japanese]. Tokyo: Shin-Jinbutsu-Ooraisha. Nakagawa, Yoshio 1966 Rendaku, Rensei (Kashou) no Keifu [Compounds with Rendaku and Compounds without Rendaku]. Kokugo Kokubun 35 (6): 302–314. Kyoto: Kyoto University. Nakata, Norio (ed.) 1972 Kooza kokugo-shi 2: On’in-shi – Moji-shi [History of sounds and characters]. Tokyo: Taishkan. Nakata, Norio and Hiroshi Tsukishima 1980 Dakuten [Daku-ten]. In Kokugogaku daijiten [Dictionary of National Language Study], Kokugo Gakkai (ed.), 586–587. Tokyo: Tokyodo. Napoli, Donna J. and Marina A. Nespor 1976 The syntax of raddoppiamento sintattico. Unpublished Ms. Nasu, Akio 1999 Onomatope-ni okeru yuuseika-to [p]-no yuuhyoosei [Voicing in onomatopoeia and the markedness of [p]]. Journal of the Phonetic Society of Japan 3: 52–66. 2001 Heiretugo akusento no yure to keisan [Accent of Japanese dvandva and its variations]. Paper presented at the 26th Annual Meeting of Kansai Linguistic Society. Nasukawa, Kuniya 1995 Melodic structure and no constraint-ranking in Japanese verbal inflexion. Paper presented at the Autumn Meeting of the Linguistic Association of Great Britain. University of Essex. 1998 An integrated approach to nasality and voicing. In Structure and Interpretation: Studies in Phonology (PASE Studies & Monographs 4), Eugeniusz Cyran (ed.), 205–225. Lublin: Wydawnictwo Folium. 1999 Prenasalisation and melodic complexity. UCL Working Papers in Linguistics 11: 207–224.

Nasukawa, Kuniya 2005a A Unified Approach to Nasality and Voicing. Berlin/New York: Mouton de Gruyter. 2005b Melodic complexity in infant language development. In Developmental Paths in Phonological Acquisition, Marina Tzakosta, Claartje Levelt and Jeroen van de Weijer (eds.). Leiden Papers in Linguistics 2 (1): 53–70. Nihon Daijiten Kankōkai (ed.) 1972–76 Nihon kokugo daijiten [Grand Japanese Dictionary]. Tokyo: Shōgakukan. Nihon Hoso Kyokai [NHK] 1985 NHK Nihongo Hatsuon Akusento Jiten [NHK Pronunciation and Accent Dictionary of Japanese]. 1st ed., Nihon Hoso Shuppan Kyokai, Tokyo. 1998 NHK Nihongo Hatsuon Akusento Jiten [NHK Pronunciation and Accent Dictionary of Japanese]. 2nd ed., Nihon Hoso Shuppan Kyokai, Tokyo. Nishihara, Tetsuo 2002 Tohoku hōgen ni okeru Shiin no Yūseika [On consonant voicing in the Tohoku dialect]. Miyagi Kyōiku Daigaku Gaikokugo Kenkyū Ronshū 2: 19–24. Miyagi University of Education, Sendai. Nishimiya, Kazutami 1960 Joudai-go no seidaku: shakkun moji o chūshin to shite [Sei-daku in Jōdai Japanese [710–794]: focusing on the characters for kun readings]. Man’yō 36: 1–19. Ogura, Sinpei 1910 Lyman-si no rendaku-ron [Lyman’s theory of sequential voicing]. Kokugakuin zassi [Journal of the National Research Institute] 16 (7): 9–23. Ohala, John J. 1983 The origin of sound patterns in vocal tract constraints. The Production of Speech, P. F. MacNeilage (ed.), 189–216. New York: Springer. Ohno, Kazutoshi 2000 The lexical nature of Rendaku in Japanese. In Japanese/Korean Linguistics, Vol. 9, Mineharu Nakayama and Charles J. Quinn, Jr. (eds.), 151–164. Stanford: CSLI publications and Stanford Linguistics Association. 2002 Rules or lexicon: sticking to rules or giving them up. Presentation at the Second Conference on Formal Linguistics. June 22–23. Hunan University, Changsha, Hunan, China. forthc. Analogy: guessable rules – Towards a better understanding of the rendaku phenomenon. In Proceedings of LP2002, Shosuke Haraguchi, Bohumil Palek and Osamu Fujimura (eds.).


Okumura, Mitsuo 1955 Rendaku. In Kokugogaku jiten, Kokugo Gakkai (ed.), 916–961. Tokyo: Tkyd. Ono, Masahiro 1995 Kindai no moji [Characters in Kindai (in and after 1338)]. In Gaisetu Nihongo no rekishi [Survey of the history of Japanese], Takeyoshi Sat (ed.), 42–83. Tokyo: Asakura. no, Susumu 1947–48 Nihonshoki no jion-gana ni okeru seidaku hyouki ni tuite. [On the sei-daku representations by kana of the on reading in Nihonshoki]. Kokugo to Kokubungaku [Japanese Language and Literature] 24 (11): 49–59 and 25(1): 43–50. 1953 Joudai Kana-dzukai no Kenkyuu. [Study of Kana Usage in Joodai (710–794)]. Tokyo: Iwanami. 1980 Nihongo no sekai 1: Nihongo no seiritsu [World of Japanese 1: Formation of Japanese]. Tokyo: Chkronsha. Oohashi, Junichi 2002 Tohoku hogen onsei no kenkyu [Study of the Sounds of Tohoku dialects]. Tokyo: Oufuu. Orgun, Cemil Orhan 1998 Cyclic and noncyclic phonological effects in a declarative grammar. In Yearbook of Morphology 1997, Geert E. Booij and Jaap van Marle (eds.), 179–218. Dordrecht: Kluwer. Otsu, Yukio 1980 Some aspects of rendaku in Japanese and related problems. In Theoretical issues in Japanese linguistics: MIT Working Papers in Linguistics 2, Yukio Otsu and Ann Farmer (eds.), 207–227. tsubo, Heiji 1977 Katakana, hiragana [Katakana and hiragana]. In Nihongo 8: Moji [Japanese 8: Characters], Susumu no and Takeshi Shibata (eds.), 249–299. Tokyo: Iwanami. Parker, Charles K. 1939 A dictionary of Japanese compound verbs. Tokyo: Maruzen. Pater, Joseph V. 1999 Austronesian nasal substitution and other NC effects. In The prosody-morphology interface, René W. J. Kager, Harry G. van der Hulst and Wim Zonneveld (eds.), 310 –343. Cambridge: Cambridge University Press. Pater, Joseph V. and Adam Werle 2001 Typology and variation in child consonant harmony. Proceedings of HILP 5: 119 –139. Pierrehumbert, Janet B. and Mary E. Beckman 1988 Japanese Tone Structure. Cambridge, Mass: MIT Press.

Ploch, Stefan 1999 Nasals on my mind: the phonetic and the cognitive approach to the phonology of nasality. Doctoral dissertation, School of Oriental and African Studies, University of London. Polivanov, Yevgeny D. 1928 Two kinds of musical accent of the Mie dialect in Nagasaki Prefecture. Studies on the Japanese Language (translated by S. Murayama 1976): 61. Port, Robert F., Jonathan M. Dalby and Michael L. O’Dell 1987 Evidence for mora timing in Japanese. Journal of the Acoustical Society of America 81 (5): 1574–1585. Poser, William J. 1990 Evidence for foot structure in Japanese. Language 66: 78–105. 2002 Japanese periphrastic verbs and noun incorporation. Ms., University of Pennsylvania. Prince, Alan S. 1998 Foundations of Optimality Theory; Current directions in Optimality Theory. In Handouts of lecture at the Phonology Forum 1998, Kobe University, September 1998. Phonological Studies 2, the Phonological Society of Japan (ed.). Tokyo: Kaitakusha. Prince, Alan S. and Paul Smolensky 1993 Optimality Theory: Constraint interaction in generative grammar. Ms., Rutgers University and University of Colorado. [Blackwell, Oxford, 2004.] Pulleyblank, Douglas G. 1997 Optimality Theory and features. Optimality Theory: An Overview, D. Archangeli and T. Langendoen (eds.), 59–101. Massachusetts, USA and Oxford, UK: Blackwell. 2003 Covert feature effects. WCCFL 22 Proceedings, Gina Garding and Mimu Tsujimura (eds.), 398–422. Somerville, Mass.: Cascadilla Press. Reinhart, Tanya M. 1976 The syntactic domain of anaphora. Doctoral dissertation, MIT. Rice, Keren D. 1993 A reexamination of the feature [sonorant]: The status of ‘sonorant obstruents’. Language 69: 308–344. 1997 Japanese NC clusters and the redundancy of postnasal voicing. Linguistic Inquiry 28: 541–551. 2003 Featural markedness in phonology: Variation. The Second Glot International State-of-the-Article Book. Lisa Cheng and Rint Sybesma (eds.), 389–429. Berlin/New York: Mouton de Gruyter. Rice, Keren D. and J. Peter Avery 1991 On the relationship between laterality and coronality. In Phonetics and Phonology 2. The Special Status of Coronals: Internal and External


Evidence, Carol Paradis and Jean-François Prunet (eds.), 101–124. San Diego: Academic Press. Sakuma, Kanae 1929 Nihon Onseigaku [Japanese Phonetics]. Tokyo: Kyobunsha. 1931 Word accent of the Kyoto dialect. Study of Sounds. Phonetic Society of Japan. Sakurai, Shigeharu 1966 Kytsgo no hatsuon de chi subeki kotogara. In Nihongo hatsuon akusento jiten, Nihon Hs Kykai (ed.), 31–43. Tokyo: Nihon Hs Shuppan Kykai. 1985 Kyootsuu-go no hatsuon de chuui subeki kotogara [Notes on the pronunciation of Standard Japanese]. In Japanese Pronunciation and Accent Dictionary, Appendix to NHK (ed.), 128–143. Tokyo: NHK Publication. Sanada, Shinji 1981 Chiiki to no kakawari: Koutsuu to tsuushin no gairaigo [Foreign words of transportation and communication]. In Eibei Gairaigo no Sekai [The world of loanwords from the English language], Yoshifumi Hida (ed.). Tokyo: Nan’undou. 1991 Hyoujungo wa Ikani Seiritsu sitaka [How did Standard Japanese get established?]. 176–198. Tokyo: Soutakusya. Sato, Hirokazu 1989 Hukugoogo ni okeru akusento kisoku to rendaku kisoku [Accent and rendaku rules in compounds]. In Nihongo no Onsei On’in [The Phonetics and Phonology of Japanese], Miyoko Sugito (ed.), 233–265. Tokyo: Meiji Shoin. 2002 Decomposition into syllable complexes and the accenting of Japanese borrowed words. Journal of the Phonetic Society of Japan 7(1): 67–78. Sato, Ryoichi 2002 Gendai nihongo no hatsuon bumpu [Pronunciation distribution of Present-day Japanese]. Gendai nihongo kza [Lectures on Presentday Japanese] Vol. 3: Hatsuon [Pronunciation], 20–39. Tokyo: Meiji Shoin. Sat, Takeyoshi (ed.) 1995 Gaisetu Nihongo no rekishi [Survey of the History of Japanese]. Tokyo: Asakura. Sato, Yumiko 1993 The durations of syllable-final nasals and the mora hypothesis in Japanese. Phonetica 50: 44–67. Schane, Sanford A. 1984 Fundamentals of particle phonology. Phonology Yearbook 1: 129–155. 1995 Diphthongisation in particle phonology. In Handbook of Phonological Theory, John A. 
Goldsmith (ed.), 586–605. Oxford: Blackwell.

Shibata, Takeshi 1962 On’in [Phonology]. Hogengaku gaisetsu [Phonology, General survey of dialectology]. Tokyo: Musashino Shoin. Shibatani, Masayoshi 1990 The Languages of Japan. Cambridge: Cambridge University Press. Shikano, Kiyohiro, Katsuteru Ito, Tatsuya Kawahara, Kazuya Takeda and Mikio Yamamoto (eds.) 2001 Speech Recognition Systems. Tokyo: Ohmsha. Shimizu, Katsumasa 1977 Voicing features in the perception and production of stop consonants by Japanese speakers. Studia Phonologica 11: 25–34. Shinohara, Shigeko 1997 The roles of the syllable and the mora in Japanese adaptations of French words. Cahiers de Linguistique Asie Orientale 25 (1): 87–112. Paris: CRLAO EHESS. Siegel, Dorothy C. 1974 Topics in English phonology. Doctoral dissertation, MIT. Smolensky, Paul 1994 Harmony, markedness, and phonological activity. Ms., Johns Hopkins University (Available on the Rutgers Optimality Archive, ROA-37). 1995 On the structure of the constraint component Con of UG. Handout for talk at UCLA. 1997 Constraint interaction in generative grammar II: Local conjunction. Paper presented at the Hopkins Optimality Theory Workshop/University of Maryland Mayfest, May 8–12, 1997. Steriade, Donca 1995 Underspecification and markedness. In The Handbook of Phonological Theory, John A. Goldsmith (ed.), 114–174. Oxford: Blackwell. Strik, Helmer and Catia Cucchiarini 1999 Modeling pronunciation variation for ASR: A survey of the literature. Speech Communication 29: 225–246. Sugito, Miyoko 1965 Shibata-san to Imada-san: Tango-no chookakuteki benbetsu ni tsuiteno ichi koosatsu [Mr. Shiba-ta and Mr. Ima-da: A study in the auditory differentiation of words], Gengo Seikatsu 165 [S40-6], 64–72 (Reproduced in Miyoko Sugito 1998, Nihongo Onsei no Kenkyu [Studies on Japanese Sounds]. Izumi-Shoin, Vol. 6: 3–15.) 1969 Akusento no aru museika boin [A study on accented voiceless vowels]. The Bulletin of the Phonetic Society of Japan 132: 1–3.
1969/70 Measurements of tone movement of vowels and hearing validity in relation to accent in Japanese. Studia Phonologica 5: 1–19. University of Kyoto.

1982 Nihongo Akusento no Kenkyuu [Studies on Japanese Accent]. Tokyo: Sanseido. 2000 SUGI SpeechAnalyzer. Yokohama: Fujitsu Animo. 2003 Timing relationships between prosodic and segmental control in Osaka Japanese word accent. Phonetica 60: 1–16. Sugito, Miyoko and Hajime Hirose 1978 An electromyographic study of the Kinki accent. Annual Bulletin of the Research Institute of Logopedics and Phoniatrics 12: 35–51. 1988 Production and perception of accented devoiced vowels in Japanese, Annual Bulletin of the Research Institute of Logopedics and Phoniatrics 22: 19–37. University of Tokyo. Susman, Amelia L. 1943 The accentual system of Winnebago. Doctoral dissertation, Columbia University, New York. Suzuki, Takao 1990 Nihongo to gaikokugo [The Japanese language and foreign languages]. Tokyo: Iwanami. Takagi, Ichinosuke, Tomohide Gomi and Susumu Ōno (annotators) 1960 Manyooshuu 3 [Man’yōshū 3]. Tokyo: Iwanami. Takayama, Michiaki 1992a Rendaku to renjoudaku [Rendaku and ‘sandhi voicing’]. Kuntengo to kuntenshiryou 88: 115–124. 1992b Sei daku shoukou [Brief thoughts on sei-daku]. In Nihongo ronkyuu 2: Koten Nihongo to Jisho [Discussion on Japanese 2: Classic Japanese and dictionaries], Tajima, Ikudo and Kazuya Niwa (eds.), 17–56. Osaka: Izumi shoin. 1993 Sokuon no atono dakuon [Voiced geminates]. Shimadai kokubun 21 [Bulletin of Japanese Literature of Shimane University 21], the Japanese Literature Society of Shimane University (ed.), 48–55. Takayama, Michiaki 2002 Lecture notes (revised from Takayama 2000, “Nihongo on’inshi no hōhō” [Methodology of Japanese historical phonology], Nihongogaku 19–11). Ms., Kyushu University. Takayama, Tomoaki 1993 Hasatsu-on to masatsu-on no gōryū to daku-shiin no henka: iwayuru ‘Yotsugana’ gōryū no rekishiteki ichizuke [Merger of affricates and fricatives and sound change of voiced obstruents: historical view of the so-called ‘Yotsugana’ merger]. Kokugo Kokubun 62 (4): 18–30. 1999 Shakuyougo no Rendaku/Kou’onka ni tsuite (1) [On Rendaku in Loanwords].
Report of the Special Research Project for the Typological Investigation of Language and Cultures of the East and West 1999, Part II, 375–385. Tsukuba: Tsukuba University.

Takeda, Kazuya and Hisao Kuwabara 1987 Boin museika no youin bunseki to yosoku syuhou no kentou [Analysis and prediction of devocalizing phenomena]. Proceedings of the 1987 Autumn Meeting of the Acoustical Society of Japan 1: 105–106. Tamamura, Fumio 1989 Gokei [Word form]. In Nihongo no goi imi [Words and meaning in Japanese], Fumio Tamamura (ed.), 23–51. Tokyo: Meiji shoin. Tanaka, Makirō 1995 Kodai no buntai, bunshou [Style and writing in Kodai (before 1338)]. In Gaisetu Nihongo no Rekishi [Survey of the History of Japanese], Takeyoshi Satō (ed.), 190–206. Tokyo: Asakura. Tanaka, Shin-ichi 1992 Accentuation and prosodic constituenthood in Japanese. Tokyo Linguistic Forum 5: 195–216. 2001 The emergence of the ‘unaccented’: Possible patterns and variations in Japanese compound accentuation. In Issues in Japanese Phonology and Morphology, Jeroen M. van de Weijer and Tetsuo Nishihara (eds.), 159–192. Berlin/New York: Mouton de Gruyter. 2002a An OT-based integrated model of accent and accent shift phenomena in Japanese. Phonological Studies 5, the Phonological Society of Japan (ed.), 99–104. Tokyo: Kaitakusha. 2002b Three reasons for favoring constraint reranking over multiple faithfulness. In A Comprehensive Study on the Phonological Structure of Languages and Phonological Theory, Shosuke Haraguchi (ed.), 121–130. Technical Report of Basic Sciences (A)(1), Grant-in-Aid for Scientific Research by the Japan Society for the Promotion of Science. 2003a Review of Eric Robert Rosen, 2001, Phonological Processes Interacting with the Lexicon: Variable and Non-Regular Effects in Japanese Phonology. GLOT International. 2003b Japanese grammar in the general theory of prominence: Its conceptual basis, diachronic change, and acquisition. In A New Century of Phonology and Phonological Theory: A Festschrift for Professor Shosuke Haraguchi on the Occasion of His Sixtieth Birthday, Takeru Honma, Masao Okazaki, Toshiyuki Tabata and Shin-ichi Tanaka (eds.).
Tokyo: Kaitakusha. 2005 Accent and Rhythm: From the Basics of Phonology to Optimality Theory. Tokyo: Kenkyusha. Tanaka, Shin’ichi and Haruo Kubozono 1999 Nihongo no Hatsuon Kyooshitsu [Introduction to Japanese Pronunciation]. Tokyo: Kurosio Publishers.


Tateishi, Koichi 2001 On’in jisho kurasu seeyaku no bunpu ni tsuite [On the distribution of constraints for phonological sub-lexica]. Paper presented at the 26th Annual Meeting of the Kansai Linguistic Society, Ryukoku University, Kyoto. 2002 Lexical stratification theories and (un)markedness. paper presented at LP 2002, Meikai University, September 3, 2002. 2003 Phonological patterns and lexical strata. In Proceedings of CIL 17, E. Hajicova, A. Kotesovcova and J. Mirovsky (eds.). Prague: Matfyzpress, MFF UK. Tj, Misao (ed.) 1954 Nihon Hoogengaku [Japanese Dialectology]. Tokyo: Yoshikawa Koubunkan. Tsujimura, Natsuko 1996 An Introduction to Japanese Linguistics. Oxford: Blackwell. Tsukishima, Hiroshi 1972 Kodai no moji [Characters in Kodai] (approximately 8c–11c, in this book). In Kooza kokugo-shi 2: On’in-shi – Moji-shi [History of sounds and characters], Norio Nakata (ed.), 311–444. Tokyo: Taishkan. Tsuru, Hisashi 1960 Manyoushuu ni okeru shakkun-gana no seidaku hyouki [Sei-daku notations of kun reading kana in Man’ysh]. Man’y 36: 20–32. 1977 Manyougana [Man’y-gana]. In Nihongo 8: Moji [Characters], Susumu no and Takeshi Shibata (eds.). Tokyo: Iwanami. Unger, J. Marshall 1977 Studies in Early Japanese Morphophonemics. Bloomington: Indiana University Linguistics Club [Doctoral dissertation, Yale University, 1975]. Uwano, Zendo, Aizawa Masao, Kato Kazuo and Sawaki Motoei 1989 On’in sran [Survey of Phonology]. Nihon hgen dai jiten [Encyclopedia of Japanese dialects], Munakata Tokugawa (ed.), 1–77. Tokyo: Shogakukan. Vance, Timothy J. 1980 The psychological status of a constraint on Japanese consonant alternation. Linguistics 18: 245–267. 1983 On the origin of voicing alternation in Japanese consonants. Journal of the American Oriental Society 102: 333–341. 1987 An Introduction to Japanese Phonology. Albany: State University of New York Press. 1992 Lexical phonology and Japanese vowel devoicing. In The Joy of Grammar, Brentari et al. (eds.). Amsterdam: John Benjamins.

1996 Sequential voicing in Sino-Japanese. Journal of the Association of Teachers of Japanese 30: 22–43. 2002 Semantic bifurcation in Japanese compound verbs. Japanese/Korean Linguistics 10, Noriko M. Akatsuka and Susan Strauss (eds.), 365–377. Stanford: CSLI. Varden, J. Kevin 1998 On high vowel devoicing in Standard Modern Japanese: Implications for current phonological theory. Doctoral dissertation, University of Washington. Venditti, Jennifer J. and Jan P.H. van Santen 1998 Modeling vowel duration for Japanese text-to-speech synthesis. Proceedings of the 5th International Conference on Spoken Language Processing (ICSLP98), Sydney: 2043–2046. Wada, Minoru 1947 A view patterns and marks of Japanese accent (in Japanese). Language; Autumn Issue 2: 29–44. Wenck, Günther 1959 Japanische Phonetik, Volume 4. Wiesbaden: Otto Harrassowitz. Weijer, Jeroen M. van de 1996 Segmental Structure and Complex Segments. Tübingen: Niemeyer. Wheeler, Max W. 1979 Phonology of Catalan. Oxford: Basil Blackwell. Yamada, Eiji 1990 Stress assignment in Tokyo Japanese: Stress shift and stress in suffixation. Fukuoka University Review of Literature and Humanities 22: 97–154. Yamaguchi, Yoshinori 1988 Kodaigo no Fukugougo ni kansuru Ichi Kousatsu [On Compounds in Ancient Japanese]. Nihongogaku 7 (5): 4–12. Tokyo: Meiji Shoin. Yamane, Noriko 2003 Chain shifts in intervocalic obstruents in Japanese. In A New Century of Phonology and Phonological Theory: A Festschrift for Professor Shosuke Haraguchi on the Occasion of His Sixtieth Birthday, Takeru Honma, Masao Okazaki, Toshiyuki Tabata and Shin-ichi Tanaka (eds.), 121–139. Tokyo: Kaitakusha. Yamane, Noriko and Shin-ichi Tanaka 2002 Gravitation and reranking algorithm: Toward a theory of diachronic change in grammar. On’in Kenkyuu [Phonological Studies] 5, the Phonological Society of Japan (ed.), 135–140. Tokyo: Kaitakusha. Yanagita, Kunio 1930 Kagyū-kō. Tokyo: Tōkō Shoin.


Yip, Moira J. W. 1980 The tonal phonology of Chinese. Doctoral dissertation, MIT. Yokotani, Teruo 1997 Accent shift beyond the foot boundary: Evidence from Tokyo Japanese compound nouns. Journal of the Phonetic Society of Japan 1(1): 54–62. Yoshida, Natsuya 2002 The effect of phonetic environment on vowel devoicing in Japanese. Kokugogaku [Japanese Linguistics] 53 (3): 34–47. Yoshida, Natsuya and Yoshinori Sagisaka 1990 Boin museika no youin bunseki [Factor analysis of vowel devoicing]. Technical Report of ATR Interpreting Telephony Research Laboratories (TR-I-0159). Yoshida, Shohei 1991 Some aspects of governing relations in Japanese phonology. Doctoral dissertation, School of Oriental and African Studies, University of London. 1996 Phonological Government in Japanese. Canberra: The Australian National University. Yoshida, Yuko Z. 1995 On pitch accent phenomena in Standard Japanese. Doctoral dissertation, School of Oriental and African Studies, University of London. [Published in 1999 by Holland Academic Graphics. The Hague.] Yoshioka, Hirohide 1981 Laryngeal adjustments in the production of the fricative consonants and devoiced vowels in Japanese. Phonetica 38: 236–251. Young, Steve J., Joop Jansen, Julian J. Odell, Dave Ollason and Phil C. Woodland 1999 The HTK Handbook. Entropic Research Laboratories. Zamma, Hideki 1999 Affixation and phonological phenomena: From Lexical Phonology to Lexical Specification Theory. On’in Kenkyuu [Phonological Studies] 2, the Phonological Society of Japan (ed.), 69–76. Tokyo: Kaitakusha. 2001 Accentuation of person names in Japanese and its theoretical implications. Tsukuba English Studies 20: 1–18. 2003 Suffixes and Stress/Accent Assignment in English and Japanese: More Than a Simple Dichotomy. On-line proceedings of Linguistics and Phonetics 2002 (LP2002), Meikai University, Tokyo. [http:// www.adn.nu/~ad31175/lp2002/lp2002main.htm].

Index of authors

Abramson, Arthur S. and Leigh Lisker, 72 Akinaga, Kazue, 261, 279 Anderson, John M. and Charles Jones, 74 Anderson, John M. and Colin J. Ewen, 74 Anonymous, 54 Archangeli, Diana B. and Douglas G. Pulleyblank, 144, 150n9 Anttila, Arto T., 204, 275, 277n1 Anttila, Arto T. and Cho Young-mee Yu, 123, 151n18, 275 Avery, J. Peter, 25 Avery, J. Peter and William J. Idsardi, 25, 28, 36f, 45n7

Backley, Phillip, 87n5, 87n6 Backley, Phillip and Takahashi Toyomi, 87n6 Bao, Zhiming, 77, 79 Beckman, Jill N., 137, 263 Beckman, Mary E., 232 Bell, Alan E., 229 Benua, Laura H., 174 Bird, Steven G., 191 Bloch, Bernard, 91, 102n11, 102n13, 103n18 Blumstein, Sheila E., William E. Cooper, Harold Goodglass, Sheila Statlender and Jonathan Gottlieb, 83 Boersma, Paul P. G., 204

Cabrera-Abreu, Mercedes, 85 Calabrese, Andrea, 25, 28 Campbell, Nick and Sagisaka Yoshinori, 232 Charette, Monik and Asli Göksel, 86n3 Cho, Young-mee Yu, 123 Choi, Kyung-Ae, 67n42 Chomsky, A. Noam and Morris Halle, 14, 74, 77 Clements, George N., 14, 25, 28, 277n5 Clements, George N. and Susan R. Hertz, 77 Cohn, Abigail C., 82 Coleman, John S. and Janet B. Pierrehumbert, 204 Cremelie, Nick and Jean-Pierre Martens, 198

Dauer, Rebecca M., 229 de Lacy, Paul V., 143, 151n16, 262, 263

Endō, Kunimoto, 57f, 60, 66n36, 67n45 Flemming, Edward S., 77 Frisch, Stefan A., 204 Fujimoto, Masako, 223 Fukada, Toshiaki, Takayoshi Yoshimura and Yoshinori Sagisaka, 198 Fukazawa, Haruka, 120n1 Fukazawa, Haruka and Mafuyu Kitahara, 3, 27, 105ff, 113, 115, 120, 121n10 Fukazawa, Haruka, Mafuyu Kitahara, and Mitsuhiko Ota, 105f, 112f, 177 Fukuda, Suzy E. and Shinji Fukuda, 5ff, 22 Fukui, Seiji, 21 Fukushima, Kunimichi, 67n44

Gandour, Jackson T. and Rochana Dardarananda, 83 Grootaers, Willem A., 186

Halle, Morris, 268 Halle, Morris and Kenneth N. Stevens, 74, 77, 79 Halle, Morris and Jean-Roger Vergnaud, 266 Hamada, Atsushi, 37, 51, 55, 57, 63n2, 65n28, 66n34, 66n40, 67n42, 68n46, 68n50, 90, 101n6 Hamano, Shoko, 123, 140, 142, 148, 152n23 Han, Mieko S., 205, 232 Hansson, Gunnar Ólafur, 151n11 Haraguchi, Shosuke, 2, 102n16, 121n3, 189n10, 261, 266, 269 Harris, John K. M., 71, 74, 77ff, 83, 85, 87n5 Harris, John K. M. and Geoffrey A. Lindsey, 71, 74, 76, 77, 82, 83, 87n5 Hasegawa, Kiyoshi, Katsuaki Horiuchi, Tsutomu Momozawa, and Saburo Yamamura, 93, 98 Hashimoto, Shinkichi, 101n7, 123, 129 Hattori, Noriko, 211, 240 Hattori, Shiro, 248 Hayata, Teruhiro, 69n56, 140, 142 Hayes, Bruce P., 245n2, 262, 263, 265, 266, 267, 268, 277n1 Hayes, Bruce P. and Donca Steriade, 152n21 Hepburn, James Curtis, 101n8 Hibiya, Junko, 123 Hinskens, Frans and Jeroen M. van de Weijer, 143 Hirayama, Manami, 32, 33 Hirayama, Teruo, Ichiro Oshima, Makio Ono, Makoto Kuno, Mariko Kuno and Takao Sugimura, 123, 152n22 Hirose, H. et al., 14 Honda, Kiyoshi, Hiroyuki Hirai, Shinobu Masaki and Yasuhiro Shimada, 14, 259

Huang, Xuedong, Alex Acero and Hsian-Wuen Hon, 194, 200 Hulst, Harry G. van der, 74, 86n3 Hume, Elizabeth V. and Georgios Tserdanelis, 151n16 Hyman, Larry M., 229 Hwang, Mei-Yuh and Xuedong Huang, 193 Ide, Itaru, 66n31 Iida, Takesato, 65n27 Imaizumi, Satoshi, Akiko Hayashi, and Toshisada Deguchi, 226 Inkelas, Sharon, 175 Inoue, Fumio, 123, 149n3, 149n5, 152n23 Inoue, Michiyasu, 66n31 Ishizuka, Tatsumaro, 9, 64n10 Itô, Junko, 20, 21, 239 Itô, Junko and Ralf-Armin Mester, 5, 9, 17f, 24n8, 25, 26, 27f, 29ff, 35ff, 39, 40, 41, 42, 44n1, 44n2, 45n3, 45n5, 45n8, 74f, 81, 105ff, 111f, 120n2, 121n5, 123, 124, 131ff, 137, 140, 142, 150n9, 152n21, 174, 177, 189n10 Itô, Junko, Yoshihisa Kitagawa and Ralf-Armin Mester, 22 Itô, Junko, Ralf-Armin Mester, and Jaye E. Padgett, 22, 25, 27ff, 35f, 41, 45n6, 74f, 76, 146 Itoh, Motonobu, Itaru F. Tatsumi, and Sumiko Sasanuma, 83 Iwabuchi, Etsutarō, 67n40 Iwanami Shoten Henshūbu, 94 Jakobson, Roman, 82 Jakobson, Roman, C. Gunnar M. Fant and Morris Halle, 74 Jessen, Michael and Catherine O. Ringen, 74 Jōdaigo Jiten Henshū Iinkai, 101n7 Jun, Sun-Ah, 229 Jun, Sun-Ah and Mary E. Beckman, 229

Jurafsky, Daniel and James H. Martin, 194 Kager, René W. J., 137, 150n9 Kamei, Takashi, 53, 56 Kamei, Takashi, Rokuro Kono and Eiichi Chino, 123, 152n22, 152n24 Kasuga, Kazuo, 49, 64n12 Kawai, Mieko, 103n23 Kawakami, Shin, 248 Kawasaki, Takako, 36 Kaye, Jonathan D., Jean Lowenstamm and Jean-Roger Vergnaud, 71, 74 Kazama, Rikizō, 93 Kenstowicz, Michael J., 245n2, 262 Kess, Joseph E. and Tadao Miyamoto, 42f Kikuchi, Hideaki and Kikuo Maekawa, 207 Kikuda, Norio, 98 Kindaichi, Haruhiko, Ooki Hayashi and Takesi Sibata, 9 Kindaichi, Kyosuke, 123, 129 Kiparsky, R. Paul V., 41, 174 Kitahara, Mafuyu, 248 Kitahara, Yasuo, 94 Kiyose, Gisaburō N., 100n1 Kohler, Klaus J., 74, 229 Komatsu, Hideo, 53, 59, 65n23, 66n32, 66n35, 68n45 Kondo, Mariko, 4, 216, 226, 229ff, 230, 241, 244, 245n1, 245n2, 273 Kubozono, Haruo, 1, 2, 5ff, 11, 13, 14, 22, 44n2, 121n3, 157, 160, 173, 175n2, 189 Kula, Nancy Chongo and Lutz Marten, 86n3 Kuno, Susumu, 102n11 Kuroda, Shige-Yuki, 5, 25, 36, 45n8 Kuwabara, Hisao and Kazuya Takeda, 230 Labrune, Laurence, 25, 26, 28, 30, 44n2


Lange, Roland A., 101n7 Liberman, Mark Y. and Alan S. Prince, 14 Lisker, Leigh and Arthur S. Abramson, 72 Lombardi, Linda, 74, 76 Lyman, Benjamin S., 44n2 Mabuchi, Kazuo, 68n48 Maddieson, Ian, 142 Maddieson, Ian and Peter N. Ladefoged, 125, 149n4 Maeda, Hiroyuki, 148 Maekawa, Kikuo, 4, 215f, 218, 225, 226, 230, 248 Maekawa, Kikuo and Hideaki Kikuchi, 4, 205ff, 230, 238 Maekawa, Kikuo, Hanae Koiso, Sadaoki Furui and Hitoshi Isahara, 207 Maekawa, Kikuo, Hideaki Kikuchi, Yosuke Igarashi and Jennifer J. Venditti, 207 Marten, Lutz, 86n3 Martin, Samuel E., 25, 27, 29, 31, 32, 38f, 40, 43, 44n1, 44n2, 45n4, 45n5, 89, 92, 93, 101n7, 103n21 Maruyama, Rinpei, 53, 65n25 Mathias, Gerald B., 101n7 Matsui, F. Michinao, 248 Matsumoto, Takashi, 58 Matsumura, Akira, 94, 98 Matthews, Peter H., 102n15 McCarthy, John J., 274, 278n12 McCarthy, John J. and Alan S. Prince, 113, 123, 150n9 McCawley, James D., 27, 38f, 40, 44n1, 45n5, 102n16, 240 McGarrity, Laura W., 277n2 Mielke, Jeffrey, 143 Miller, Roy A., 101n4, 101n7, 151n13 Miyake, Marc H., 101n3 Mori, Hiromichi, 66n31 Murayama, Tadashige, 161

Nakagawa, Yoshio, 187 Nakata, Norio and Hiroshi Tsukishima, 51, 52 Napoli, Donna J. and Marina A. Nespor, 14 Nasu, Akio, 19, 151n20 Nasukawa, Kuniya, 71ff, 75, 77, 82, 83ff, 87n5, 87n6, 87n7 Nihon Daijiten Kankōkai, 101n4 Nihon Hoso Kyokai [NHK], 205, 247, 261 Nishihara, Tetsuo, 3, 151n15 Nishimiya, Kazutami, 49, 64n12 Ogura, Sinpei, 44n2 Ohala, John J., 150n9 Ohno, Kazutoshi, 2, 3, 5, 15, 28, 31, 32, 33, 34, 37, 44n2, 47ff, 100n1, 101n8, 121n3 Okumura, Mitsuo, 92, 97 Ono, Masahiro, 53 Ōno, Susumu, 49, 64n12 Oohashi, Junichi, 123, 125, 149n4 Orgun, Cemil Orhan, 175 Otsu, Yukio, 5, 11f, 44n2, 45n3 Ōtsubo, Heiji, 50, 51 Parker, Charles K., 32, 33 Pater, Joseph V., 34, 45n6 Pater, Joseph V. and Adam Werle, 151n17 Pierrehumbert, Janet B. and Mary E. Beckman, 103n6 Piggott, Glyne L., 25 Ploch, Stefan, 77, 86n4, 87n8 Polivanov, Yevgeny D., 248 Port, Robert F., Jonathan M. Dalby and Michael L. O’Dell, 232 Poser, William J., 31, 32, 42, 278n9 Prince, Alan S., 131 Prince, Alan S. and Paul Smolensky, 121n8, 123, 125, 131, 262, 265, 277n1, 278n11

Pulleyblank, Douglas G., 144ff, 151n19 Reinhart, Tanya M., 23n4 Rice, Keren D., 3, 25ff, 26, 28, 29ff, 36ff, 40, 41, 75, 121n3, 131, 190n13 Rice, Keren D. and J. Peter Avery, 25 Rodriguez, João, 60, 66n40, 67n43, 67n44, 68n47 Sakuma, Kanae, 216, 221, 248 Sakurai, Shigeharu, 103n17, 230, 240 Sanada, Shinji, 186f Sato, Hirokazu, 12, 22, 157 Sato, Ryoichi, 150n7 Sato, Yumiko, 232 Schane, Sanford A., 74 Shibata, Takeshi, 149n4 Shibatani, Masayoshi, 101n7, 151n12 Shikano, Kiyohiro, Katsuteru Ito, Tatsuya Kawahara, Kazuya Takeda and Mikio Yamamoto, 194 Shimizu, Katsumasa, 73, 80 Shinohara, Shigeko, 239 Siegel, Dorothy C., 174 Smolensky, Paul, 121n6 Steriade, Donca, 34 Strik, Helmer and Catia Cucchiarini, 198, 204 Sugito, Miyoko, 4, 8ff, 23n2, 23n3, 157ff, 247ff, 248, 249, 259, 261 Sugito, Miyoko and Hajime Hirose, 241, 247, 249 Susman, Amelia L., 267 Suzuki, Keiichiro, 2, 191ff Suzuki, Takao, 121n7 Takagi, Ichinosuke, Tomohide Gomi, and Susumu Ōno, 54f, 59 Takayama, Michiaki, 60f, 68n50, 69n56, 123, 142, 147, 148, 152n22 Takayama, Tomoaki, 3, 63, 110, 120n2, 121n3, 123, 140, 151n10, 151n14

Takeda, Kazuya and Hisao Kuwabara, 205, 206, 223, 230 Tamamura, Fumio, 2 Tanaka, Makirō, 66n32 Tanaka, Shin-ichi, 4, 157, 175n8, 261ff, 268, 269, 270, 273, 274, 277n1, 278n9, 278n11 Tanaka, Shin’ichi and Haruo Kubozono, 23n5 Tateishi, Koichi, 106f, 109ff, 119, 120n2, 121n4, 121n7, 121n8 Tōjō, Misao, 149n1 Tsujimura, Natsuko, 81 Tsukishima, Hiroshi, 64n10 Tsuru, Hisashi, 49, 51, 64n8, 64n11, 64n12, 65n19 Unger, J. Marshall, 90, 123, 129 Uwano, Zendo, Masao Aizawa, Kazuo Kato, and Motoei Sawaki, 123 Vance, Timothy J., 2, 25, 26, 27, 29ff, 35, 40, 42f, 44n1, 44n2, 45n5, 45n9, 45n10, 59, 69n52, 81, 89ff, 90, 91, 93, 94f, 100n2, 101n6, 101n8, 102n12, 103n17, 103n22, 123, 129, 150n9, 189n5, 240 Varden, J. Kevin, 45n10


Venditti, Jennifer and Jan P. H. van Santen, 210 Wada, Minoru, 247 Wenck, Günther, 140 Weijer, Jeroen M. van de, 150n8 Wheeler, Max W., 229 Yamada, Eiji, 261 Yamaguchi, Yoshinori, 189n10 Yamane-Tanaka, Noriko, 3, 69n57, 123ff, 135, 149n2 Yamane, Noriko and Shin-ichi Tanaka, 123 Yanagita, Kunio, 123, 128, 129, 151n12 Yip, Moira J. W., 77, 79 Yokotani, Teruo, 261 Yoshida, Natsuya, 223 Yoshida, Natsuya and Yoshinori Sagisaka, 206, 223, 230 Yoshida, Shohei, 75, 86n1 Yoshida, Yuko Z., 85 Yoshioka, Hirohide, 238 Young, Steve J., Joop Jansen, Julian J. Odell, Dave Ollason and Phil C. Woodland, 207 Zamma, Hideki, 4, 157ff, 173

Index of languages

Bantu languages, 84, 85 Burmese, 73 Campa (Arawak), 80 Catalan, 79 Chinese, 3, 14, 42, 48ff, 180, 188, 189n1, 264 Classical, 180 Japanized, 64n6 Mandarin, 63n4 Dutch, 185 English, 14, 39ff, 44n1, 59, 72f, 78, 79, 80, 83, 84, 92, 93, 109, 114, 121n7, 179, 185, 264, 267f, 277n5

Ewe, 14

Finnish, 73, 78, 84 French, 72f

German, northern, 79 Germanic languages, 72 Greek, 39 Gujarati, 78

Hindi, 78 Indonesian languages, 85 Italian, 14 Japanese, Aomori, 124 common, 207 eastern, 69n54, 229 Kansai, 247f Kanto, 128 Kinki, 21, 128 Kochi, 139 Kyoto, 21, 128, 247 Kyushu, 61 literary, 42 Modern, 5, 9, 22, 42, 59, 60, 89, 129ff, 179, 184 Middle, 129ff, 148 Nara, 90, 128 Old, 9, 90ff, 123ff, 129ff Osaka, 21, 175, 247ff pre-old, 90, 102n10 Shikoku, 128 Standard, 207, 209, 229 Tohoku, 61, 69n57, 84, 123ff Tokushima, 147 Tokyo, 21, 23n2, 24n10, 69n54, 84, 102, 124, 139f, 150n9, 175, 209, 248 western, 69n54

Korean, 57, 67n42, 67n43

Latin, 39 Lithuanian, 266

Pirahã, 277n5 Polish, 72, 79 Portuguese, 57, 110, 178, 180

Quichea, 80, 84 Quileute, 84 Reef Island-Santa Cruz languages, 84 Romance languages, 72 Russian, 72, 178f Serbo-Croatian, 79 Siouan languages, 267 Slavic languages, 72 Spanish, 72f, 78, 80, 84 Swedish, 72f Thai, 73, 78, 80, 83, 84 Winnebago, 267 Zoque, 80, 84 Zuya-go, 22

Index of subjects

accent and sonority, 262ff accent and tone, 262ff accentedness, 157ff, 252 see also voicing and accent acquisition, 71, 80, 82ff, 86, 119f, 151n17 alternations, 45f aphasia, 71, 80, 82ff, 86 chronological continuum, 128ff compounds, 29ff coordinate compounds, 19, 81, 94ff corpus, 194, 196, 198, 199, 202, 205ff devoiced accented vowels, 247ff, 262, 270ff, 278n10 see also vowel devoicing durational reduction, 229ff, 251ff electromyography, 249, 255, 256, 258 Element Theory, 74ff faithfulness, 106ff, 130, 137, 274 relativized faithfulness, 106, 115 F0-contour, falling, 77, 240f, 248ff rising, 77, 258; see also pitch foreign words (loanwords, borrowings), 3, 21, 38, 39, 40ff, 45n8, 82, 100n1, 105ff, 149n5, 150n7, 177ff functional load, 59, 121n7

geminates, 17, 37, 149n4, 208, 215, 220, 224, 244 geographical continuum, 126ff

harmonic scales, 130ff, 261ff implicational relationships, 75, 78, 84f, 86, 123ff, 263ff, 267, 270, 276, 277n1

inflected words, 91ff intensity, 208, 231, 234ff, 244, 248, 264 laryngeal-source contrasts, 71ff lexicon, 2f, 4, 26, 35, 37, 38ff, 105ff, 132f, 177, 179, 183, 186, 189n1, 194, 196, 197f, 203 core-periphery structure, 105ff stratification, 2, 4, 26, 37ff, 48, 177, 188 see also Sino-Japanese, foreign words (loanwords, borrowings) Lyman’s Law, 2, 7ff, 22, 25ff, 81, 84, 86n2, 93, 108, 110, 121n11, 160, 278n7, 278n8 man’yougana, 48ff markedness, 9, 10, 15, 78, 105ff, 130, 131, 137, 142ff, 263, 278n12 minimal pairs, 126f, 140, 149n4, 185, 186 mora, 4, 8, 9, 11, 13, 15ff, 39, 95, 96, 103n22, 159ff, 205, 208, 217, 221, 223, 224f, 232ff, 239ff, 247ff, 266, 270, 274, 277n1, 278n7 moraic nasal, 66n35, 81, 95, 100n2, 101n6, 125, 126, 147, 149n4, 175n5, 208, 242 nasalization (of voiced obstruents), 63ff; see also prenasalization

nativization, 108, 180, 186, 187 *NT constraint, 35ff, 107ff, 190n13 numeral-classifier combinations, 23n5, 191ff

OCP, 7ff, 27, 113, 115, 160f, 165ff, 172f, 273, 275, 278n10 opacity, 261ff

optionality and directionality (of accent shift), 261ff physiological experiment, 238, 247ff pitch, 4, 13, 59, 85, 102n16, 205, 227, 236, 247ff, 264ff, 271 prenasalization, 3, 37, 81, 84, 85, 90, 123ff, 190n13 prosodic asymmetry, 13 rendaku, 1ff, 5ff, 25ff, 80f, 89ff, 108, 110, 113, 117, 121n3, 121n11, 157ff, 177ff, 269, 278n7 branching constraint on ~, 11ff, 22 mora constraint on ~, 15ff, 22 sei-daku distinction, 47ff sound values of ~, 54, 60, 61, 66n40 Sino-Japanese, 3, 16ff, 26, 33ff, 105, 107, 108, 149n5, 177ff, 268 vulgarized ~, 184, 187, 188, 190n12 speech rate, 221f, 226, 230, 232, 247ff speech style, 4, 206f, 226 speech recognition, 191ff, 207 Large Vocabulary Continuous Speech Recognition (LVCSR), 191ff Automatic Speech Recognition, 191ff specific language impairment, 5f

spontaneous speech, 205ff, 230 Sugito’s Law, 8ff, 23n2, 23n3, 158ff syllable structure, 4, 121n8, 215, 229ff sympathy (in OT), 263ff syncope, vowel ~, 90 UNIFORMITY constraint, 115ff universals, 75, 78, 86 voicing, postnasal, 3, 25ff, 76, 80f, 84, 110, 113, 121n7, 142, 147, 160, 174, 175n5, 180, 184 voice contrasts, 47ff, 76, 109f, 123, 140ff, 147 voicing and accent, 4, 33, 52, 59, 69n53, 157ff, 247ff, 261ff voicing and nasality, 47, 57ff, 66n40, 67n43, 67n44, 69n57, 77, 82, 83ff, 86, 86n4, 87n6, 123ff vowel devoicing, 1, 4, 71, 76, 79, 81, 82, 205ff, 229ff, 261, 262, 268, 271, 273 atypical environments, 217, 225ff consecutive devoicing, 4, 215ff, 224ff, 230f, 234ff, 241ff, 244, 268 manner interaction, 215, 219, 223ff vowel weakening, 231f, 234, 237, 244 word frequency, 218f writing system, 39, 42f, 45n9, 47ff