CONTINUITY IN LINGUISTIC SEMANTICS
LINGVISTICÆ INVESTIGATIONES: SUPPLEMENTA
Studies in French & General Linguistics / Etudes en Linguistique Française et Générale
This series has been established as a companion series to the periodical "LINGVISTICÆ INVESTIGATIONES", which started publication in 1977. It is published by the Laboratoire d'Automatique Documentaire et Linguistique du C.N.R.S. (Paris 7).
Series-Editors: Jean-Claude Chevalier (Université Paris VIII) Maurice Gross (Université Paris 7) Christian Leclère (L.A.D.L.)
Volume 19 Catherine Fuchs and Bernard Victorri (eds) Continuity in Linguistic Semantics
CONTINUITY IN LINGUISTIC SEMANTICS Edited by
CATHERINE FUCHS BERNARD VICTORRI Université de Caen, France
JOHN BENJAMINS PUBLISHING COMPANY AMSTERDAM/PHILADELPHIA
The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences — Permanence of Paper for Printed Library Materials, ANSI Z39.48-1984.
Library of Congress Cataloging-in-Publication Data
Continuity in linguistic semantics / edited by Catherine Fuchs, Bernard Victorri.
p. cm. -- (Linguisticae investigationes. Supplementa ISSN 0165-7569 ; v. 19)
Includes bibliographical references and index.
Contents: The limits of continuity : discreteness in cognitive semantics / Ronald Langacker -- Continuity and modality / Antoine Culioli -- Continuum in cognition and continuum in language / Hans-Jakob Seiler -- Is there continuity in syntax? / Pierre Le Goffic -- The use of computer corpora in the textual demonstrability of gradience in linguistic categories / Geoffrey Leech, Brian Francis & Xunfeng Xu -- A "continuous definition" of polysemous items / Jacqueline Picoche -- The challenges of continuity for a linguistic approach to semantics / Catherine Fuchs -- What kind of models do we need for the simulation of understanding? / Daniel Kayser -- Continuity, cognition, and linguistics / Jean-Michel Salanskis -- Reflections on Hansjakob Seiler's continuum / René Thom -- Attractor syntax / Jean Petitot -- A discrete approach based on logic simulating continuity in lexical semantics / Violaine Prince -- Coarse coding and the lexicon / Catherine L. Harris -- Continuity, polysemy, and representation: understanding the verb "cut" / David Touretzky -- The use of continuity in modelling semantic phenomena / Bernard Victorri.
1. Semantics. 2. Continuity. 3. Linguistic models. I. Fuchs, Catherine. II. Victorri, Bernard. III. Series.
P325.C57 1994 401'.43--dc20 94-38916
ISBN 90 272 3128 1 (Eur.) / 1-55619-259-2 (US) (alk. paper) CIP
© Copyright 1994 - John Benjamins B.V.
No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher.
John Benjamins Publishing Co. • P.O.Box 75577 • 1070 AN Amsterdam • The Netherlands
John Benjamins North America • P.O.Box 27519 • Philadelphia, PA 19118 • USA
CONTENTS

Preface

PART I : LINGUISTIC ISSUES
Ronald Langacker : The limits of continuity : discreteness in cognitive semantics
Antoine Culioli : Continuity and modality
Hans-Jakob Seiler : Continuum in cognition and continuum in language
Pierre Le Goffic : Is there continuity in syntax ?
Geoffrey Leech, Brian Francis & Xunfeng Xu : The use of computer corpora in the textual demonstrability of gradience in linguistic categories
Jacqueline Picoche : A "continuous definition" of polysemous items : its basis, resources and limits
Catherine Fuchs : The challenges of continuity for a linguistic approach to semantics

PART II : MODELLING ISSUES
Daniel Kayser : What kind of models do we need for the simulation of understanding ?
Jean-Michel Salanskis : Continuity, cognition and linguistics
René Thom : Reflections on Hansjakob Seiler's continuum
Jean Petitot : Attractor syntax : morphodynamics and cognitive grammar
Violaine Prince : A discrete approach based on logic simulating continuity in lexical semantics
Catherine L. Harris : Coarse coding and the lexicon
David Touretzky : Continuity, polysemy and representation : understanding the verb 'cut'
Bernard Victorri : The use of continuity in modelling semantic phenomena
PREFACE

Until recently, most linguistic theories avoided recourse to the notion of continuity. Structuralism, which developed within a problematic of purely discrete relations of commutation (paradigmatic relations) and distribution (syntagmatic relations) among forms, contributed to the marginalisation of all phenomena outside such a discrete framework. This tendency was accentuated by transformational-generative grammars, two essential characteristics of which tend towards the elimination of continuity : on the one hand, the priority accorded to syntax, with its emphasis on categorisation (which leads to compositionality as far as semantics is concerned) and on the other, the utilisation of an algebraic formalism which sanctions the discrete character of these models.

It should be noted that contemporary theories of cognition also resort massively to discrete formalisms, partly under the combined influence of linguistic theories and of symbolic artificial intelligence. Thus the Gestalt theory seems to be almost forgotten today, despite the fact that, particularly as regards the psychology of perception, it had completely transformed the paradigms of research, by opposing to the then prevailing associationist ideas a point of view closer to the theory of dynamic systems (inspired by the field theory of physics), in which the relation between the whole and its parts is the result of more global interactions, irreducible to discrete combinations of elementary sensations.

Nevertheless, the predominance of discrete approaches does not mean that continuity has been completely absent from the scene. In linguistics, in the heyday of structuralism, the psychomechanic theory of Gustave Guillaume already ran counter to the dominant point of view by defining a kinetics which could account for a continuum of significations obtained in discourse by earlier or later "interceptions" in the movement of "tension" inherent in each linguistic unit in tongue.
More recently, various trends in linguistics have been trying to go beyond the constraints imposed by discrete approaches : Culioli, Picoche, Lakoff, Langacker, Leech, Seiler and others. Despite their diversity, these approaches evidently share a certain number of common semantico-cognitive preoccupations (whether it is a question of cognitive grammars, invariants in the typology of languages, or of enunciative categories). However, these attempts have generally remained isolated and have seldom succeeded in building up problematics sufficiently operative to oppose the all-powerful discrete models.
At the same time, mathematical and computer science tools have been put forward which seem of interest for the modelling of continuous phenomena in linguistic semantics. Thus, in mathematics, Thom has shown that the framework of differential geometry and of dynamic systems could constitute an alternative to discrete formalisations for linguistics. Several researchers (Petitot, Bruter, Wildgen) have explored the possibilities offered by Thom's "catastrophe theory" for the treatment of linguistic problems. More recently, the revival of interest in connectionist techniques in artificial intelligence has been accompanied by some attempts to apply these techniques to the description of language : the modelling of language learning processes (McClelland and Rumelhart, Touretzky, etc.), the treatment of semantic ambiguities (Cottrell, Waltz and Pollack, etc.) and lately even syntactic representations (Smolensky).

The time has now come to take stock of these advances. This book provides a confrontation between linguists, philosophers, mathematicians and computer scientists, dealing with two major questions : which language phenomena call for continuous models, and what can the tools of formalisation contribute in this respect ? In order to focus the reflection even further, the authors deliberately restricted themselves to the problems of linguistic semantics, linked to the lexicon, to grammatical categories (aspect, modality, determination, etc.) or to syntactic structures. The book is thus divided into two main parts :

Part one is devoted to linguistic issues : one of our main concerns has been to give priority to linguistic problematics, because the utilisation of mathematical or computer science formalisms too often leads to an emphasis on the tool at the expense of the phenomena to be accounted for. The following questions are at stake :
- In linguistics, which semantic phenomena appear difficult or impossible to describe in discrete terms ?
- How can recourse to the notion of continuity allow resolution of the difficulties encountered ?
- How can a description of these phenomena in terms of a continuum be articulated with the discrete character of linguistic units and their composition ?
- Should continuity be conceived as a convenient representation of very gradual, but nevertheless basically discrete, phenomena, or must one postulate that continuity is an intrinsic characteristic of semantic phenomena ?

Part two is devoted to modelling issues, considered from a threefold point of view : namely a philosophical, a mathematical and a computer science viewpoint. The various contributions try to answer the following questions :
- From an epistemological point of view, must the introduction of the notion of continuity be seen as a radical break with the tradition of formalisation in linguistics ? In particular, how can the introduction of this notion be reconciled with a methodology based on the falsifiability of theories ? Is there necessarily a
link between this type of modelling and the cognitive approaches which are also based on the notion of continuity ? What new interactions might such an approach open up, particularly with a general cognitive theory like the Gestalt theory ?
- On the mathematical plane, what is the relation between the notions of "continuity" versus "discreteness" in linguistics and the various mathematical properties to which they can be compared (oppositions between continuous and discontinuous for a function, between continuous and countable in set theory, between continuous and discrete for a manifold) ? Can linguistic "continuity" really be accounted for by a mathematical model ? Can one expect really operative predictions (quantitatively and qualitatively) from such a model ? What links are there between a continuous mathematical model and the quantitative mathematics already widely employed in linguistics, namely statistics ?
- Finally, on the plane of computer science, how can continuity be implemented on a digital, and therefore discrete, machine ? Must a continuous mathematical model necessarily correspond to a continuous implementation in computer science ? Does connectionism provide a novel and completely satisfactory solution to this problem ?

Catherine FUCHS
Bernard VICTORRI
PART I : LINGUISTIC ISSUES
THE LIMITS OF CONTINUITY : DISCRETENESS IN COGNITIVE SEMANTICS
RONALD W. LANGACKER
University of California, San Diego, USA
I take it as being evident that many aspects of language structure are matters of degree. This is a common theme in both functional and cognitive linguistics, including my own work in cognitive grammar (Langacker 1987a, 1990, 1991). It would however be simplistic to assume that a commitment to cognitive (as opposed to formal) semantics necessarily correlates with the view that semantic structure is predominantly continuous (rather than discrete). I suspect, in fact, that the role of true continuity in linguistic semantics is rather limited. My goals here are to clarify some of the issues involved, to briefly discuss a certain amount of data, and to propose a basic generalization concerning the distribution of discrete vs. continuous phenomena.

In approaching these matters, I will ignore the fact that linguistic structure reduces, ultimately, to the activity of discrete neurons that fire in discrete pulses. Our concern is rather with phenomena that emerge from such activity at higher levels of organization, phenomena that could in principle be either discrete or continuous. I will also leave aside two aspects of discreteness that are too obvious and general to merit extended discussion : the fact that we code our experience primarily by means of discrete lexical items, each of which evokes individually only a limited portion of the overall notion we wish to express ; and the discrete nature of the choice (i.e. at a given position we have to choose either one lexical item or another, not some blend or in-between option).

In using terms like discrete and continuous, what exactly do I mean ? A continuous parameter has the property that, between any two values (however close), an intermediate value can always be found. There are no "gaps" along the parameter, nor any specific values linked in relationships of immediate succession.
By contrast, discreteness implies a direct "jump" between two distinct values, one of which is nonetheless the immediate successor of the other. To take an obvious example, the real numbers form a continuous series, whereas the integers are discrete (there is no integer between 4 and 5). Many continuous parameters are of course discernible in conceptualization and linguistic semantics : length, pitch,
brightness, the angle at which two lines intersect, etc. Yet the role of true continuity appears to be circumscribed in various ways.

We must first distinguish actual continuity from other phenomena that tend to be confused with it. One thing that does not qualify as continuity is hesitancy or indeterminacy in the choice between two discrete options (which is not to deny that one's inclination to choose a particular option may be a matter of degree). For instance, although I may not know whether to call a certain object a cup or a mug, I nevertheless employ distinct and discretely different prototypical conceptions in making the judgment (cf. Wierzbicka 1984). Continuity is also not the same as vagueness or "fuzziness". It would be arbitrary, for example, to draw a specific line as the definitive boundary of a shoulder. This body part is only fuzzily bounded : there is no definite point at which it is necessarily thought of as ending. Yet we do conceive of it as a bounded region (this makes shoulder a count noun), and a boundary implies discontinuity (a "jump" between shoulder and non-shoulder). We impose the boundary despite being unsure or flexible in regard to its placement.

I believe, moreover, that certain linguistic phenomena often thought of as forming a continuum are better analyzed in terms of multiple discrete factors that intersect to yield a finely articulated range of possibilities. For instance, basic grammatical categories are sometimes seen as varying continuously ("squishily") between the two extremities anchored by nouns and verbs (Ross 1972). I have argued, however, that grammatical classes are definable on the basis of discrete semantic properties (Langacker 1987a, part II ; 1987b). A noun designates a thing (defined abstractly as a "region"), while an adjective, preposition, participle, infinitive, or verb designates a relation.
Verbs are temporal (in the sense of designating relationships that evolve through time and are scanned sequentially), whereas the other categories are atemporal (being viewed holistically). Some relations are simple (i.e. they comprise just a single configuration), but others (verbs, infinitives, as well as certain participles and prepositions) are complex (comprising multiple configurations). When additional semantic traits are taken into account, together with polysemy and the prototype organization of individual categories, behavior can be anticipated that is "squishy" for all intents and purposes. Still, merely indicating the position of elements along a continuous scale can at best only summarize their behavior. Specific semantic characterizations offer the prospect of explaining it.

Let us turn now to conceptual parameters that can indeed be regarded as continuous. Even here there are serious qualifications. The most obvious point is that a continuous scale is usually not coded linguistically in a continuous manner. For temperature we have terms like hot, cold, warm, cool, scalding, and freezing, not to mention ways of expressing specific values (e.g. 13° C). We devise musical scales to structure the domain of pitch, and for time we have many discrete units of segmentation and measurement. The most celebrated example, of course, is the idiosyncratic "tiling" of color space imposed by the basic color terms of each language.

Color is also celebrated as a domain which harbors a certain kind of discreteness despite the apparent continuity of its basic dimensions. I refer to the phenomenon of "focal colors", which provide the prototypical values of basic color terms and have special cognitive salience even when such a term is lacking (Berlin and Kay 1969 ; Kay and McDaniel 1978). Focal colors mitigate the continuity of color space by making it "lumpy" rather than strictly homogeneous. As natural cognitive reference points, the lumps are easily adopted as the basis for categorizing judgments, so that color categories tend to coalesce around them.

Focal colors are just one manifestation of a reference-point ability that I have claimed to be both ubiquitous and fundamental in cognition and linguistic semantics (Langacker 1993a). It is, I think, self-evident that we are able to evoke the conception of one entity for purposes of establishing "mental contact" with another. This reference-point ability is manifested in the physical/perceptual domain whenever we search for one object in order to find another, as reflected in sentences like the following :

(1)
a. The drugstore is next to the post office
b. There's a deer on that hill just above the large boulder
More abstractly, a reference-point relationship is central to the meaning of "possessives" : the man's wallet ; my cousin ; the girl's shoulder ; our train ; her attitude ; your situation ; Kennedy's assassination ; etc. I suggest, in fact, that it constitutes the one constant aspect of their meaning. This schematic semantic value accounts for both the extraordinary variety of the relationships coded by possessive elements and also their asymmetry. If the "possessor" is properly analyzed as a natural cognitive reference point vis-à-vis the "possessed", it stands to reason that these roles would not in general be freely reversible (*the wallet's man ; *the shoulder's girl ; *the assassination's Kennedy).

With respect to basically continuous parameters, the reference-point ability has what might be termed a "quantizing" effect. We are not in general able to directly ascertain a precise value falling at an arbitrary location along a continuous scale, nor do languages provide separate terms for each possible value. Instead, the usual strategy is either to assimilate the value to a salient reference point (ignoring its deviation therefrom), or else to estimate and discretely characterize its position in relation to one or more such reference points. The result is a kind of quantization, wherein linguistically coded values either "jump" directly from one reference point to another, or alternatively, are calculated from reference points in some discrete fashion.
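
The two strategies just described can be made concrete with a minimal sketch. It is an illustration only, not a model proposed in the text : the function names `assimilate` and `locate`, the reference points 0, 10 and 20, and the choice of five steps per interval are all assumptions introduced here.

```python
from bisect import bisect_right


def assimilate(value, reference_points):
    """Strategy 1: replace the value by the nearest reference point,
    ignoring its deviation from that point."""
    return min(reference_points, key=lambda r: abs(value - r))


def locate(value, reference_points, steps):
    """Strategy 2: describe the value as a whole number of equal steps
    beyond the reference point just below it.

    Returns (lower reference point, steps taken, steps per interval).
    """
    points = sorted(reference_points)
    i = bisect_right(points, value)
    if i == 0 or i == len(points):
        raise ValueError("value lies outside the reference points")
    lower, upper = points[i - 1], points[i]
    quantum = (upper - lower) / steps          # size of one discrete step
    taken = round((value - lower) / quantum)   # nearest whole number of steps
    return lower, taken, steps


# Hypothetical scale with salient reference points at 0, 10 and 20.
print(assimilate(12.4, [0, 10, 20]))       # 10
print(locate(12.4, [0, 10, 20], steps=5))  # (10, 1, 5)
```

In both cases the output is discrete even though the input varies continuously : either a reference point itself, or a reference point plus a count of steps.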
Quantization is most apparent when a continuous parameter is structured by means of a discrete grid or numerical scale. The musical scale is a case in point. Not only do the basic terms (C, D, E, etc.) jump from one precise value to the next, but they also provide the basis, both conceptual and linguistic, for determining the only permissible intermediate values. Conceptually, F-sharp or B-flat lies midway between two primary values (or else one quantum above or below such a value, given some conception of the magnitude of allowable increments/decrements). Linguistically, the expression that codes an intermediate value comprises the basic term and either of two discrete qualifiers (sharp/flat). A temperature scale is comparable except that there is more flexibility in specifying intermediate values. In describing the temperature as being 13.2° C, we take 13° C as a reference point and indicate that the actual value lies beyond it at a distance representing a certain fraction of the interval between 13° and 14°. And while it is true in principle that any real number can be used to specify a temperature, in everyday practice we confine ourselves to the integers or at most to fractional intermediate values that we can estimate in terms of quanta. Intuitively, for instance, I understand 13.2° as a step beyond 13° such that five steps of the same magnitude would take me to 14°.

Fractions themselves neatly illustrate the type of phenomenon I have in mind. Expressions like 2 1/2 and 5 3/4 clearly take the integers as both linguistic and conceptual reference points. Moreover, they use other integers as the basis for computing a specific intermediate position : the denominator indicates how many steps there are between two successive integers, while the numerator specifies how many steps should be taken. Or consider the angle at which two lines intersect.
Although there is obviously a continuous range of possible values, it seems evident that certain discretely computed values have special cognitive status. Particularly salient is an angle of 90°, as reflected by terms like right angle and perpendicular. The reason, I suggest, is that perpendicularity represents the privileged situation in which the two angles formed by one line joining another have precisely the same magnitude : if one angle is mentally superimposed on the other (which is thus invoked as a kind of reference point), they are found to coincide. We are also more likely to characterize an angle as being 45°, 30°, or 60° than, say, as 7°, 52°, or 119°. These values are privileged psychologically because they bear easily computed relationships to a right angle. An angle of 45° is one whose complement within a right angle is identical to it in magnitude. As for 30° and 60°, we can easily imagine sweeping through a right angle in three discrete and equal steps, defining three component angles whose superimposition likewise results in judgments of identity.

Quantization of this sort is by no means confined to quasi-mathematical domains. For instance, compound color expressions like brick red, celery green, and
sky blue can be thought of as evoking dual cognitive reference points. By itself, a term like red, green, or blue evokes a focal color, which in turn evokes the more inclusive region in color space that it anchors. A noun such as brick, celery, or sky names an entity that not only has a characteristic color but is sufficiently familiar to serve as a reference point. From these two reference points, we compute the proper notion : red tells us that brick is to be construed with respect to its color, and brick directs our attention to a particular location within the red region.

Also varying continuously are parameters such as size, length, and distance. We can of course measure these numerically, as for temperature. The more usual strategy, however, is to assess magnitudes only with respect to broad categories standing in binary opposition : big vs. small ; long vs. short ; near vs. far. The categorization is therefore basically discrete despite the vagueness of the boundaries. (Continuous analogical coding, as in The train was looooong, is clearly a rather marginal phenomenon.) Furthermore, placing an object in such a category involves a single step, in either a positive or a negative direction, from a privileged value that serves as reference point for this purpose. For a given object type, that reference point comprises the range of values that everyday experience has led us to regard as being "normal" for that type. Thus a big flea is smaller in absolute terms than a small moth, and a short train is longer than a long centipede.

The phenomenon is quite general. It seems to me that in every domain we operate primarily in terms of salient reference points, from which we arrive at other notions in ways that are largely discrete. When the effect of reference points and quantization is worked out systematically and fully appreciated, the role of true continuity in linguistic meaning will, I believe, appear rather limited.
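
The norm-relative categorization of size described above (a single step up or down from the range regarded as "normal" for a given object type) can be sketched as follows. This is a hedged illustration only : the table `NORMAL_RANGE_MM` and the function `size_category` are hypothetical, and the lengths in millimetres are invented for the example rather than taken from any source.

```python
# Hypothetical "normal" body-length ranges in millimetres for two object
# types. The figures are invented for illustration only.
NORMAL_RANGE_MM = {
    "flea": (1.5, 3.0),
    "moth": (10.0, 40.0),
}


def size_category(kind, length_mm):
    """Place an object one discrete step above or below the normal range
    for its type, or within that range."""
    low, high = NORMAL_RANGE_MM[kind]
    if length_mm > high:
        return "big"
    if length_mm < low:
        return "small"
    return "normal"


# A big flea is still smaller in absolute terms than a small moth.
flea_mm, moth_mm = 4.0, 8.0
print(size_category("flea", flea_mm))  # big
print(size_category("moth", moth_mm))  # small
print(flea_mm < moth_mm)               # True
```

The continuous measurement enters only as input ; what the category label records is the direction of a single step away from the type-specific reference range.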
This is not to say that it has no role whatever. There are in fact important aspects of linguistic semantics for which continuity should probably be considered the default assumption. Let me offer the following broad generalization as a working hypothesis that may have some heuristic value : with respect to the "internal structure" of a linguistically coded conception, discreteness predominates ; on the other hand, semantic effects due to "external" factors (i.e. relationships with other conceptual structures) are basically continuous.

Since conceptions are containers only by dint of metaphor (and are not really very container-like), the terms "internal" and "external" should neither be taken as implying a strict dichotomy nor pushed beyond the limits of their utility. Factors reasonably considered internal are an expression's conceptual "content" and many facets of its "construal". I have thus far focused on content, arguing that true continuity is circumscribed and circumvented in various ways. Construal is the phenomenon whereby essentially the same content is susceptible to alternate "viewings", which represent distinct linguistic meanings. One aspect of construal,
namely background, is by nature an external factor. Other, more internal aspects include specificity, scope, perspective, and prominence. Here, as with content, the role of true continuity is more limited than one might think. Specificity (or conversely, schematicity) refers to our manifest ability to conceive and portray a situation at any level of precision and detail, as exemplified in (2) : (2)
Something happened > A person saw an animal > A woman examined a snake > A tall young woman carefully scrutinized a small cobra
Observe, however, that we can generally only adjust the level of specificity in quantized fashion, either by adding a discrete element (woman > young woman) or else by shifting from one discrete level to another in a taxonomic hierarchy (examine > scrutinize). Furthermore, beyond their obvious discreteness such hierarchies are usually "lumpy", in that certain levels have greater cognitive salience than others. In particular, the "basic level" (e.g. snake in the hierarchy thing > animal > reptile > snake > cobra) is known to have special psychological status (Rosch 1978).

I define an expression's scope as the array of conceptual content it invokes and relies upon for its characterization. By nature, it tends to be flexible and variable : it is generally no easier to precisely delimit an expression's scope than it is to determine exactly how far a shoulder extends. Nevertheless, there is good linguistic evidence for believing not only that this construct has some kind of cognitive reality, but also that scopes are conceived as being bounded, however fuzzily (see Langacker 1993b). Grammatical constructions can refer to them specifically and equate them with other bounded regions. Consider the "nested locative" construction, as in (3) :

(3) The camera is upstairs in the bedroom in the closet on the top shelf

Intuitively, this construction involves a "zooming in" effect, wherein each successive locative in the sequence focuses on a smaller area contained within the previous one. More technically, we can say that the scope for interpreting each locative in the sequence is limited to the search domain of the preceding locative. (The search domain of a locative is defined as the area to which it confines the entity being located, i.e. the set of locations that will satisfy its specifications.)
A partonomy like arm > hand > finger > knuckle further illustrates both nesting and quantization with respect to scope : the conception of an arm overall provides the spatial scope for the characterization of hand ; the conception of a hand in turn constitutes the immediate spatial scope for finger ; and that of a finger, for knuckle.
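
The nesting described for both the locative sequence in (3) and the partonomy can be pictured as a chain in which each item is interpreted within the one before it. The sketch below is only an illustration of that chaining : the function name `immediate_scopes` and the list representation are assumptions made here, and nothing in it models how a search domain is actually computed.

```python
def immediate_scopes(chain):
    """Map each item after the first to the item preceding it, i.e. to the
    conception that serves as its immediate scope."""
    return dict(zip(chain[1:], chain[:-1]))


# The partonomy arm > hand > finger > knuckle.
partonomy = ["arm", "hand", "finger", "knuckle"]
print(immediate_scopes(partonomy))
# {'hand': 'arm', 'finger': 'hand', 'knuckle': 'finger'}

# The nested locatives of example (3), in order of occurrence.
locatives = ["upstairs", "in the bedroom", "in the closet", "on the top shelf"]
print(immediate_scopes(locatives)["in the closet"])
# in the bedroom
```

Each step in the chain is a discrete move from one bounded region to a smaller one inside it, which is the quantized character of scope that the examples are meant to bring out.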
The term perspective subsumes more specific aspects of construal such as vantage point, orientation, direction of mental scanning, and subjectivity/objectivity. Although some of these factors can in principle vary continuously, in practice they tend toward discreteness. Presumably, for instance, our conception of a cat includes numerous visual images, representing cats with different markings, in various postures, engaged in certain activities, etc. There is doubtless considerable variation and flexibility. Still, it seems apparent that these images tend to reflect certain canonical vantage points, and certain orientations of the cat within the visual field. These vantage points and orientations are of course those which predominate in our everyday visual experience. For example, images in which the cat is viewed from underneath, or is upside down within the visual field, are possible but hardly typical.

The notion of mental scanning can be illustrated by the contrast between pairs of expressions like the following :

(4)
a. The roof slopes steeply {upward/downward}
b. The road {widens/narrows} just outside of town
c. The hill gently {rises from/falls to} the bank of the river
In each case a difference in meaning is quite evident, even though the two expressions describe precisely the same objective situation. Intuitively, moreover, the semantic contrast involves directionality, even though the situations described are static : objectively, nothing moves, hence there is no apparent basis for directionality. I do ascribe motion to these sentences, but not on the part of the subject : rather, it is the conceptualizer who "moves" in these expressions, scanning mentally through the scene in one direction or the other (Langacker 1990, ch. 5). For our purposes, the pertinent observation is that the contrast in directionality (e.g. between the conceptualizer scanning upward along the roof or downward) is clearly discrete.

Subjectivity/objectivity is defined as the extent to which an entity is construed, asymmetrically, as the "subject" vs. the "object" of conception (Langacker 1985 ; 1990, ch. 12). In (4), for example, both the conceptualizer and his motion are construed subjectively, since the conceptualizer does not actually conceive of himself as scanning mentally through the scene, but merely does so implicitly as he focuses on the objective configuration thus assessed. Although subjectivity/objectivity is a matter of degree, here too there are grounds for believing that the scale is lumpy and partially quantized owing to the privileged status of certain canonical arrangements. Consider the role of the speaker, for instance. At one extreme, represented by the pronoun I, the speaker goes "onstage" to be the expression's referent ; as the explicit focus of attention, the speaker is construed quite objectively. Another standard arrangement finds the speaker
16
RONALD W. LANG ACKER
"offstage" but still within an expression's scope, hence intermediate in terms of subjectivity/objectivity. Examples include deictics (e.g. this ; here ; now) and sentences like those in (5), where the speaker functions as the default-case reference point. (5)
a. The mailbox is right across the street b. Please come as soon as you can
The last basic option is for the speaker to remain outside an expression's scope altogether, having no role in the conception conveyed apart from the conceptualizer role itself (which he has in every expression). In this event the speaker's construal is maximally subjective.
There are many sorts of prominence, and while some kind of quantization may in each case be discernible, I suspect that in general it may be less obvious and less important in this area. Nonetheless, the two kinds of prominence that are most essential for grammatical purposes show definite quantum effects. One type is profiling, which might be characterized as "reference within a conceptualization". Within its scope (i.e. the conceptual content it invokes), every expression profiles (designates) some substructure. Thus knuckle profiles a certain substructure within the conception of a finger, and finger within the conception of a hand. Many expressions (verbs, adjectives, prepositions, adverbs, etc.) profile relationships. For example, conquer designates a two-participant relationship that evolves through time, whereas the stative-adjectival conquered profiles the final resultant state of that process. (Conqueror profiles a thing, namely the agentive participant of conquer). Even though an expression's profile is not invariably susceptible to precise delimitation, the contrast between profile and non-profile is basically discrete and grammatically significant. In particular, the nature of its profile determines an expression's grammatical class.
For expressions that profile relationships we need to recognize a second type of prominence, pertaining to the relational participants, whose grammatical import is hardly less substantial. It is usual for one participant, which I call the trajector, to stand out as the primary figure within the profiled relation. Additionally, there is often a second "focal" participant, termed the landmark, with the status of secondary figure.
Observe that two expressions, e.g. before and after, may invoke the same conceptual content and even profile the same relationship within it (in this case one of temporal precedence), yet differ semantically because they impose opposite trajector/landmark alignments. While participant prominence may in general be a matter of degree, I believe that trajector and landmark status represent distinct quantum levels, and that they furnish the ultimate basis for the notions subject and object.
Now that we have examined the "internal structure" of conceptions, including both content and construal, it is time to recall the working hypothesis advanced earlier : internally, discreteness predominates (a more cautious phrasing is that continuity is circumscribed and mitigated in various ways) ; by contrast, semantic effects due to "external" factors (i.e. relationships with other conceptual structures) are basically continuous.
These external factors will now be briefly discussed. While they tend to be neglected, I do not regard them as incidental or even subsidiary, but as integral components of linguistic meaning. Moreover, they would seem to be essentially continuous (although the discovery of significant quantization would not at all surprise me).
An important aspect of linguistic semantics is our ability to construe one structure against the background provided by another. There are many kinds of background, including previous discourse, pertinent assumptions and expectations, and, in metaphor, the role of the source domain in conceiving and structuring the target domain (cf. Lakoff and Johnson 1980 ; Lakoff and Turner 1989 ; Turner 1987). While it is not hard to think of possible quantization in this realm, I wish to emphasize an important parameter that may well be continuous : the salience of the background structure, i.e. its level of activation in the construal of the target. For example, once a discourse referent is introduced its salience tends to diminish through the subsequent discourse unless and until it is mentioned again. There are of course discretely different ways of doing so (e.g. with a pronoun, or with a definite article plus noun), reflecting quantized estimates of the referent's current status (cf. Givón 1983 ; van Hoek 1992). But its salience per se (and hence the effect of its background presence when it remains implicit) presumably varies continuously.
The relation between the source and target domains of a metaphor poses a number of thorny questions. To what extent do we understand the target domain prior to (or independently from) its structuring by the source domain? To what extent does target-domain reasoning depend on metaphorical structuring? To what extent do the source and target domains merge to form a "hybrid" conception (Fong 1988)? Possibly we are dealing here with basically continuous parameters. Be that as it may, there are clearly many expressions that originate through metaphorical extension even though the target domain can easily be grasped independently. At least in such cases, we can speak of the gradual, presumably continuous "fading" of a metaphor, reflecting the declining likelihood and/or level of the source domain's activation on a given occasion of the expression's use. Intuitively, for example, the literal sense of fade ('decrease in color intensity') is still reasonably salient in expressions like fading metaphor, fade from memory, etc. By comparison, reflect reflects more weakly the source domain of light and mirrors.
A related phenomenon is analyzability, the extent to which the component elements of a complex expression are recognized within it and perceived as contributing to its meaning. Thus complainer is more analyzable than computer, which in turn is more analyzable than ruler (i.e. 'instrument for measuring and drawing ("ruling") straight lines'). We invariably interpret complainer as 'someone who complains', whereas we do not necessarily think of a computer specifically as 'something that computes', and a ruler is hardly ever thought of as 'something that rules [lines]'. I consider analyzability to be an important dimension of linguistic semantics. Indeed, I characterize an expression's meaning as comprising not just its composite semantic value, but also the entire compositional path which leads to it. In processing terms, analyzability is interpretable as the likelihood or degree to which component semantic values are activated along with the composite conception. Presently I have no linguistic evidence to suggest that this parameter is other than continuous.
Finally, I assume an "encyclopedic" view of linguistic semantics which denies the existence of any precise or rigid line of demarcation between knowledge that is "linguistic" and knowledge that is "extra-linguistic" (see Haiman 1980 and Langacker 1987, ch. 4). Our conception of a given type of entity (e.g. a cat, an apple, or a table) is almost always multifaceted, comprising a potentially open-ended set of specifications pertaining to any domain of knowledge in which it figures. Of course, these specifications vary greatly in their status. Some (like shape and primary function) are so "central" to an expression's meaning that they are virtually always activated when it is used. Other specifications (contingent knowledge, cultural associations) may be quite peripheral, being activated only in very special circumstances.
A priori, it is reasonable to suppose that their likelihood and strength of activation vary continuously, being determined by such basically continuous factors as entrenchment, cognitive salience, and contextual priming.
It is time now for a brief summary and conclusion. By way of summary, I will merely reiterate a basic working hypothesis : that discrete constructs are essential if not predominant for the characterization of linguistically coded conceptualizations, so far as their internal structure is concerned ; whereas those aspects of meaning which involve the relationship of such conceptions to one another would appear to vary continuously. By way of conclusion, let me emphasize that discreteness vs. continuity may itself be a matter of degree, the various shades and types of discreteness being distributed along a (quantized) continuum. Depending on what we examine and what we wish to emphasize, both discreteness and continuity can be discerned in virtually any aspect of language and cognition. Our task is not to choose between them, but rather to explicate the specific ways in which this fundamental opposition plays itself out across the full range of linguistically relevant phenomena.
REFERENCES
Berlin, Brent, and Paul Kay. 1969. Basic Color Terms : Their Universality and Evolution. Berkeley : University of California Press.
Fong, Heatherbell. 1988. The Stony Idiom of the Brain : A Study in the Syntax and Semantics of Metaphors. San Diego : University of California doctoral dissertation.
Givón, Talmy (ed.) 1983. Topic Continuity in Discourse : A Quantitative Cross-Language Study. Amsterdam : John Benjamins.
Haiman, John. 1980. Dictionaries and Encyclopedias. Lingua 50.329-357.
Kay, Paul, and Chad K. McDaniel. 1978. The Linguistic Significance of the Meanings of Basic Color Terms. Language 54.610-646.
Lakoff, George, and Mark Johnson. 1980. Metaphors We Live By. Chicago : University of Chicago Press.
Lakoff, George, and Mark Turner. 1989. More than Cool Reason : A Field Guide to Poetic Metaphor. Chicago : University of Chicago Press.
Langacker, Ronald W. 1985. Observations and Speculations on Subjectivity. In John Haiman (ed.), Iconicity in Syntax, 109-150. Amsterdam : John Benjamins.
Langacker, Ronald W. 1987a. Foundations of Cognitive Grammar, vol. 1, Theoretical Prerequisites. Stanford : Stanford University Press.
Langacker, Ronald W. 1987b. Nouns and Verbs. Language 63.53-94.
Langacker, Ronald W. 1990. Concept, Image, and Symbol : The Cognitive Basis of Grammar. Berlin : Mouton de Gruyter.
Langacker, Ronald W. 1991. Foundations of Cognitive Grammar, vol. 2, Descriptive Application. Stanford : Stanford University Press.
Langacker, Ronald W. 1993a. Reference-Point Constructions. Cognitive Linguistics 4 : 1.1-38.
Langacker, Ronald W. 1993b. Grammatical Traces of some "Invisible" Semantic Constructs. Language Sciences 15 : 4.323-335.
Rosch, Eleanor. 1978. Principles of Categorization. In Eleanor Rosch and Barbara B. Lloyd (eds.), Cognition and Categorization, 27-47. Hillsdale, N. J. : Erlbaum.
Ross, John R. 1972. The Category Squish : Endstation Hauptwort. Papers from the Regional Meeting of the Chicago Linguistic Society 8.312-328.
Turner, Mark. 1987. Death is the Mother of Beauty. Chicago : University of Chicago Press.
van Hoek, Karen. 1992. Paths Through Conceptual Structure : Constraints on Pronominal Anaphora. San Diego : University of California doctoral dissertation.
Wierzbicka, Anna. 1984. Cups and Mugs : Lexicography and Conceptual Analysis. Australian Journal of Linguistics 4.205-255.
CONTINUITY AND MODALITY
ANTOINE CULIOLI
University of Paris VII (URA 1028, CNRS), France
Introduction
In this paper I shall attempt to identify and describe two possible fields in which the implications of continuity are particularly manifest. Firstly I shall deal with the approach which is concerned with problems such as continuity and linguistics, and more specifically semantics. Secondly I shall consider that no basic discrimination between syntax, semantics and pragmatics is called for and I propose, here, to put forward an attempt to model the operations which allow us to establish a verifiable relation between representations on the one hand, and on the other, the traces of these operations which implement the transition from representations to textual phenomena.
1. The choice of a theoretical framework and an attempt at a definition of continuity
The question of continuity can obviously be approached in two different perspectives. It can either be considered from a methodological point of view, which involves the problem of continuity in mathematics, and concerns in a more general way the construction of a system of metalinguistic representations, or it can be considered from a point of view based on observation, i.e., a "theory of observables", in such a way that we may see which points are affected by this problem of continuity. For reasons of competence (or perhaps incompetence in the areas of mathematics and computer science) my position will be that of the linguist. I shall thus present a certain number of examples, which are well-known, and do not allow any discussion concerning facts, to bring into view the areas in which continuity occurs.
Before discussing my examples, however, I should first like to add one or two considerations concerning the notion of continuity such as it appears to the linguist. To simplify the question, I would say that continuity occurs, in the strict sense of the word, at a basic level, in the relation between several different types of continuity on the one hand, and on the other, the phenomena which one encounters in the linguistic domain, i.e. in the assimilation which can be made between empirical observations and the various types of continuity known in mathematics.
Let us now consider the areas in which continuity occurs. This phenomenon is particularly manifest in domains such as that of exclamatives, which I have had the opportunity of studying in some detail in various languages around the world (cf. Culioli 1974). Examples like Quel beau livre ! ("What a beautiful book !") show that there is constancy in a certain number of representations, which is of great interest, because here it is possible, in certain respects, to carry out a kind of assimilation based on the fact that the process is carried out as though one had an ordered series of occurrences of representations of, say, livre or beau livre in such a way that, in exactly the same way as for real numbers one would have an order by means of which one would eventually reach the ideal, inaccessible, detached representation, which would then provide the "really" something.
The problem of continuity also occurs in another domain, which is that of time and space, and more particularly in the domain of time. Once again it is possible to demonstrate, just as I have done above, that we have, basically, in a pre-mathematical fashion, discovered the notion of the limit. In the domain of time, when one works on representations of events by means of intervals provided with topological properties (even the elementary, rudimentary topology which one is brought to introduce as a linguist) one becomes aware of the problem of the limit. See Figure 1 :
Figure 1
If one is located in A, in relation to B, one is obliged to take two points ; in A, one can never say that one has reached the boundary unless, being at a point xj in B, one says "that is it ; it's over" and in this case one reconstructs a last, imaginary point xi. But one cannot have a final point in A.
This phenomenon occurs with striking regularity, whatever language one considers. I have never come across a counter-example to this type of thing, and the empirical facts related to modeling are extremely clear. This leads to a whole series of consequences. Furthermore, if one looks at the other side of the figure,
the same problem arises concerning B in relation to A, etc. Thus, in a certain number of domains, which I shall not study here, one encounters this problem which is the problem of continuity in what might be called the "serious" sense of the term or at least the "real" sense compared to constructions which have been put forward for instance in mathematics.
Another point in connection with which one encounters the problem of continuity is that which naturally occurs when one is brought to introduce the notion of cuts (could there be such a thing as continuity without cuts ?). Here, space permitting, it would be interesting to look more closely at the problem and demonstrate that one cannot provide adequate models in the domain of linguistic observation if one fails to associate the notion of cuts with that of continuity. By a cut I mean something akin, in a way which mathematicians may find horribly metaphorical, to Dedekind's cuts, such as expounded in the foundation texts.
Another area in which continuity becomes an active factor can be observed when one introduces the property of deformability. This occurs in connection with abstract schematic forms : when they are made to undergo certain pressures, either by bringing something to bear upon them, or by immersing them in a space element provided with certain properties, these abstract forms produce, through what must be called deformations ("warps" or "shear" effects), local forms, which will become associated with the basic form, which is an abstract schematic form.
This property of deformability leads us to another acceptation of the concept of continuity which is the extent to which there is a relation between continuity and contiguity, a question which has become highly controversial, as we all know.
This question of contiguity arises each time a deformation is brought about, since it can be supposed that between the resulting local forms there is a relation in which there is indeed discontinuity on the one hand but also an element which will have a connectional property, in the form of "jumps" from one form to another. This notion of contiguity is fundamental to a whole series of considerations, particularly in connection with relations of causality. The whole notion of causality revolves around the relation between contiguity and continuity.
On the basis of these preliminary remarks I shall now proceed to analyse my examples. These are of widely different types, because the notion of continuity is difficult to discuss from a linguistic point of view if one's examples belong to a single type, owing to the wide variety of representations of continuity one obtains, depending on the type of example chosen.
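The remark made around Figure 1, that A has no final point while the boundary is only reached from B, can be restated with real intervals. What follows is a minimal sketch and not the author's own formalism : the half-open interval and the symbols a, b, c, L, U are assumptions introduced here for illustration. The last display recalls the textbook definition of a Dedekind cut, to which the text appeals only by analogy.

```latex
% Assumption: A and B are adjacent intervals of the real line,
% and the boundary point b belongs to B, not to A.
\[
  A = [a, b), \qquad B = [b, c], \qquad a < b < c .
\]
% A has no last point: any x in A is followed by another point of A.
\[
  \forall x \in A : \quad x < \frac{x + b}{2} < b ,
  \qquad \text{hence } \frac{x + b}{2} \in A .
\]
% The boundary is the least upper bound of A and the first point of B.
\[
  \sup A = b = \min B , \qquad b \notin A .
\]
% Textbook definition of a Dedekind cut (L, U) of the rationals.
\[
  L \cup U = \mathbb{Q}, \quad L \cap U = \emptyset, \quad
  L \neq \emptyset \neq U, \quad
  \forall p \in L \;\forall q \in U : p < q, \quad
  L \text{ has no greatest element.}
\]
```

On this reading, the "last, imaginary point" reconstructed from B corresponds to the supremum b, which is approached within A but never attained there.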
2. The example of pouvoir
My first example is :
(1) Il peut atteindre le sommet (He can / may reach the top)
I shall not enter into problems raised by translation into English or any other language. This example does not present any particular problem, and one may wonder why it is open to a certain number of interpretations. I shall not discuss them all, but there is one which is absolutely fundamental : that of being "in a position" to reach the top. This does not imply that he actually reaches the top, simply that he is "in a position" to. The problem is now to find a way to represent pouvoir in such a case, allowing for this interpretation.
If we use the imperfect tense (in French this alternative past tense exists) :
(2) Il pouvait atteindre le sommet
(2) is open to two interpretations. Firstly in Il a dit qu'il pouvait atteindre le sommet, we have, in indirect speech, the simple 'translation' of (1) Il peut atteindre le sommet. Secondly, il pouvait atteindre le sommet means "Il aurait pu..." ("He could have..."). This "unreal" interpretation means that "he did not reach the top". What, from a semantic point of view, makes these two interpretations possible ? In relation to the different states of affairs, we shall attempt to discover that property of the forms as marks of operations which provides openings for such interpretations.
If we now turn to the preterit tense (simple past ; I shall not enter into the question of the passé composé at this stage) :
(3) Il put atteindre le sommet
(3) means "He was able to..." and not "He could..." i.e., "il a effectivement atteint le sommet" ("He actually reached the top"). Several facts come to light when one attempts to provide a representation of this phenomenon (see Figure 2) :
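The three readings just listed for (1) to (3) can be summarised in a small lookup table. This is only an illustrative sketch : the names Reading, READINGS and outcomes are hypothetical and are not part of the author's metalinguistic system, and the three-valued field (None for "left open") is an assumption made here to encode whether reaching the top is asserted, denied, or neither.

```python
from typing import NamedTuple, Optional


class Reading(NamedTuple):
    """One interpretation of a form of 'pouvoir atteindre le sommet'."""
    gloss: str
    # True: the top was reached; False: it was not;
    # None: the reading leaves the outcome open (assumed encoding).
    top_reached: Optional[bool]


# Readings as stated in the text for examples (1), (2) and (3).
READINGS = {
    "il peut": [
        Reading("he is in a position to reach the top", None),
    ],
    "il pouvait": [
        Reading("indirect-speech counterpart of 'il peut'", None),
        Reading("'il aurait pu': he could have, but did not", False),
    ],
    "il put": [
        Reading("he was able to, and actually reached the top", True),
    ],
}


def outcomes(form: str) -> set:
    """Return the set of outcome values compatible with a given form."""
    return {reading.top_reached for reading in READINGS[form]}


if __name__ == "__main__":
    for form in READINGS:
        print(form, outcomes(form))
```

The table only records the observations; it does not explain them. The explanation is what the branching representation developed in the rest of the section is meant to supply.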
We shall begin by constructing a point in T0 which will give rise to two branches, in such a way that, in relation to this point, we have, in fact, an anticipatory construction of another point in Tx, which will represent : (""), and which is the representation of a validatable state of affairs, such that it will perhaps be possible to say in Tx "such is the case". So here we have a "gap" between T0 and Tx, in such a way that we know that there is a path to be covered, which may, if necessary, be covered by a subject S if the latter wishes at any time to be at Tx, which is subsequent to T0, depending on variable circumstances. But we always have at this point of junction (which branches out as the representation is elaborated) the possibility of taking a second path, whereby something else may take place, and in this case the "something else" is empty. This means that here we will have, taking < r > for relation, "< r > is the case", which is "envisaged", and here we do not envisage anything else but "< r > is the case".
This representation of pouvoir in the case of (1) Il peut atteindre le sommet shows how the item undergoes a filtering process, due both to the aspectual properties of atteindre le sommet and to modal or in a general way semantic properties. The aspectual property of atteindre is well-known : we have a transition, such that we may initially be at a point where we have not yet started, where we are "seeking" to reach, then, should the case arise, we succeed, i.e. we "have reached" the top. The modal property is what I personally call téléonomique or "goal-directed", which means that one has fixed for oneself a purpose which one has evaluated as "good" and one therefore considers the space T0 → Tx as "to be completed".
If on the other hand our position is strictly epistemic, in the sense that we would not have a subject engaged in the covering of a gap evaluated as "good", in this case we would have at Tx something which would not be constructed as empty. So in the case of (1) Il peut atteindre le sommet, we observe that with pouvoir we always have the representation of one term plus another term. Here the other term is constructed as Ø. This incidentally recalls, in a different style, something I read recently in a book as old as The English Verb by M. Joos (1964), in which there is a whole series of considerations on modals which are extremely interesting from an intuitive point of view.
When we have (1) Il peut atteindre, we see that the path T0 → Tx represents a path to be completed, whereas the other one (T0 → Ø) represents a simple relation. This is why Il peut atteindre... means "Il n'a pas encore atteint" ("He has not yet reached...").
Let us now come back to the imperfect, which, as we shall see, functions as a "translated" value of the point of origin. We shall now go on to state that in the construction of any system of reference we deal with the construction of a certain number of points of origin, one of which is an absolute point of origin,
in relation to which we shall have (apart from the origin of speech, which H. Reichenbach (1947) calls the "point of speech") two origins : one of them is a translated origin which will retain the properties of the absolute point of origin, and the other, a "disconnected" origin, will have specific properties. This allows us to account for empirical facts. These findings are founded on observations not on a single language but on a very wide variety of languages which are not related in any way, either geographically or genetically. When we deal with a translation (let us call the absolute origin T0 and the translated origin T'0) the translation may take place in the past or, under certain conditions, in the future. But in almost every case, it takes place in the past, and we maintain the properties. To illustrate this process let us take the following examples :
(4) s'il pleut (if it rains)
(5) s'il pleuvait (if it rained)
(6) s'il vient à pleuvoir (if it happens to rain)
(7) s'il venait à pleuvoir (should it happen to rain)
(4), (5), (6) and (7) are quite acceptable. But let us now consider (8) to (11) :
(8) *s'il doit pleuvoir (if it must rain)
(9) s'il devait pleuvoir (should it rain)
(10) *s'il va pleuvoir (if it's going to rain)
(11) s'il allait pleuvoir (if it were to rain)
(8) is unacceptable, because il doit pleuvoir would be deontic ; (9) is acceptable ; (10) is unacceptable (I mean with the strictly hypothetical value : otherwise cases such as s'il va faire ça ("if he's going to do that") are perfectly acceptable, especially when si ("if") has the value of puisque ("since")) ; and (11) is acceptable. These examples show that when we bring about a translation of this type, we obtain, almost mechanically, a filtering effect implemented by the aspectual and modal properties of aller, devoir and venir à.
All these findings should be taken into account but in the present case I shall simply take the fact that when we have (2) Il pouvait atteindre le sommet, it is either a strict translation of il peut (and in this case we have maintained exactly the same property), or a separation (see Figure 3) :
Figure 3
This raises the problem of the relation between contiguity and continuity in terms of space (I use the term "space" in a metaphorical sense). We now have a value of separability. Let us return to T'0 in relation to T0. I have used T for "Time-space", and in fact there is also a parameter S for "Subject" (which I shall not introduce here, to facilitate things). When we construct this we have two possibilities :
- either there is continuity from one to the other : one takes as a starting point a former, previous state of things, and one proceeds up to the present (this in fact corresponds to the definition of the imperfect which can be found in Greek grammars) ;
- or there is a separation : in this instance, there will be reference to the construction of an "analogous reality", such that what prevailed at a certain time no longer prevails ; this is typically the case in the classic French example Il y avait, à tel endroit, quelque chose (il y avait means that now, this is no longer the case). For amusement's sake, I could take similar examples from many other languages.
Now coming back to (2) il pouvait, let us consider Figure 4 :
Figure 4
We may observe that we have constructed a disconnected point (which I shall represent as T0 1 ), which stands for a fictitious, disconnected reference point. Then, in relation to this fictitious point, which in fact represents our absolute origin translated and disconnected, we may recommence the trajectory of il pouvait. In the same way, when we say (1) il peut in relation to the present, we are in fact saying that, should the case arise, we will say "il a pu" at Tx. If we say (2) il pouvait this means that we may subsequently be able to say "il a pu". Interestingly, il pouvait means "il aurait pu" (an unreal value) and therefore means "il n'a pas pu". At this point, one must explain why one has this reversal of the situation. I shall not be able to do this here, space not permitting, but I feel these reversals are a part of the problem of continuity. Although I have a suggestion for a solution, my aim here is to raise the problem, not to put forward a demonstration.
Let us now take (3) il put. In French, il put has one property which is extremely clear. Here again I shall not go into all the preambles which have allowed me to reach this conclusion. We construct this tense as a closed bounded interval in such a way that the complements are empty. We have, in fact, a pure transition (see Figure 5) :
Figure 5
Exactly the same conclusions, although differently formulated, have been reached by H. Seiler (1952), in his study of the aorist in a remarkable book on aspect in modern Greek, which I consulted recently in connection with this type of phenomena. This is what has also been called the taking into account of the process as a whole. One can observe phenomena of this kind in studies on aspect. There are also other properties which I shall not expound.
What, now, are the implications of this in terms of our representation? It means that we have constructed a disconnected reference point, which raises the question as to why, when one constructs, within a reference space, a transition related to a disconnected space, it is inevitably constructed as a whole. If we take a transition, we can, point by point, obtain this type of thing i.e., we may say that point is dominant over this point. If we observe a distortion, this means that we have a cut. If we have no distortion, this means that we have a stable state. If we are astride, so to speak, we have introduced a cut. If we take position on the left, it means that there is a prospect of moving to the right. If we take position on the
right, this means that we reconstruct a last point, etc. And here in this particular case, in fact, we take into account both the left and the right boundaries at the same time. And it can be demonstrated that one of the properties of this disconnected reference point is that it allows the existence of this property. We then observe that the branch T0 → Tx which, in Figure 2 above represented a gap, a vacuum, is going to be occupied by a closed bounded interval (see Figure 6) :
Figure 6
Somewhere we will have a transition from "here nothing happened" (the gap remains) and here a change occurs in the course of the transition, so that we obtain "you have reached the top", and (3) Il put atteindre le sommet means in French "Il atteignit effectivement le sommet" ("He actually reached the top"). We could now go on to study il a pu in the passé composé, and show why it has such and such a value. In fact we may, in this manner, construct all of the semantic interpretations. I feel that I use the word "interpretation" too freely and it could be thought that this is solely a question of interpretative semantics. This is not the case at all, and I use the term because it is convenient and applicable both for production and for recognition in cases such as this, and because the linguist attempts to analyse problems from the outside.
3. Manifestations of continuity in other types of examples
In my last part I wish to draw the reader's attention to other types of problems. I shall briefly discuss three other points. The first concerns the fact that the English (12) it may well (as in it may well be due to...) cannot, by any means, be translated by il peut bien (cela peut bien être dû... is impossible, and does not have the meaning of it may well be due to...). If we wish to translate it into French we must say cela
peut très bien, cela peut fort bien, cela peut parfaitement être dû or cela pourrait bien être dû. If we now analyse by means of a metalinguistic system of representations the English may as compared to can, if we analyse well, if we analyse the conditional, we can demonstrate in algorithmic fashion why the English may well cannot be translated by peut bien but must be translated by peut fort bien or pourrait bien. This is where things start to become interesting, since we have here empirical data which supports and confirms the analysis which is otherwise purely formal. The second point concerns specifically semantic problems. I should have liked to discuss why the German mögen, and the Dutch equivalent mogen as well, in a majority of cases, no longer mean what they originally did (i.e. "to have the power or the strength to"), and have eventually come to mean "to want" or "to wish" : (13) ich möchte means "I would like" and (14) ich mag in certain cases, "I like" (either "apples" or "someone" if I understand correctly). The same applies to Dutch except that it is used essentially for human animates, another verb being used for "apples" or "chocolate". The third point concerns semantic drifting and contiguity. I have made an extensive study of history in the Germanic field ; I do not intend to go into this, but I wish to draw attention to a very interesting case of semantic drifting. We have the choice, here, between two positions : either we say that languages have an element of "divine madness", as Whitehead said about mathematics, and that anything can happen in the linguistic field, or we consider that it is possible to provide an explanation, which is my position. If therefore an explanation can be provided, it stems from what I have referred to as contiguity, i.e. 
the construction of a semantic space such that when we pass from one value to the next we will have a series of successive transitions, which implies that there is contiguity from one value to another, which is a form of continuity. Another very interesting case is that of the use of the verb lassen in German or låta in Swedish which mean both "to let" and the causative "to have someone do something". Here again if we study in detail how we can pass from "I let him do something" to "I have him do something", we come across a very interesting problem since we have the impression that a total or almost total reversal of the meaning has been brought about. Here, once again, we could demonstrate how the transition from one stage to the next is brought about, I mean, from an abstract point of view, as we do not always have all the historical elements at our disposal. This presupposes once again the construction of a space within which these distortions or warps can be brought about. All these examples (maagan, mögen, lassen) revolve around inter-subject relations.

Conclusion

Whether one considers the problem of modality from the point of view of the notion of the limit, from the point of view of semantic and aspectual properties, or from the point of view of inter-subject relations, one becomes aware that the linguist cannot provide adequate processing for semantic problems without introducing the concept of continuity in a serious sense of the term.
REFERENCES
Culioli, A. 1974. A propos des énoncés exclamatifs. Langue Française, Paris : Larousse, pp. 6-15.
Culioli, A. 1990. Pour une linguistique de l'énonciation, t. 1 : Opérations et représentations, Paris : Ophrys.
Joos, M. 1964. The English verb : form and meaning, University of Wisconsin Press.
Reichenbach, H. 1947. Elements of symbolic logic, New York : Macmillan.
Seiler, H.J. 1952. L'aspect et le temps dans le verbe néo-grec, Paris : Belles Lettres.
CONTINUUM IN COGNITION AND CONTINUUM IN LANGUAGE
HANSJAKOB SEILER
University of Köln, Germany
1. Introduction

Continuum is one of the central notions in the work of the UNITYP research group at the University of Cologne¹. In the following presentation we shall first set out a number of theses about the continuum reflecting our current views about this notion. Our demonstration will then proceed by way of commenting on a recent major publication where this notion has been extensively put to use. The publication is entitled Partizipation : Das sprachliche Erfassen von Sachverhalten [Participation : The representation of states of affairs by the means of language] (Seiler/Premper 1991). While some of the contributions to this round table deal with continuity in lexical semantics, it is the purport of my paper to show the usefulness of this notion in the domain of semantax, i.e. the semantics of syntactic relations, specifically the relation between the verb and its complements and adjuncts.

2. Theses about the continuum
1) The continuum is a construct serving the purpose of putting some order into a variety of facts. As such it may be compared with such other constructs as the paradigm and the Porphyrian stemma as it appears, e.g., in the phrase structure trees of the generativists².

2) The continuum and the discrete stand to each other not in a contradictory, but in a contrary or complementary relation : the notion of continuum presupposes discreteness ; it depicts an increase vs. decrease of properties between discrete steps in a linear ordering. The notion of discreteness in turn presupposes that of continuity.

¹ The label UNITYP is an abbreviation of the descriptive title of our project : "Language universals research and typology with special reference to functional aspects". The project is funded by the Deutsche Forschungsgemeinschaft which is herewith gratefully acknowledged.
² For a detailed discussion of this comparison see Seiler 1985 : 21 ff.
3) The UNITYP framework insists on the distinction between a scale and a continuum (Seiler, l.c. : 16 f.). Scale means 'measuring staff' ; it is a static, unidirectional means for measuring regular intervals of a 'minus' or a 'plus' of certain properties. The continuum, on the other hand, while corresponding to the 'measuring staff', has properties that match the phenomena themselves : directionality, dynamics, binarity, complementarity, parallelism, and reversibility.

4) The continua which we can detect within one particular language or in cross-linguistic comparison are prefigured by corresponding continua on the cognitive-conceptual level. If we imagine a cognitive-conceptual content, such as, e.g., POSSESSION (Seiler 1983), we will find that it can either be progressively specified and elaborated upon, or, on the contrary, simply posited as such, without further specification. A cognitive-conceptual continuum with intermediate steps extends between these two extreme options.

5) The continua that we posit on the cognitive-conceptual level are idealizations. They cannot be arrived at by empirical observations alone. Instead, they derive from a constant shifting back and forth between observations and rational reasoning. They are regular. Linguistic continua, on the other hand, can be irregular, showing overlaps, or gaps, as e.g. when we say that language X has no case marking. But it is precisely such statements that are only possible against the background of a framework where overlaps and gaps do not occur.

6) There must be a common functional denominator to a continuum. The items on the scale of a linguistic continuum are semantically distinct. E.g. transitivity differs semantically from case marking. Yet they have a denominator in common which we situate on the cognitive-conceptual level.

7) Adjacency is a further defining trait of a continuum : adjacent positions are more similar to one another, share more properties, than non-adjacent ones.

3. The cognitive-conceptual dimension of Participation
The notion of Participation pertains to the cognitive-conceptual level. It is a relation between a participatum and its participants. The participatum, "that which is participated in", can be, e.g., a situation or a process. The participants can be, e.g., actants or circumstants, or more complex entities. The mental representation of this relation is brought about by a number of different options which we call techniques. The techniques can be arranged in a continuous order progressing from minimal to maximal elaboration on the relation of participation, or, in the reverse sense, from minimal to maximal condensation of the relation.

In the overview below you find an ordered array of 10 techniques, i.e. options for the mental representation of the relation of participation. In parentheses you find such familiar morpho-syntactic notions as nominal clauses, noun/verb distinction, verb classes, valence, etc. These are preceded by such terms as posited participation, distinction participants/participatum, generally
implied participants, etc. They pertain to the cognitive-conceptual level and are coined to keep the two levels distinct. They are not meant to replace morphosyntactic terminology, but to encompass it.

At first glance the elaboration proceeds as follows (see the overview) : in the first two techniques the mental representation is holistic, instantaneous, or global. Techniques 2 to 5 lead to an increasing differentiation on the participatum. Technique 6 marks a turning point where differentiation gradually shifts from the participatum to the participants. Techniques 7 and 8 produce further differentiations among the participants. Now that both participatum and participants are fully specified, techniques 9 and 10 proceed to further specifying the relation between them. This is brought about by a special relator that acts as operator.

The path just described simulates the stepwise build-up of the relation of participation in our minds. It can also be followed in the reverse, leading to an increasingly condensed representation. We call such an order of techniques a dimension, in our case the dimension of participation.

Overview
1. Posited Participation (holophrastic expressions, nominal clauses)
2. Distinction participants/participatum (noun/verb distinction)
3. Generally implied participants (verb classes)
4. Specifically implied participants (valence)
5. Orientation (voice)
6. Transition (transitivity and intransitivity)
7. Role assignment (case marking)
8. Introduction of new participants (serial verbs)
9. Cause and effect (causatives)
10. Complex propositions (complex sentences)

Now you may say that this is all right. But how does the gradual exfoliation or condensation of the relation of participation work ? What are the operational steps ? How are the techniques delimited from one another ? How do we define the dimension ? Here we must have recourse to one further notion, the notion of parameters in a specifically defined sense.
They are principia comparationis with a plus pole and a minus pole and possible intermediate steps, thus in principle again continua. They represent possible universals of cognition, and, at the same time, possible universals of language. As such, they would have to be defined, a task that must be left to further work. But like the notions of techniques and dimension, the parameter is an operational notion in the first place. This means that the definitions would have to be operational rather than categorial. The parameters are listed separately after the chart and figure again in the vertical of the following chart.
[Chart : the cognitive-conceptual space of Participation. Horizontal : techniques 1 to 10 (1. Positing participation ; 2. Distinction participatum/participants ; 3. Generally implied participants ; 4. Specifically implied participants ; 5. Orientation ; 6. Transition ; 7. Role assignment ; 8. Introducing new participants ; 9. Cause + effect ; 10. Complex propositions). Vertical : parameters 1 to 20. Cells are marked with a vertical stroke or a circle.]
List of parameters
1. metalanguage / object language
2. context-sensitive / context-independent
3. categorial / thetic
4. relational / absolute
5. referential / general
6. time-stable / temporary
7. dynamic / stative
8. active / inactive
9. plurivalent / monovalent
10. centralized / decentralized
11. profile outgoing situation / profile ingoing situation
12. basic / derived
13. individualized / non-individualized
14. affected / non-affected
15. total involvement / non-total involvement
16. overt relator / covert relator
17. volitional / non-volitional
18. control / non-control
19. emotive / non-emotive
20. epistemic / non-epistemic

This chart exhibits a cognitive-conceptual space featuring in the horizontal the techniques in the order from 1 to 10, and in the vertical 20 parameters in an order that follows the gradual build-up of the relation of participation as described above. How did I arrive at these parameters, and how did I arrive at their ordering ? The procedure is not entirely governed by predetermined criteria. Rather, there is an initial understanding of such a Gestalt as the mental representation of the relation of participation. Parameters are chosen so as to guarantee maximum recurrence in other dimensions. Their ordering follows from rational considerations in the following vein.

When perusing the sequence of parameters one can see that the first three are chosen with reference to a global characterization of Participation ; that 4 to 9 give rise to a progressive differentiation on the participatum side ; that 10 to 12 mark a shift from the participatum to the participants ; that 13 to 15 produce further differentiation on the participants' side ; and that 16 to 20 have reference to the relation itself by introducing a RELATOR indicating whether the relation is one of cause, finality, consequence, etc.

For lack of time and space it is not possible here to comment on every single parameter. Most of the names may be more or less self-explanatory.
Nr. 11 "profile outgoing vs. ingoing situation" refers to a directed and bounded process (participatum) extending from its origin to its final stage and comprising an initiator that sets the process going, and an undergoer affected by the process. In the technique "Orientation" that introduces this parameter the operation either profiles the outgoing situation and highlights the initiator, or the ingoing situation, highlighting the undergoer.

The total operation within this dimension proceeds in the following steps :

1) One to three new parameters are introduced in each technique as compared with the technique immediately preceding. The newly introduced parameters are those which distinguish it from the preceding technique.

2) The parameters inherited from a preceding technique are "active" for the following technique inasmuch as they produce new oppositions ; i.e. oppositions not encountered for that particular parameter in preceding techniques. On the chart this is symbolized by a vertical stroke. Otherwise they are "inactive", but they still form part of the total pool of parameters constituting the dimension. This is marked by a circle. To give an example : Parameter 4 (relational/absolute) applies to the opposition between noun and verb in technique nr. 2 : Nouns are typically absolute, non-relational ; verbs are typically relational, non-absolute. In technique nr. 3 the same parameter refers to the opposition between different verb classes, thus a new opposition. It is therefore "active" in this technique. In technique nr. 4 it refers again to the opposition between verb classes, and is therefore "inactive".

3) The active plus the newly introduced parameters define the technique, which thus appears as a bundle of parameters.

4) The dimension is defined by the thus ordered sequence of techniques.

The schema visualizes the Gestalt of a continuum. The idea of decomposing such phenomena as transitivity or roles into a number of parameters is not new.
Compare the work of Hopper and Thompson (1980) on transitivity, of Givón (1981) on passive, or Ch. Lehmann (1991a) on verb classes.

Note that the bundles of parameters define the technique, but not necessarily the linguistic means representing that technique. Thus, verb serialization is a means of representing technique nr. 8 ("introducing new participants"). But verb serialization serves other functions as well, such as TAM and directionality (Bisang 1991 : 510). The four parameters define the technique as represented by serial verbs only insofar as they represent participation, that is in their manifestation as coverbs.

The chart says that techniques to the left and the right margin need fewer constitutive parameters as compared with the more central techniques. Transition shows the greatest number of active definitory parameters with regard
to the total pool. This amounts to saying that Transition is the prototypical technique of the entire dimension.

4. Linguistic coding and typology
The question here is as follows : how are the various positions in the cognitive-conceptual space as outlined above coded in one individual language and across languages ? At first sight it seems that at least in some languages the linguistic coding reflects the cognitive-conceptual orderings in techniques and parameters fairly closely. As we follow the order from left to right we normally find a steady increase in morpho-syntactic "machinery" put to use in the representation of Participation. In our UNITYP work we have described this by two universal operational principles, called indicativity vs. predicativity.

Indicativity means that the cognitive concept is represented as inherently given, as being taken for granted. A simple renvoi suffices. Compare holophrastic utterances such as fire !, night !, or nominal clauses such as nomen omen. If the morpho-syntactic means are sparse, this is compensated by a high degree of pragmatic relevance : context-sensitivity, discourse pragmatics, etc.

Predicativity means that the cognitive concept is represented by progressive explication and differentiation, first on the participatum, then on the participants, and finally on the signs for the relation itself. More and more "machinery" is being introduced, the number of oppositions increases. Complex sentences to the right-hand pole of the continuum represent the end-point.

A third semiotic principle, iconicity, appears to have its preferential peak somewhere in the middle where the two other principles are about equal in force. In fact, transitivity - intransitivity, representing transition, exhibit a fairly balanced distribution of markings on both the participatum and the participants, thereby iconically depicting the shift from the participatum to the participants. This technique also marks a turning point between constructions of government (e.g. valence) and constructions of modification (e.g. adverbials).
Looking at one particular technique, say, "Cause and effect", we find that it aptly represents an invariant with corresponding linguistic variants again continuously ordered according to indicativity and predicativity (German) :

(1) i. Die Grossisten machen, dass die Ölpreise sinken
"The wholesalers bring it about that the oil prices go down"
(complex sentence, maximally predicative)
ii. Die Grossisten lassen die Ölpreise sinken
"... let... go down"
(lassen as auxiliary verb, less predicative)
iii. Die Grossisten senken die Ölpreise
(one verb, umlaut, morphologically marked as causative)
iv. Die Grossisten dumpen die Ölpreise
(foreign word, unanalyzable, so-called "lexical causative", minimally predicative)

Analogous continua between indicativity and predicativity can be observed for the other techniques as well, both intra- and cross-linguistically.

If this is so, we predict that there will be overlaps in the linguistic coding as compared with our cognitive-conceptual space, where such overlaps do not occur. We have seen that the cognitive invariant "cause and effect" includes complex sentences as one of its coded variants. But complex sentences also appear under the technique "complex propositions", and prototypically so. There is thus overlap. In a way, one might say that the linguistic codings produce a multi-dimensional space as compared with the two-dimensionality of our schema.

On the other extreme of our causative continuum we found the so-called "lexical causatives". How can we draw a line between these alleged causatives and transitive action verbs ? In a sequence of codings as arranged in (1), i.e. a continuum, it seems plausible that dumpen marks the end point. The continuum from i. to iv. exhibits a steady increase in grammaticalization. It operates on the parameter RELATOR that appears as bring it about that or to cause to and eventually ends up as an immaterialized, zero operator (Ch. Lehmann 1991b : 25 f.). Parameters can be over-extended or stretched. This is certainly the case when linguists are misled into equating to kill with cause to die, or when they even postulate causativity for such verbs as to sit under the odd paraphrase "to cause one's body to 'rest upon the haunches'" (Wierzbicka 1975 : 520).

Further overlaps are due to the fact that different techniques are not coded in disjunction, but quite often in combination with one another. On the other hand, there are also gaps. During the times of dominant structuralism it was considered unscientific to state that a language like Tagalog does not exhibit a clearly recognizable noun-verb distinction (Himmelmann 1991). Nowadays such assertions can be quite to the point, because, among other things, they may lead to inquiring into how these languages make up for the lack, e.g. by extension of one parameter in the technique "Orientation" ; in this case it would be the parameter "basic vs. derived". This amounts to saying that on the cognitive-conceptual level participatum and participants are distinct ; but on the level of linguistic coding it is not necessary that every language represents the distinction by a distinction between word classes noun and verb. A persistent endeavour to keep the levels distinct may help to settle the endless discussions about this problem.
It may also be helpful in statements of grammaticalization and of typology. To cite but one example : it has become customary to speak of "indirect case marking" when the morphological marking of case roles is on the verb instead of on the noun (Mallinson & Blake 1981 : 42 f.). On the cognitive-conceptual level we have the invariant "role assignment" (technique nr. 7). Morpho-syntactically there is an overlap between markings on the noun (= case marking) and markings on the verb (= cross-reference or agreement). Some languages exhibit one procedure to the exclusion of the other. Some other languages show a split along the parameter "centralized-decentralized" : decentralized participants are marked on the noun, centralized participants are marked on the verb. Typically, the maximally grammaticalized subject nominative is marked by verb agreement even in languages that do not show any other verb agreement (English). And typically, a subject nominative has no case suffix on the noun even in languages that otherwise do show case suffixes (German). This illustrates the pathway that grammaticalization has taken and is likely to take in analogous situations.

5. Concluding remarks
1) We started from the assumption that the human mind has the ability to construct continua for the purpose of ordering the phenomena of this world. We set out to show the usefulness of such a construct. It is useful in leading us to distinguish between a level of invariance, where the continuum appears in its idealized form, and a level of variation, where continuity still obtains but overlaps and gaps are not excluded. On this level it can be shown how semanto-syntactic phenomena that may seem to be disconnected otherwise, are related to one another : how verb classes are related to voice, voice to transitivity, transitivity to case marking, etc., but also distant relationships as between noun-verb distinction and voice, or causative and passive.

2) Taking up a question raised by G. Leech : "How do we demonstrate that gradients exist ?" The continuum appears to be the locus of linguistic energeia : gradual exfoliation vs. condensation ; substitutability between adjacent positions ; grammaticalization paths and paths of language change ; all amenable to empirical verification.

3) A note on the question of how we get to our cognitive-conceptual level : is not linguistic coding the only way of access ? Are we not moving in a circle ? Yes, we are ; but in a non-vicious one. Initial observation of language data unveils the leading dynamic principles : indicativity, iconicity, predicativity, and the tendency toward gradient transitions. Rational considerations construct a logically consistent framework. Data from concrete languages can then be measured against such a framework. This, and only this, is the sense in which we want cognition-conceptuality to be understood. We do not mean to rely on the results of adjacent sciences (psychology, epistemology, etc.) for confirmation.

4) Both conceptual and linguistic continua admit of indefinitely recoverable intermediate values. On the linguistic side this can be shown by pointing out that techniques, steps on the overall continuum, admit of subcontinua which, in turn, are bundles of parameters, which are again continua. The antinomy between a continuum model and a model of "multiple discrete factors intersecting to yield a finely articulated range of possibilities" as evoked by R. Langacker can be resolved by introducing the notion of parameters into the dimensional model.
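
The operational steps of section 3 (new parameters introduced in each technique, inherited parameters either "active" or "inactive", each technique defined as a bundle) lend themselves to a small computational sketch. The Python fragment below is only an illustration under stated assumptions : the class and function names are hypothetical, parameter 4 (relational/absolute) follows the example given in the text for techniques 2 to 4 (with the further assumption that technique 2 is where it is introduced), and all other cell values are invented placeholders, not the values of the chart.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Set


class Status(Enum):
    NEW = "new"            # parameter introduced by this technique (step 1)
    ACTIVE = "active"      # inherited, yields a new opposition (stroke on the chart)
    INACTIVE = "inactive"  # inherited, no new opposition (circle on the chart)


@dataclass
class Technique:
    number: int
    name: str
    # parameter number -> status of that parameter within this technique
    cells: Dict[int, Status] = field(default_factory=dict)

    def bundle(self) -> Set[int]:
        """Step 3: the technique is defined by its new plus active parameters."""
        return {p for p, s in self.cells.items() if s is not Status.INACTIVE}


def validate(dimension: List[Technique]) -> None:
    """Check steps 1 and 2 over an ordered sequence of techniques."""
    seen: Set[int] = set()
    for tech in dimension:
        new = {p for p, s in tech.cells.items() if s is Status.NEW}
        inherited = set(tech.cells) - new
        if not 1 <= len(new) <= 3:
            raise ValueError(f"technique {tech.number}: needs 1 to 3 new parameters")
        if new & seen:
            raise ValueError(f"technique {tech.number}: parameter introduced twice")
        if not inherited <= seen:
            raise ValueError(f"technique {tech.number}: inherits an unseen parameter")
        seen |= new


# Toy excerpt of the dimension (step 4: an ordered sequence of techniques).
# Only the statuses of parameter 4 follow the text; parameters 5 and 9 are
# hypothetical placeholders chosen to make the example run.
DIMENSION = [
    Technique(2, "Distinction participants/participatum", {4: Status.NEW}),
    Technique(3, "Generally implied participants",
              {4: Status.ACTIVE, 5: Status.NEW}),
    Technique(4, "Specifically implied participants",
              {4: Status.INACTIVE, 5: Status.ACTIVE, 9: Status.NEW}),
]

validate(DIMENSION)
for tech in DIMENSION:
    print(tech.number, tech.name, sorted(tech.bundle()))
```

In this sketch parameter 4 belongs to the defining bundle of techniques 2 and 3 but drops out of the bundle of technique 4, where it is inherited without producing a new opposition, while remaining part of the total pool of the dimension.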
REFERENCES
Bisang, Walter. 1991. Verb Serialization, grammaticalization and attractor positions in Chinese, Hmong, Vietnamese, Thai and Khmer. In Seiler & Premper (eds.), pp. 509-562.
Givón, Talmy. 1981. Typology and functional domains. Studies in Language 5.2, pp. 163-193.
Himmelmann, Nikolaus. 1991. The Philippine challenge to universal grammar. Arbeitspapier Nr. 15 (Neue Folge), Universität zu Köln : Institut für Sprachwissenschaft.
Hopper, Paul & Sandra A. Thompson. 1980. Transitivity in grammar and discourse. Language 56, pp. 251-299.
Lehmann, Christian. 1991a. Predicate classes and Participation. In Seiler & Premper (eds.), pp. 183-239.
Lehmann, Christian. 1991b. Relationality and the grammatical operation. In Seiler & Premper (eds.), pp. 13-28.
Mallinson, Graham & Barry Blake. 1981. Language Typology. Cross-linguistic Studies in Syntax, Amsterdam et al. : North Holland Publ. Comp.
Seiler, Hansjakob. 1983. Possession as an Operational Dimension of Language. Language Universals Series, vol. 2, Tübingen : Gunter Narr Verlag.
Seiler, Hansjakob. 1985. Linguistic continua, their properties, and their interpretation. In Seiler & Brettschneider (eds.), pp. 14-24.
Seiler & Brettschneider (eds.). 1985. Language Invariants and Mental Operations. International Interdisciplinary Conference held at Gummersbach/Cologne, Germany, September 18-23, 1983. Language Universals Series, vol. 5, Tübingen : Gunter Narr Verlag.
Seiler, Hansjakob & Waldfried Premper (eds.). 1991. Partizipation. Das sprachliche Erfassen von Sachverhalten. Language Universals Series, vol. 6, Tübingen : Gunter Narr Verlag.
Wierzbicka, Anna. 1975. Why 'kill' does not mean 'cause to die' : The semantics of action sentences. Foundations of Language, 13 : 4, pp. 491-528.
IS THERE CONTINUITY IN SYNTAX ?
PIERRE LE GOFFIC
University of Paris III, France and URA 1234, CNRS, Caen
0. Introduction

My aim in this paper is not to make propositions about continuity. Neither is it to give a mathematical or philosophical definition of it. Continuity, to my mind, can serve as "l'un de ces vocables, fort nombreux, auxquels il n'est demandé que de recouvrir d'un terme expressif une signification fuyante", as Guillaume said of metaphor (1968, p.171) : one of those intuitive and expressive terms which cover somewhat diffuse meanings ; it is possibly in this respect that it may be of some use to us. I wish therefore to put forward a criticism of discreteness, in those areas in which, wrongly or rightly, it is taken for granted. Thus, the question which I shall be examining is, in some ways, that single hour of metaphysics and critique in which one may indulge once a year, as Descartes is believed to have said. Is it really an established fact that there is, in language, a totally and indisputably discrete framework or architecture ?

1. Discreteness : a point of transition between two continuities
To begin, my representation of facts is extremely simple, possibly running the risk of over-simplification. I imagine language activity (in one certain respect, at least) as is represented in the simple diagram below :
In this sort of double funnel, the left part shows the phases that precede any interlocutory language act, i.e., the person who is about to speak forms, somehow or other, a linguistic project, and puts it into shape. In the right-hand part, the receiver, on the basis of indications provided by the first speaker, tries in turn to build up for himself a representation which he can interpret and make pragmatic use of. Well, I would say, loosely speaking, that there is continuity at the start and at the finish.

Where discreteness comes in, necessarily to my mind, is at the point of transition between the two. This point must necessarily be discrete since it is the only objectifiable part of the linguistic process ; it is the visible part of the iceberg. Here, in order to proceed, to become objectified, the message must be translated into formal, usable symbols. This, I think, is a question of common sense, which does not mean that guarantors and authors could not be found for it. There is a filtering process (somehow or other) between all the diffuse elements of experience and intended meaning, and the coding into discrete units. Then, the movement is continued and shifts to the interpretation side. Here one could talk at length about the infinite or at least inexhaustible character of meaning, in this part of the language act.

Thus the point of transition necessarily occurs. I feel that discreteness, if anywhere, is here, i.e., at this necessary, compulsory transition point, which is the condition of communication between two ill-defined worlds which I have no means of describing but to which one can apply, roughly speaking, the term of continuity. One could simply add that continuity, in all cases, is multiform, if only with respect to the time and space dimensions, and all the metaphorical uses made of these.
Language has systems of automatic adaptation to continuity, and this seems to me to be one of the features which make the formal apparatus of enunciation such a remarkable device : it is perfectly adapted to continuity. The objects in the world undergo changes in space and time, and a number of linguistic markers (among the most basic in language) have, among other functions, that of maintaining a kind of permanence in the changing of these objects, thus giving us the comfortable illusion of fixity and stability that suits our pragmatic needs, in a world which is nevertheless in a demonstrable state of permanent change. There is thus a sort of contradiction, which could be (and has been) developed from a philosophical point of view. There is a kind of automatic adaptation of language in some of its essential aspects to this shifting character of reality. I shall not go into details about this, the important point being at the moment that there is this necessarily, inescapably, discrete point of transition.
2. The degree of discreteness in linguistic coding
But let us now consider this point to see if it really is all that discrete. Of course I do not intend to completely refute the observations made above and I shall attempt to remain within reason. Discreteness, here, has two facets : firstly the use of discrete units (phonemes and morphemes), and secondly, the categorization of these units in terms of discrete grammatical categories.

As to the first point, the units must be discrete in order to be interpreted. And in this respect I shall say from the outset that there is no discussion possible : should there be any ambiguity, it will be a genuine alternative ambiguity (cf. Le Goffic 1981 ch. 7 ; Fuchs 1991 pp. 112-113). There is no continuity from one phoneme to the next, from a linguistic point of view. There can be continuity in the physical production of sound, but there is of course no continuity in the phonematic interpretation of these productions.

As to the second point, everybody agrees that a term can be characterized, for example, as a noun or a verb, and further as having such and such a function in a sentence. So far, one reasons in discrete terms, with necessary choices : a term is a noun, so it is not a verb ; it is subject and therefore not something else. Although language can provide intermediate stages (like the infinitive or certain forms of nominalisation, sharing nominal and verbal features), one ultimately reaches a stage of discrete categorization. I would now like to examine these two points more carefully.

3. Are units totally discrete ?
I have just expressed the intention of not going against the necessarily discrete character of phonemes and morphemes. However, there are a few small problems concerning morphemes. Roughly speaking, if the cutting-up into morphemes were possible without any restrictions, we would have a stock of perfectly determinable words. So it would be possible, with a few arrangements, to draw up a typical, fixed list of the lexicon of a language. Now, when one looks at history, including the history one fashions as a speaker, it is clear that a number of problems arise. If things are to be absolutely discrete, this means that each unit is to be confined to its own place, like a cell or an atom, adjacent to the next one, and so on. The units will not touch each other and will remain by definition separate. Where the contradiction arises is when one unit gives rise to two, in the process of dissociation, or conversely when two units become a single one, in the process of fusion. Now this does occur, and the history of language bears witness to this, if not abundantly, at least clearly.
PIERRE LE GOFFIC
The relevant examples are habitually cast aside (although many of them are well established and well known), belonging to that minority of zero point something per cent which do not conform to the statistically normal situation used as a reference. The problem is that there is always a small fringe on the edge of the system which prevents one from encompassing it. This is a delicate problem, to which I am not sure what answer one could suggest from the outset, as it concerns the status which should be given to that zero point something per cent. In ordinary practice the rejection of these examples is a reasonably workable procedure. But this may be precisely the reason why it is ultimately impossible to encompass the system. So let us not cast them aside on principle. The history of various sciences may lend some support to the method of focusing on minority phenomena.
I shall now proceed to a brief comment on a few examples, starting with cases of differentiation of units (leading to homonyms) :
- the French verb voler : "to fly" and "to steal". We know that the flight of the bird is the primary meaning and that the language of venery (le faucon vole la perdrix "the falcon 'flies' the partridge", i.e. "the falcon, in its flight, gets hold of the partridge by force") gave rise to something which is accepted in synchrony as another separate unit ;
- the French noun grève : "strand, beach" and "strike". The Place de Grève in Paris (strand along the Seine) was a gathering place in the XIXth century for workers wishing to stop work ;
- in other cases differentiation led to graphic differences, as between dessin ("drawing") and dessein ("aim"), or between penser ("to think") and panser ("to dress (a wound)", related to pansement : "a bandage").
I still vividly remember my surprise (not to say my incredulity) when I discovered that penser and panser were cognate, whereas panser, as in panser un cheval ("to groom, to rub down a horse"), had nothing to do with the panse ("belly") of the animal. Examples of this type abound, and detailed lists can be found in reference books (see e.g. Buyssens 1965, with numerous examples from English).
Of the process of fusion, on the other hand, there are fewer examples : language systems, supported by some Académie Française or other such social norm, offer more resistance to this type of process. Cases of fusion normally appear as reprehensible in the eyes of the norm, although they are fairly widespread :
- a jour ouvrable ("working day") is a day when one oeuvre (old verb : "works"), but as it follows that it is also a day when the shops ouvrent ("are open"), it tends to be understood and used as meaning "openable day" ;
- the adjective chaotique is found not only with its "genuine" spelling, which goes back to chaos, but also with the spelling cahotique, from the noun cahot ("a jolt", "a bump"), as in un parcours cahotique ("a bumpy ride"), which is a freak in the eyes of the norm ;
- the verb dériver, as used in modern French, has a twofold origin, as Muller (1962) has explained : it is the continuation of the verb dériver ("to turn a river away from its course"), but draws at the same time on an old verb driver, of English origin ("to push" or "to be pushed" ; cf. Eng. to drive). The resulting meaning, a semantic cross between the two, is that of a slanting movement.
Furthermore, it would of course be easy to elaborate upon the subject of fusion between elements, taking into account the effect of associations governed by the subconscious, puns, etc. which channel semantic processes. Strikingly, there is no limit to the plasticity of semantic formations, and there is no insurmountable boundary of contradiction : the fact that ils peignent can be said both of painters (verb peindre : "they paint") and of hair-dressers (verb peigner : "they comb") may well suffice to build up a unique semantic entity : a gesture common to those who use a comb or a paint-brush. In this matter, actual language data show more imagination at work than one can fancy (see Buyssens 1965 ; Le Goffic 1982).
The essential point of all this is that it is not possible to "lock up" the stock of lexical items. In fine, a lexicon is a set of items in unstable equilibrium, in which the units exert upon each other conflicting or converging pressures (Le Goffic 1988) placing the system in jeopardy. This is how the system holds. One can, of course, take a stabilized viewpoint and ignore all this, which obviously gives a certain authority over the lexicon, but at the cost of neglecting phenomena which do exist, and whose neglect may prove harmful.
4. Problems of grammatical categorization
A noun is a noun, a verb is a verb. Here again, this works in 99 % of the cases. This is important, sufficient perhaps, but one wonders what is to be done with the 1 % which does not work. Cases of terms difficult to classify are not exceptional, and one could draw up a catalogue, which would probably be most instructive. What is to be done with debout (originally a prepositional phrase de+bout, but generally considered as an adverb, with uses very close to those of an adjective : être assis ou debout, "to be sitting or standing") or with plein "full", commonly regarded as a preposition in plein les poches (lit. "the pockets [being] full" ; note the word order and the invariability of plein), etc.? This list could easily be lengthened, but I wish to concentrate on a few examples, beginning with that of the word pas.
The story of how the word pas, which originally meant "a step", came to designate the negative, is fascinating. Je ne marche used to mean "I do not walk", and ... pas, "...(not even the minimum distance of) a step". Then pas became an obligatory part of negation, along with ne. There was ensuing competition between the two, and finally in modern French, as we know, pas has more or less ousted and replaced ne. It is the pas which often represents negation in everyday spoken French.
This is, to be sure, an extraordinary trajectory, resulting in a change of category, since everybody agrees that pas is an adverb in this second use. Remarkably, this occurred in an area where one would expect logic to be most readily found. The logical operator of negation is somehow linguistically cut up in French, and shredded into two pieces. This historical and synchronic treatment of negation in French is a kind of slap in the face to this logic which we all need.
Now is it possible to process a system in which changes of such magnitude can take place without resorting to continuity ? Once again, I use a term which would require more clarification, which I am not in a position to provide. It is clear, nevertheless, that this type of question cannot be avoided. Can a system in which such phenomena occur be totally locked into a discrete perspective ? There must be, somewhere, an opening for transition, a possibility of continuity.
To escape this, one might say that this is History, even ancient History, since the facts involved here extended over seven or eight centuries. But is there anything which allows one to say that changes of the same type are not taking place currently, though invisibly from a synchronic point of view ? This assumption would imply that language itself (French) has changed on one essential point since the period when it allowed this type of evolution. What evidence do we have to claim that language has changed, qualitatively, since the Middle Ages, so that it no longer provides openings for this sort of evolution ? Thus, if one wishes to reject the example of pas, one must prove that language itself has undergone a change, a demand which may not be easy to satisfy. The word pas thus still raises problems today.
Should one say about this example that one swallow does not make a summer, I would reply that, in this domain, it does : a single example suffices in fact to raise a problem concerning the quality of language. Were there but this single case, in the whole of French or indeed in all the languages of the world, I think that the problem would be raised all the same. But there are many more. Quantity and measure, for example, are not easy to categorize in surface terms. They are part noun, part adverb. I shall simply say that beaucoup and longtemps are obviously noun terms, but their functions show a mixture of nominal and adverbial features. Thus, in the field of quantification, I might, if I wished to use a somewhat provocative type of speech, say that discrete filtering, here, is not particularly successful. The number of difficulties regarding categorization, as can be seen, cannot be restricted to a narrow list, and I tend to think that this number is directly related
to the number of difficulties in syntax. In other words, I wonder whether every difficulty in syntax does not have something to do with problems of continuity.
5. The example of Fr. que
Last, I would like to say a few words about Fr. que, the well known "complementizer", as the generativists call it : the wide range of its uses and the difficulty of breaking them down into separate units make it a particularly appropriate example for a discussion about continuity. If one attempts to outline the general functioning of the term, a certain number of uses can be retained as basic (cf. Le Goffic 1992) :
a) the interrogative pronoun as in Que fais-tu ? ("What are you doing ?"), Que faire ? ("What is to be done ?") or, used indirectly, in Je ne sais que faire ("I don't know what to do" ; indirect interrogative examples are more difficult to find).
b) "relative without an antecedent", as in Advienne que pourra ("Come what may", i.e. "Come whatever may [happen]"), parallel to the use of qui in Rira bien qui rira le dernier ("He who laughs last laughs longest"). In fact this type of use, confined to proverbs or limited phraseology in the case of qui, is nearly totally obsolete as regards que : the example quoted above is practically unique ; it is nevertheless structurally important. As has often been noticed, this label "relative without an antecedent" is inappropriate : the demonstration of this can be given easily, though indirectly, by considering the use of qui : in Rira bien qui rira le dernier, and more obviously in Embrassez qui voulez ("Kiss whomever you want to"), qui does not belong to the paradigm of the relative (it could not be object), but to that of the interrogative. For that reason, I call the qu- words occurring in these structures integrative pronouns (following Damourette et Pichon).
c) the relative pronoun : le livre que j'ai lu ("the book which I read").
d) the "completive", as in je dis que ça va ("I say that it is all right").
A second train (series of uses) must be added, not with the pronoun que but with a homonym, the (indefinite) adverb of quantity que, which goes back to the Latin quam (and not quid or quod, which are the etyma of the pronoun que) :
a) que exclamative adverb of degree as in Que c'est bon ! ("How good it is !"), Que d'efforts il a dû faire ! ("How many efforts did he have to make !"), which can also be found in indirect uses under certain conditions, as in Vous savez que d'efforts il a dû faire ("You know how many efforts he had to make").
b) que integrative adverb, as in Il ment que c'est une honte ("He lies to which degree it is a shame"). This example (colloquial) may seem marginal, but it
is at the base of a whole series of extremely important and indisputable uses, viz. comparatives and consecutives : Paul est plus grand que Jean, "Paul is taller than John", i.e. : "Paul is superiorly tall, [in comparison with] to which degree John is tall".
We thus obtain six types of examples which I readily consider basic :

Que pronoun :
(1) Interrogative : que faire ? ; je ne sais que faire
(2) Integrative : advienne que pourra
(3) Relative : le livre que j'ai lu
(4) Completive : je dis que ça va

Que adverb :
(1) Exclamative : que c'est bon ! ; vous savez que de mal il a eu !
(2) Integrative : il ment que c'est une honte ! ; Paul est plus grand que Jean

Here, it seems to me, are most propitious grounds for a debate about discreteness and continuity.
What unity is there to all this ? It is certainly to be found in the basic feature of indefiniteness, in other words in the scanning ("parcours", "Verlauf") of all possible values. The whole series of the qu- terms (qui, que, quoi, quel, quand, as well as, though less apparently, où, comme, comment, combien, without forgetting que as an adverb) have in common this indefiniteness, this scanning process, applied to various domains (animate vs inanimate beings, quality, time, space, manner, quantity, degree).
All the uses of que (pronoun or adverb) can be derived from that basic fundamental value. Let us start with the pronoun.
In the interrogative, the quest is open and presented as requiring an answer. When I utter Que fais-tu ?, I am in search of "whatever you are doing", i.e. of the right instance X to validate "you are doing X", and it is urgent that you, the interlocutor, should "clock in".
In the integrative use advienne que pourra, on the other hand, there is no urgency since any value settled for will do, as any value that validates "pouvoir [advenir]" ipso facto validates advienne ("let it come, let it happen"). In this type of use (as in rira bien qui rira le dernier), the indefinite value of que (or qui) is particularly perceptible.
What now is the relation between a relative and an indefinite pronoun ? How can a relative stem from an indefinite ? This is an important and delicate position. I will adopt the following line of explanation : the qu- relative is still (logically, at the start) an indefinite pronoun, whereas the antecedent is the term which fulfills (saturates) it, and somehow provides the appropriate value, the answer to the question raised by the indefinite. So it seems that the relative is the result of a kind of capturing process exerted by the antecedent upon an integrative, the result being the development of an apparently anaphoric relation. Latin examples would bear witness to a state of language in which relative pronouns and their so-called "antecedents" are clearly autonomous, the latter being often not before (ante) but after the former, thus clearly appearing as the saturating term, the response (and not as the antecedent of an anaphoric relation). One can reasonably assume that this represents an early stage of the formation of our relative system, but of course many aspects of the process (e.g. the above-mentioned change of the paradigm) are yet to be elucidated.
As now regards the completive, it appears at first to be completely separate. There is no indefinite value perceptible, no anaphora, the que has no function, it is said to be a different part of speech, etc. But here again, if one looks carefully (and history bears this out), the completive appears to be related to the other uses of que : it is a kind of degenerated relative or integrative. To take an example of this "degeneration" (one of those troublesome atypical examples which are generally discarded by grammarians, who hardly ever venture to categorize it) : C'est une chose étrange que cet aveuglement (lit. "It is a strange thing that this blindness", i.e. "this blindness is a strange thing"), or Qu'est-ce que la métempsychose ? (lit. "What is it that metempsychosis ?", i.e. "What is metempsychosis ?"). What, here, is the que ? One is led to assume an ellipsis of the verb "to be" (Qu'est-ce que la métempsychose [est] ?), a hypothesis which is supported by the fact that the verb être comes out clearly if the sentence is modalized : Qu'est-ce que la métempsychose pourrait être d'autre ? ("What else could metempsychosis be ?").
Ultimately the underlying structure is something like ce que cet aveuglement est, est une chose étrange ("what this blindness is, is a strange thing"), ce que la métempsychose est, est quoi ?, in which modern French requires ce que in place of the underlying integrative (obsolete) *que N est (as in advienne que pourra, quoted above ; parallel to Eng. what N is). One step further, one finds je dis que S, with the typical completive que, nevertheless analyzable as a degraded integrative (je dis (ce) que P [est], lit. "I say what S is"). Besides, a relative que with ellipsis of être will yield a completive in Il a cette particularité qu'il est gaucher ("he has that particularity that he is left-handed") = il a cette particularité que IL-EST-GAUCHER [est], the clause following que being the subject of the elliptic verb "to be", que being the relative, whose antecedent is cette particularité, and whose function is "attribute of S" : "S (= il est gaucher) est cette particularité". In fact the number of que items labelled completives which do have antecedents is very high : I shall only mention the numerous cases in which the completive has to take the form ce que (e.g. Je tiens à ce que vous veniez "I insist that you should come").
Turning now to the adverbial que, exclamative as in Que c'est bon !, it is a homonym of the pronoun que, although related to it. But here again we come across the fact that quantity, as I mentioned above, is something which involves both noun and adverb categorization. In fact que has nominal features in Que de gens se trompent ! ("How many people make mistakes !"). Besides, it is often (optionally or necessarily) replaced by ce que (Ce que c'est bon !, with a relative que), in which case it may be difficult to categorize them as pronominal or adverbial : Vous ne pouvez pas savoir ce qu'il a pu faire comme bêtises ("You cannot tell what mischief he has been up to"), Vous ne pouvez pas savoir ce qu'il est ennuyeux ("You cannot tell how boring he is").
As regards the integrative use of the adverb que, as in Il ment que c'est une honte !, or, in a more important field, in comparative (correlative) clauses like Paul est plus grand que Jean, it still retains its value of "adverb of indefinite degree" : Paul is taller than John, whatever John's tallness may be.
To summarize briefly : the major distinctions usually recognized about que appear as mere salient stages. My position is that it may well be possible to find a gradation, possibly continuous, between them (as well as ambivalence phenomena : in Qui as-tu dit que tu voulais voir ?, "Who did you say you wanted to see ?", the highly controversial nature of que seems to me to be at the intersection between relative and completive functions).
I wish to conclude, once again, not with a proposition about what continuity is and how it should be processed, but rather, a tone below, with a criticism of discreteness. Ultima verba : whenever a problem arises involving possible continuity, it is always possible to try and defend discreteness by taking smaller units, by refining down to smaller terms.
Thus one succeeds in breaking up what seemed to be a continuous, dynamic entity, into a more manageable series of small discrete units. This reminds me of the paradox of Achilles and the tortoise. Perhaps there is a kind of false problem in this opposition between the terms of continuity and discreteness. What I wish to emphasise is that the discrete categories which we work with are too approximate, if not basically inadequate, and I do not think linguistics has anything to gain by ignoring the existence of the zero point something per cent.
REFERENCES
Buyssens, Emile. 1965. Linguistique historique, Paris : P.U.F, et Bruxelles : P.U.
Culioli, Antoine. 1990. Pour une linguistique de l'énonciation, Paris : Ophrys.
Damourette, Jacques and Edouard Pichon. 1911-1940. Essai de Grammaire de la Langue Française (7 vol.), Paris : d'Artrey.
Fuchs, Catherine. 1991. L'hétérogénéité interprétative. In H. Parret (ed.), Le sens et ses hétérogénéités, Paris : éditions du CNRS (coll. Sciences du langage), pp. 107-120.
Guillaume, Gustave. 1969. Langage et Science du Langage, Paris : Nizet, Québec : Presses de l'Université Laval.
Le Goffic, Pierre. 1981. Ambiguïté linguistique et activité de langage, Thèse de Doctorat d'Etat, Université de Paris-VII.
Le Goffic, Pierre. 1982. Ambiguïté et ambivalence en linguistique, DRLAV 27, pp. 83-105.
Le Goffic, Pierre. 1988. Tensions antagonistes sur les systèmes : les rapports entre diachronie et synchronie. In A. Joly (ed.) La linguistique génétique : Histoire et Théories, Lille : Presses Universitaires de Lille, pp. 333-342.
Le Goffic, Pierre. 1992. Que en français : essai de vue d'ensemble. In Subordination (Travaux Linguistiques du Cerlico, 5), Rennes : Presses Universitaires de Rennes 2, pp. 43-71.
Muller, Charles. 1962. Polysémie et homonymie dans l'élaboration du lexique contemporain, Etudes de Linguistique Appliquée 1, pp. 49-54.
THE USE OF COMPUTER CORPORA IN THE TEXTUAL DEMONSTRABILITY OF GRADIENCE IN LINGUISTIC CATEGORIES
GEOFFREY LEECH, BRIAN FRANCIS AND XUNFENG XU
Lancaster University, UK
1. Introduction
This paper explores the empirical analysis of non-discrete categories in semantics, and in linguistics generally. The classical, Aristotelian tradition requires linguistic categories, like other categories, to be identified by a set of properties giving sufficient and necessary criteria for membership. Contrasting categories, so defined, have disjoint membership, and clear-cut boundaries. This view of linguistic categories is convenient for logical processing, but unrealistic.
Various theories or tools of analysis have been proposed to provide an alternative view of linguistic categories in terms of non-discrete modelling. Examples are : Rosch's prototype theory of psychological categories (Rosch 1978) ; Zadeh's (1965) fuzzy set theory ; Bolinger's (1961) concept of gradience ; Quirk's (1965) notion of serial relationship. More recently there is the work of Lakoff (1987), Langacker (1990) and others within cognitive linguistics. These approaches to non-discrete categorization seem to be dealing with the same phenomena in somewhat varying terms. For the present purpose, the focus will be on gradience, the phenomenon of a scale (or gradient) relating two contrasting categories in terms of relative similarity and difference. Gradience means that members of two related categories differ in degree, along a scale running from "the typical x" to "the typical y".
From an informal point of view, the more closely one examines linguistic categories, the more one tends to discover gradience, and the less one tends to believe in the "classical" Aristotelian view. However, one problem is : how do we demonstrate that such gradients exist ? Semantic gradience, assuming it exists, is a mental phenomenon, unamenable to observation. Therefore, to substantiate the claim that a gradient exists between two categories, we have to provide indirect evidence of its existence.
For example, we may carry out elicitation procedures : experiments, tests, or surveys which elicit responses from native speakers of the language, given an appropriate stimulus, such as the visual objects used in Labov's (1973) investigation of the non-discrete semantics of the word cup. An alternative method, and in some ways a better one, is to examine the linguistic productions of native speakers in natural circumstances, when they are unconstrained by experimental conditions. In both cases, we study the observable reality of "parole" (or "performance") as an indirect means of reaching the underlying reality of "langue" (or "competence").
The second of these alternatives is the one pursued in this paper. It means studying the texts people produce, whether these texts consist of spoken or written material. Techniques of such text analysis have greater power than formerly, through the recent development of computer corpora¹ : large collections of machine-readable texts or text samples which can be searched and manipulated by computer. Such corpora are often designed to contain a "balanced" sampling, so as to be broadly representative of the language, or at least of some text types of the language. The present study is based on the analysis of a one-million word corpus of written British English which has been widely used for linguistic research : the Lancaster-Oslo/Bergen (or LOB) Corpus².
A second problem is : how can we give a precise description of such non-discrete phenomena as gradience ? How can one be exact about phenomena which, it seems, are by their very nature vague and indeterminate ? We will argue that precision is possible : that by studying the distribution of such phenomena in a sample of texts, one can arrive at a reasonably precise statistical model of their occurrence. This model can then, if desired, be tested on further data, or can be adjusted experimentally to provide a better fit to the corpus data. It may be objected that such a model is a model of language in use, and does not give access to the underlying cognitive realities.
This is true : the factors showing up in the corpus data may have varied explanations. Nevertheless, it is reasonable to expect that non-discrete cognitive phenomena will be reflected substantially in the way native speakers make use of their language in actual performance. The contrary position — that this is not a reasonable expectation — is one that needs special pleading, in accordance with "Occam's razor". However, those who find this type of argumentation troublesome may simply be willing to accept that the results of corpus analysis
1) Two books which explain recent developments in computer text corpora, with particular reference to English, are Johansson and Stenström (1991) and Aijmer and Altenberg (1991). A view of the current state of the art in corpus analysis is given in Leech and Fligelstone (1992). Of more particular relevance to the present paper is Leech (1992).
2) Details of the LOB Corpus are provided in Johansson et al. (1978).
GRADIENCE
are a contribution to the theory of language performance, and let the matter rest there.
2. An Example of Gradience : the English Genitive and of-Constructions
To illustrate the proposed method of investigating and measuring gradience, we will consider the case of the English genitive construction (as in the president's speech), and compare it with the frequently synonymous English of-construction (as in the speech of the president). For convenience, we distinguish these two constructions by the formulae : [X's Y] and [the Y of X]. The question for which we seek an answer, using corpus analysis, is : what determines the native speaker's choice of one form rather than the other ? In practice, grammarians have identified a number of criterial factors, of which three of the most important are :
(a) The semantic category of X
For example, it is often claimed that if X has human reference, [X's Y] is normally preferred to [the Y of X] : George's car rather than the car of George. However, this is not the whole story, as [X's Y] is sometimes used with non-human reference, e.g. the earth's orbit, and [the Y of X] is sometimes used with human reference, e.g. the assassination of Abraham Lincoln.
(b) The semantic relation between X and Y
For example, the typical semantic relation between X and Y in [X's Y] is supposed to be "possession", as in George's car. On the other hand, even with a human X and a non-human Y, there are some semantic relations which favour the of-construction : the photograph of George will typically be preferred to George's photograph if the meaning is "the photograph representing George".
(c) The kind of text type (style) in which the construction occurs
It may be claimed, for example, that the genitive is more likely to occur in certain types of writing (e.g. popular journalism and broadcasting) than others.
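One way to picture how corpus examples might be coded for factors (a)-(c) is the small sketch below. It is only an illustration : the class name, the field names and the level labels ("human", "possession", "fiction", etc.) are assumptions made for this example, not the coding scheme of the study, and the text types assigned to the four phrases are arbitrary.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedExample:
    """One corpus example, hand-coded for the three factors (hypothetical labels)."""
    construction: str  # "genitive" for [X's Y], "of" for [the Y of X]
    x_category: str    # factor (a): semantic category of X
    relation: str      # factor (b): semantic relation between X and Y
    text_type: str     # factor (c): style of the text (assigned arbitrarily here)


# The four phrases quoted in the text, with illustrative codings.
EXAMPLES = [
    CodedExample("genitive", "human", "possession", "fiction"),   # George's car
    CodedExample("genitive", "inanimate", "other", "learned"),    # the earth's orbit
    CodedExample("of", "human", "other", "journalism"),           # the assassination of Abraham Lincoln
    CodedExample("of", "human", "representation", "fiction"),     # the photograph of George
]


def tabulate(examples, factor):
    """Count how often each construction occurs at each level of one factor."""
    counts = Counter()
    for example in examples:
        counts[(getattr(example, factor), example.construction)] += 1
    return counts


table = tabulate(EXAMPLES, "x_category")
for (level, construction), n in sorted(table.items()):
    print(f"{level:10s} {construction:9s} {n}")
```

A table of this kind, built from many coded examples, is the sort of classified data from which preferences for one construction over the other could then be estimated.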
There are many other factors which might be added to these : there is, for example, a syntactic factor that might be added to the above three :
(d) The ratio of the length of X to the length of Y
For reasons of functional sentence perspective or end-weight, we expect [X's Y] to favour a shorter X and a longer Y, whereas [the Y of X] will tend to favour the opposite. Thus, other things being equal, this would give a higher likelihood to John's second wife than to the second wife of John, but a lower likelihood to Napoleon Bonaparte's wife than to the wife of Napoleon Bonaparte.
Some corpus-based quantitative studies of the genitive have already been carried out (Jahr Sorheim 1980, Altenberg 1982), taking into account the above
factors and many others¹. We will not attempt more than the analysis of the three factors (a)-(c) above. However, the innovation of the present study is that we employ a statistical technique, logistic modelling, which enables a number of factors and sub-factors to be simultaneously built into the model, and to be included or excluded from consideration at will, so that we can derive from the classified corpus data the model (given those factors and sub-factors) that best fits the data.
One complication is that each factor itself can be thought of as representing an ordered categorical scale. For example, factor (a) above is not just a yes-no matter ("X is either human or non-human"), since the likelihood of choosing [X's Y] is apparently conditioned by the degree to which X tends towards human reference. If X refers to an animal ("a lion's strength") or to a human organization ("the government's policy"), this may be intermediate, in likelihood, between a human noun and an inanimate or abstract noun. However, the model itself can decide on the relative levels of likelihood associated with different classes of noun, showing the linear relationships between these sub-factors (which we will call levels) on an interval scale.
3. Technique of Analysis
The basis for the model is the calculation of the ODDS in favour of the genitive, i.e. :

ODDS = Prob [X's Y] / Prob [the Y of X]

for any combination of factors. From the corpus can be derived the optimal values for the factors and their levels, as well as interaction effects between them. So, for example, it is possible to determine which of the factors, and which of the levels within the factors, are the most important in determining the choice either of the genitive or of the of-construction. Further, it is possible to place the factors and levels in an order of importance ; to discover whether any factors or levels are redundant to the analysis ; to use a significance test (in the present case chi-squared) to determine the goodness of fit of a model to the data. The statistical model will, so to speak, "map" the gradience between the two categories.
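The calculation can be sketched as follows. The counts are invented for illustration and are not LOB Corpus figures ; the function names are likewise assumptions. The sketch computes the odds (and log-odds, the scale on which a logistic model is linear) for each text type, and then a Pearson chi-squared statistic against the simplest candidate model, one in which text type has no effect. It does not fit the multi-factor logistic model described in the text.

```python
import math

# Invented counts for illustration only (NOT LOB Corpus figures):
# text type -> (number of [X's Y], number of [the Y of X])
COUNTS = {
    "journalism": (60, 40),
    "learned": (20, 80),
    "fiction": (50, 50),
}


def odds(genitive, of_construction):
    """Odds in favour of the genitive: count of [X's Y] over count of [the Y of X]."""
    return genitive / of_construction


def pearson_chi_squared(counts):
    """Pearson chi-squared of the observed counts against a 'no effect' model,
    i.e. a model in which every text type shares the same genitive probability."""
    total_genitive = sum(g for g, _ in counts.values())
    total = sum(g + o for g, o in counts.values())
    p_genitive = total_genitive / total
    statistic = 0.0
    for g, o in counts.values():
        n = g + o
        expected_g = n * p_genitive
        expected_o = n * (1.0 - p_genitive)
        statistic += (g - expected_g) ** 2 / expected_g
        statistic += (o - expected_o) ** 2 / expected_o
    return statistic


for text_type, (g, o) in COUNTS.items():
    ratio = odds(g, o)
    print(f"{text_type:10s} odds={ratio:.2f} log-odds={math.log(ratio):+.2f}")

print(f"chi-squared (no-effect model) = {pearson_chi_squared(COUNTS):.2f}")
```

A large statistic relative to its degrees of freedom (here two) indicates that the no-effect model fits poorly, which is the kind of evidence that would justify keeping a factor such as text type in the model.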
1) Existing corpus-based studies of the English genitive are Jahr-Sorheim (1980) and Altenberg (1982), Altenberg's study being of 17th century English.
Note, however, that the model is not entirely governed by predetermined objective criteria. In two respects, the human analyst has to intervene to make decisions.

First, he/she decides (preferably after an informal analysis of some part of the corpus) which are the factors and levels which play a principal role in determining the likelihood of choice. According to Altenberg (1982 : 296), there are over 40 factors that might affect the choice of the genitive. But for practical reasons, at present, it is necessary to restrict the analysis to the three factors which we believe to be the most important. In the longer term, we will need to experiment with different factors, adding some and possibly subtracting others, to see how a better fit to the data can be obtained.

Secondly, the analyst has to decide how to classify each example in the corpus by reference to the factors and levels included in the model. In some cases this decision is virtually automatic (e.g. in deciding whether a noun has human reference or not), whereas in others it requires judgement. There is nevertheless a hope that the analysis of examples will be objective enough to be largely capable of exact replication by different analysts¹.

4. Step-by-step Methodology
In the following, we detail the techniques of analysis in the order in which they were undertaken in our experimental study.

i) Selection of appropriate textual data

We decided to use the LOB Corpus (see section 1 above) of written British English as our source of textual data. It was clearly impractical to use the whole of the corpus, so we chose several sections of the corpus of equal size : viz. parts of sections A, B and C for journalistic writing, J for scientific and learned writing, and K for general fiction². In choosing these three stylistically contrasting text types, we had in mind the need to test the influence of style, or text type, on the odds in favour of the genitive.
1) Some aspects of the classification of examples are more problematic than others. We can generally expect consistency between different analysts for Factors (a), (c) and (d). For Factor (b) (semantic relation between X and Y), the criteria for judging membership of classes have to be clarified — by reference, for example, to paraphrase or entailment tests — if inconsistency is to be avoided. Even then, some residual cases may be unclear. Evidently there is fuzziness or gradience even in the classes which are necessary to the definition of gradience itself. 2) The sections of the LOB Corpus chosen for this experimental analysis correspond to those chosen from the Brown Corpus by Meyer (1987) in his research on punctuation.
ii) Classification of examples

For the purposes of this analysis, it was necessary to focus on those occurrences of [X's Y] which could, in principle, be replaced by [the Y of X], and those occurrences of [the Y of X] which could, in principle, be replaced by [X's Y]. The use of "in principle" here does not mean that each example had to be transformable, in its context, into an example of the opposite category. Rather, it means that we limited the analysis to transformable classes : i.e. classes of the genitive or the of-construction for which members of the opposite category also occur. For example, the possessive genitive is a "transformable class" because alongside possessive examples like my best friend's house there are also examples such as the house of my best friend. The fact that some examples of the possessive (e.g. Richard's new car), especially in their context, would be difficult or impossible to transform acceptably did not enter into the analysis.

On this basis we excluded classes (such as the use of of with quantifiers : some of, many of, etc.) which were non-transformable, and arrived at the following classification of the examples according to the three factors and their constituent levels. (In the tables, we add the frequency of each category in the sample for [X's Y] and [the Y of X].)

Factor (a) Semantic Category of X

Code   Level                               [X's Y]   [Y of X]
H      Human nouns - including names           223        186
A      Animal nouns (excluding human)            3         26
O      Organization nouns (collective)          24         76
P      Place nouns - including names            28         61
T      Time nouns                               13         41
C      Concrete inanimate nouns                  0        467
B      Abstract nouns (excluding time)           1        399
       Total                                   292       1256
Factor (b) Semantic Relation between X and Y

Code    Level          [X's Y]   [Y of X]
POSS.   Possessive         115        391
SUBJ.   Subjective          82        160
OBJE.   Objective            0        262
ORIG.   Origin              55         31
MEAS.   Measure             13         41
ATTR.   Attributive          7         89
PART.   Partitive           20        282
        Total              292       1256
Factor (c) Text Type (or Style)

Level                [X's Y]   [Y of X]
Journalistic style       177        456
Learned style             41        638
Fictional style           74        162
Total                    292       1256
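The marginal counts for Factor (c) can be turned directly into proportions. The sketch below (ours, in Python ; the variable and function names are assumptions made for the illustration) computes, for each text type, its share of all genitives in the sample and the proportion of genitives among all constructions of that text type.

```python
# (genitive count, of-construction count) per text type, copied from the
# Factor (c) table above.
STYLE_COUNTS = {
    "Journalistic": (177, 456),
    "Learned": (41, 638),
    "Fictional": (74, 162),
}

TOTAL_GENITIVES = sum(genitives for genitives, _ in STYLE_COUNTS.values())


def share_of_genitives(style):
    """Percentage of all genitives in the sample that occur in this text type."""
    genitives, _ = STYLE_COUNTS[style]
    return 100.0 * genitives / TOTAL_GENITIVES


def genitive_proportion(style):
    """Proportion of genitives among all constructions of this text type."""
    genitives, of_constructions = STYLE_COUNTS[style]
    return genitives / (genitives + of_constructions)


for style in STYLE_COUNTS:
    print(f"{style:<13} share of genitives = {share_of_genitives(style):5.2f} %   "
          f"proportion of genitives = {genitive_proportion(style):.3f}")
```

The two measures answer different questions : the first depends on how much text of each type was sampled, whereas the second is the quantity that the logistic model sets out to explain.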
It is useful to add a few words here on Factors (b) and (c).

Factor (b) : The classification of levels under Factor (b) needs some explanation.

Possessive is understood broadly to include all relations which entail the paraphrase "X has (a) Y", apart from those in the attributive and partitive categories. E.g. :
    the banker's son
    the orchestra's music stand
    the mind of the reader

Subjective applies where there is a sentence analogue in which X is the subject of a verb corresponding to the head of Y. E.g. :
    Nazi Germany's request
    Lawrence's imagination
    the formation of the heavy elements

Objective applies where there is a sentence analogue in which X is the object of a verb corresponding to the head of Y. E.g. :
    the inverse distribution of plants and animals
    [Coleridge's] definition of the secondary imagination
    the amusement of passers-by

Origin applies where the construction entails "X is the originator (author, causer, maker, etc.) of Y". E.g. :
    Chaucer's poem
    the music of these three composers
    the opinion of the council

Measure applies where X provides a measurement (in terms of time, distance, etc.) of Y. E.g. :
    last year's average wage
    yesterday's encounter
    the deplorable Homicide Act of 1957

Attributive applies where Y describes an attribute of X : Y has an abstract noun as its head, and the sentence analogues are both "X has (a) Y" and "X is Y'", where Y' is the corresponding adjective. E.g. :
    Mark's blindness
    the immensity of her work
    the extraordinary instability of the Anglo-Saxon public

Partitive applies where there is a sentence analogue "X has (a) Y", and where, in addition, the semantic relation of Y to X is that of part to whole. E.g. :
    Michael's face
    the Football League's Division Two
    the foyer of the Leopold Hotel

Factor (c) : It is already worth noting from the figures under Factor (c) that there is marked inequality between the frequencies of the genitive in the different styles. The Journalistic text type accounts for 60.62 per cent of the total number of genitives, as compared with 25.34 per cent in the Fictional texts, and 14.04 per cent in the Learned texts.

At this point it is also worth noting the important role of the computer in making the analysis practicable. By means of a concordance programme, it was possible to locate all examples of the genitive and the of-construction, and to inspect their context, with little trouble. Also, the concordance programme allowed us to encode each factor and level in an abbreviated way, and to sort the examples on the basis of these codes. In this way, it was relatively easy to verify the classifications, to check them for consistency, and to count the number of occurrences in each class.
In principle, we could have undertaken the analysis
without the availability of the machine-readable corpus ; in practice, this task would have been Herculean, and would have been prone to error and inconsistency.

iii) Tabulating the results of the analysis

As a result of this analysis, we arrived, in effect, at a 3-dimensional matrix, with each cell containing a numerical value (viz. a frequency count). For the purpose of input to GLIM, the result could be prepared as a set of three 2-dimensional tables, one for each pair of factors, as shown in Tables 1-3. In each cell, the first figure is the number of genitives and the second the total number of constructions (genitives plus of-constructions) in that cell.

Table 1 : Observed Proportion of the Genitive in Journalistic Style

         H         A     O       P       T      C      B       TOTAL
POSS.    46/72     0/0   8/33    16/43   0/0    0/28   1/57    71/233
SUBJ.    36/50     0/0   4/13    8/8     0/0    0/7    0/28    48/106
OBJE.    0/13      0/2   0/6     0/2     0/0    0/25   0/54    0/102
ORIG.    36/48     0/0   4/6     0/0     0/0    0/0    0/0     40/54
MEAS.    0/0       0/0   0/0     0/0     7/19   0/0    0/0     7/19
ATTR.    3/7       0/0   3/7     0/3     0/0    0/12   0/21    6/50
PART.    2/9       0/0   3/13    0/3     0/0    0/24   0/20    5/69
TOTAL    123/199   0/2   22/78   24/59   7/19   0/96   1/180   177/633
Table 2 : Observed Proportion of the Genitive in Learned Style

         H        A      O      P      T      C       B       TOTAL
POSS.    8/43     0/7    0/9    1/8    0/0    0/65    0/59    9/191
SUBJ.    13/28    0/13   1/3    2/3    0/0    0/29    0/23    16/99
OBJE.    0/11     0/3    0/1    0/1    0/0    0/75    0/53    0/144
ORIG.    13/23    0/0    0/4    0/0    0/0    0/0     0/0     13/27
MEAS.    0/0      0/0    0/0    0/0    3/29   0/0     0/0     3/29
ATTR.    0/1      0/0    0/0    0/0    0/0    0/12    0/20    0/33
PART.    0/6      0/0    0/2    0/9    0/0    0/112   0/27    0/156
TOTAL    34/112   0/23   1/19   3/21   3/29   0/293   0/182   41/679
Table 3 : Observed Proportion of the Genitive in Fictional Style

         H       A     O     P     T     C      B      TOTAL
POSS.    31/50   3/3   1/2   0/3   0/0   0/9    0/15   35/82
SUBJ.    17/20   0/1   0/1   1/2   0/0   0/8    0/5    18/37
OBJE.    0/5     0/0   0/0   0/1   0/0   0/4    0/6    0/16
ORIG.    2/2     0/0   0/0   0/0   0/0   0/0    0/3    2/5
MEAS.    0/0     0/0   0/0   0/0   3/6   0/0    0/0    3/6
ATTR.    1/2     0/0   0/0   0/1   0/0   0/4    0/6    1/13
PART.    15/19   0/0   0/0   0/2   0/0   0/53   0/3    15/77
TOTAL    66/98   3/4   1/3   1/9   3/6   0/78   0/38   74/236
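The 3-dimensional matrix described under step iii) can be sketched as a mapping from (style, relation, category) to a pair of counts. The Python below is our own illustration, not the format actually submitted to GLIM ; it encodes Table 3 (Fictional style), reading each cell as (genitives, total constructions), and recomputes the marginal totals so that transcription slips can be detected. The names `to_cells` and `margin` are assumptions introduced for the example.

```python
CATEGORIES = ["H", "A", "O", "P", "T", "C", "B"]

# Table 3 (Fictional style). Rows are semantic relations; the entries in each
# row follow the order of CATEGORIES. Each entry is (genitives, total).
FICTIONAL = {
    "POSS": [(31, 50), (3, 3), (1, 2), (0, 3), (0, 0), (0, 9), (0, 15)],
    "SUBJ": [(17, 20), (0, 1), (0, 1), (1, 2), (0, 0), (0, 8), (0, 5)],
    "OBJE": [(0, 5), (0, 0), (0, 0), (0, 1), (0, 0), (0, 4), (0, 6)],
    "ORIG": [(2, 2), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 3)],
    "MEAS": [(0, 0), (0, 0), (0, 0), (0, 0), (3, 6), (0, 0), (0, 0)],
    "ATTR": [(1, 2), (0, 0), (0, 0), (0, 1), (0, 0), (0, 4), (0, 6)],
    "PART": [(15, 19), (0, 0), (0, 0), (0, 2), (0, 0), (0, 53), (0, 3)],
}


def to_cells(style, table):
    """Flatten one table into {(style, relation, category): (R, N)}."""
    cells = {}
    for relation, row in table.items():
        for category, (genitives, total) in zip(CATEGORIES, row):
            cells[(style, relation, category)] = (genitives, total)
    return cells


def margin(cells, axis):
    """Sum (R, N) over cells sharing a value on one axis
    (0 = style, 1 = relation, 2 = category)."""
    totals = {}
    for key, (genitives, total) in cells.items():
        r, n = totals.get(key[axis], (0, 0))
        totals[key[axis]] = (r + genitives, n + total)
    return totals


cells = to_cells("Fictional", FICTIONAL)
by_relation = margin(cells, 1)   # row totals of Table 3
by_category = margin(cells, 2)   # column totals of Table 3
by_style = margin(cells, 0)      # grand total of Table 3

print(by_relation["POSS"], by_category["H"], by_style["Fictional"])
```

Tables 1 and 2 can be added to the same mapping with further calls to `to_cells`, giving the full three-way table from which each cell's R and N are read in section 5.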
iv) Using GLIM for statistical modelling

The statistical package GLIM (Francis, Green and Payne 1992) was used to fit a selection of statistical models to the observed data. The techniques used are described below in section 5.

v) Drawing conclusions from the analysis

This, like the previous step, is crucial in the methodology, and is explained in section 6 below.

5. Statistical Modelling
The idea of statistical modelling is to present a simplified representation of the underlying process by separating out systematic features of the data from random variation. To do this, a probability model is chosen to represent the random variation ; the systematic features of the data are represented by a (hopefully small) number of parameters. So, the statistical task is to find a statistical model which is parsimonious (has a small number of parameters) but which fits the data well according to the underlying probability distribution (GOODNESS-OF-FIT). Thus Occam's razor plays an important part in statistical modelling.

As an example, one of the classical statistical models is simple linear regression, where the probability distribution is the Normal distribution, and the mean of the Normal distribution is modelled by a regression equation : a linear combination of variables with unknown regression parameters.

More recently, techniques have been developed for modelling proportions, or ratios of counts. These techniques use the Binomial distribution as the probability distribution, and model the probability of a particular outcome as a function of a set of explanatory factors. A common model relates the probability of an outcome to a linear combination of explanatory factors through the logit function log(p/(1-p)), or log-odds ratio (Collett 1991). We therefore try to construct a linear model predicting the log-odds ratio for any combination of factors.

We then try to build a model for the probability of the English genitive construction. The simplest model is that the probability is constant and is not dependent on the semantic category, the semantic relation or the style. The variation in the proportion of English genitive construction for each combination of factors would then be just Binomial variation. This model can be tested, a goodness of fit measure calculated and the model rejected or accepted, and a new model tested, and so on.

More generally, we model the probability of giving the English genitive as a function of a set of individual characteristics. Formally, we assume that the probability of giving a particular response for a particular cell i is p_i. For each cell of the table, we observe the number of genitive constructs R_i and the total number of constructs N_i. The logistic model assumes that :

R_i has a Binomial distribution with probability p_i and N_i trials

\[ R_i \sim \mathrm{Binomial}(p_i, N_i) \]

p_i is related through a function f(.) to a linear function of the explanatory factors

\[ f(p_i) = \sum_j b_j x_{ij} \]

where the x_{ij} are the values of the explanatory variables for cell i and the b_j are unknown parameters to be estimated. f(p_i) is taken to be the logit function

\[ f(p_i) = \log\left(\frac{p_i}{1 - p_i}\right) \]

Following the usual GLIM convention, each of the potential explanatory factors was converted by GLIM into a set of dummy variables, one dummy variable for each level of the factor. A backward elimination procedure was adopted, with the all two-way interaction model being fitted at the initial stage. At each subsequent stage, the least important variable or interaction was selected and removed from the model using the GLIM scaled deviance (likelihood ratio statistic) as a criterion for comparing nested models : a difference in scaled deviance between two models (the smaller of which has s parameters fewer than the other) will have a chi-squared distribution on s degrees of freedom if the term removed is unimportant (i.e. its parameters can be set to zero). The procedure was terminated when all remaining variables were significant. For further details of logistic regression, GLIM and statistical modelling see Aitkin et al. (1989), Agresti (1990) or the new GLIM4 manual (Francis, Green and Payne 1992).
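To make the notion of scaled deviance concrete, the following sketch (ours, in Python rather than GLIM) computes the binomial deviance of the simplest model mentioned above, in which the probability of the genitive is constant. For brevity it is applied to the three text-type totals (177/633, 41/679, 74/236) rather than to the full three-way table, so the figure it produces is an illustration only and is not one of the deviances reported in this paper ; the function name is an assumption.

```python
import math


def binomial_deviance(cells, fitted_probabilities):
    """Deviance 2 * sum[R log(R/mu) + (N-R) log((N-R)/(N-mu))], with mu = N * p.

    `cells` is a list of (R, N) pairs; terms with a zero count contribute 0.
    """
    total = 0.0
    for (r, n), p in zip(cells, fitted_probabilities):
        mu = n * p
        if r > 0:
            total += r * math.log(r / mu)
        if n - r > 0:
            total += (n - r) * math.log((n - r) / (n - mu))
    return 2.0 * total


# (genitives, total constructions) for Journalistic, Learned, Fictional.
cells = [(177, 633), (41, 679), (74, 236)]

# Constant-probability model: one pooled estimate shared by all cells.
pooled = sum(r for r, _ in cells) / sum(n for _, n in cells)
null_deviance = binomial_deviance(cells, [pooled] * len(cells))

# Saturated model: each cell fitted by its own observed proportion.
saturated_deviance = binomial_deviance(cells, [r / n for r, n in cells])

print(f"pooled probability = {pooled:.4f}")
print(f"deviance of constant model = {null_deviance:.2f} on {len(cells) - 1} df")
print(f"deviance of saturated model = {saturated_deviance:.2f}")
```

A deviance that is large relative to its degrees of freedom, as here, indicates that the constant-probability model should be rejected and further terms added.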
The results of the analysis are shown below. Table 4a shows the analysis of deviance table. We can see that starting from the all-two-way interaction model, first the CATEGORY by STYLE interaction is removed, then the RELATION by STYLE interaction, and finally the RELATION by CATEGORY interaction. We are left with a final model involving the main effects of CATEGORY, RELATION and STYLE. Table 4b shows the effect of removing each of these terms from the model. All terms are seen to be important, and the model cannot be simplified further.

Does the model fit well ? The scaled deviance from this model is 63.01 on 71 degrees of freedom, which provides the goodness of fit test for the final model. This value is compared with a chi-squared distribution on 71 df. Its 95 % percentage point is 91.67, and as 63.01 is substantially below this figure, the model fits well.

We can assess the importance of each of the terms in the final model by further examination of Table 4b. All factors are highly significant, and all are very important in predicting the proportion of genitive constructs. However, when CATEGORY is excluded from the model, the scaled deviance changes by a massive 361.0 on 5 degrees of freedom (72 for each degree of freedom). This makes CATEGORY the most significant term. The next most significant term is STYLE, and finally RELATION.

Table 4a : Analysis of deviance table

MODEL                                        Deviance   df   Difference in scaled   Difference   p-value
                                                             deviance from          in df
                                                             previous model
CATEGORY+RELATION+STYLE+CATEGORY.STYLE
  +RELATION.STYLE+RELATION.CATEGORY              7.94   29
CATEGORY+RELATION+STYLE+RELATION.STYLE
  +RELATION.CATEGORY                            20.26   39          12.32               10        0.2642
CATEGORY+RELATION+STYLE
  +RELATION.CATEGORY                            35.14   51          14.88               12        0.2481
CATEGORY+RELATION+STYLE (final model)           63.01   71          27.87               20        0.1125
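The p-values in Table 4a can be checked from the deviances and degrees of freedom alone. The sketch below (ours, in Python ; GLIM was used for the actual analysis) computes the upper-tail probability of the chi-squared distribution using the closed form that holds when the degrees of freedom are even, which is the case for all three comparisons in the table. The function name is an assumption, and the function deliberately refuses odd degrees of freedom, for which a general-purpose statistical library would be needed.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(chi-squared on df degrees of freedom > x); valid for even df only.

    For df = 2m the tail is exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# (deviance, df) of the four nested models in Table 4a, largest model first.
models = [(7.94, 29), (20.26, 39), (35.14, 51), (63.01, 71)]
removed_terms = ["CATEGORY.STYLE", "RELATION.STYLE", "RELATION.CATEGORY"]

p_values = {}
for term, (dev_big, df_big), (dev_small, df_small) in zip(
        removed_terms, models, models[1:]):
    change_in_deviance = dev_small - dev_big
    change_in_df = df_small - df_big
    p_values[term] = chi2_upper_tail_even_df(change_in_deviance, change_in_df)
    print(f"{term:<18} change = {change_in_deviance:5.2f} on {change_in_df:2d} df, "
          f"p = {p_values[term]:.4f}")
```

Each p-value is well above 0.05, which is why each interaction term in turn could be dropped from the model.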
Table 4b : Effect of deleting terms from the final model

Term deleted   change in deviance   change in df   p-value
STYLE                 70.28               2        p<0.0001
RELATION              88.33               5        p<0.0001
CATEGORY             361.0                5        p<0.0001
The final model can be presented in the following way. If we now define p_ijk to be the probability of obtaining the genitive construct for CATEGORY i, RELATION j and STYLE k, then the model can be written as :

\[ \log\left(\frac{p_{ijk}}{1 - p_{ijk}}\right) = K + \mathrm{CATEGORY}(i) + \mathrm{RELATION}(j) + \mathrm{STYLE}(k) \]

The estimates of these effects are given below in Table 5.

Table 5 : Estimates of parameters in the final model

Parameter name   Description     Estimate   Standard error
K                constant           0.33        0.18
CATEGORY(1)      Human              0.0          -
CATEGORY(2)      Animal            -1.73        0.67
CATEGORY(3)      Organisation      -1.38        0.28
CATEGORY(4)      Place             -0.87        0.28
CATEGORY(5)      Time              -0.85        0.37
CATEGORY(6)      Concrete         -13.38       28.92
CATEGORY(7)      Abstract          -5.80        1.01
RELATION(1)      Possessive         0.0          -
RELATION(2)      Subjective         0.86        0.24
RELATION(3)      Objective        -10.85       22.55
RELATION(4)      Origin             1.11        0.30
RELATION(5)      Measure            0.0          -
RELATION(6)      Attributive       -0.48        0.50
RELATION(7)      Partitive         -0.53        0.33
STYLE(1)         Journalistic       0.0          -
STYLE(2)         Learned           -1.60        0.24
STYLE(3)         Fictional          0.44        0.25
These estimates can be used in a number of ways. First, they can be used to construct fitted probabilities for any cell in the three-way table. For example, the fitted probability for an organization noun in the origin relation in journalistic style can be obtained by calculating the fitted log-odds for the required cell (0.33 - 1.38 + 1.11 + 0.0 = 0.06), then calculating the odds (exp(0.06) = 1.06), then calculating the fitted probability (1.06/(1 + 1.06) = 0.515). Table 6 contains the fitted probabilities for every combination of STYLE, RELATION and CATEGORY.

Table 6 : Fitted probabilities for combinations of the 3 factors
(STYLE : 1 = Journalistic, 2 = Learned, 3 = Fictional ; columns are the levels of CATEGORY)

STYLE   RELATION   H       A       O       P       T       C       B
1       POSS       0.582   0.197   0.258   0.369   -       0.000   0.004
1       SUBJ       0.767   0.367   0.452   0.580   -       0.000   0.010
1       OBJE       0.000   0.000   0.000   0.000   -       0.000   0.000
1       ORIG       0.809   0.428   0.514   0.640   -       0.000   0.013
1       MEAS       -       -       -       -       0.371   -       -
1       ATTR       0.462   0.132   0.177   0.265   -       0.000   0.003
1       PART       0.451   0.127   0.171   0.257   -       0.000   0.002
2       POSS       0.219   0.047   0.066   0.106   -       0.000   0.001
2       SUBJ       0.399   0.105   0.143   0.218   -       0.000   0.002
2       OBJE       0.000   0.000   0.000   0.000   -       0.000   0.000
2       ORIG       0.461   0.131   0.176   0.264   -       0.000   0.003
2       MEAS       -       -       -       -       0.106   -       -
2       ATTR       0.148   0.030   0.042   0.068   -       0.000   0.001
2       PART       0.142   0.028   0.040   0.065   -       0.000   0.000
3       POSS       0.684   0.276   0.351   0.476   -       0.000   0.006
3       SUBJ       0.836   0.474   0.561   0.682   -       0.000   0.015
3       OBJE       0.000   0.000   0.000   0.000   -       0.000   0.000
3       ORIG       0.868   0.537   0.622   0.734   -       0.000   0.019
3       MEAS       -       -       -       -       0.478   -       -
3       ATTR       0.571   0.191   0.250   0.359   -       0.000   0.004
3       PART       0.561   0.184   0.242   0.349   -       0.000   0.004
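The calculation of a fitted probability from the parameter estimates can be sketched in a few lines of Python (our illustration ; the dictionary and function names are assumptions). The estimates are the rounded values printed in Table 5, so results may differ from Table 6 in the third decimal place.

```python
import math

# Rounded parameter estimates copied from Table 5.
CONSTANT = 0.33
CATEGORY = {"H": 0.0, "A": -1.73, "O": -1.38, "P": -0.87,
            "T": -0.85, "C": -13.38, "B": -5.80}
RELATION = {"POSS": 0.0, "SUBJ": 0.86, "OBJE": -10.85, "ORIG": 1.11,
            "MEAS": 0.0, "ATTR": -0.48, "PART": -0.53}
STYLE = {"Journalistic": 0.0, "Learned": -1.60, "Fictional": 0.44}


def fitted_log_odds(category, relation, style):
    """Linear predictor: constant plus one effect per factor."""
    return CONSTANT + CATEGORY[category] + RELATION[relation] + STYLE[style]


def fitted_probability(category, relation, style):
    """Inverse logit of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-fitted_log_odds(category, relation, style)))


# Worked example from the text: organization noun, origin relation,
# journalistic style.
example = fitted_probability("O", "ORIG", "Journalistic")
print(f"log-odds = {fitted_log_odds('O', 'ORIG', 'Journalistic'):.2f}, "
      f"probability = {example:.3f}")
```

Note that Time and Measure always co-occur in the data, so cells combining Time with any other relation (or Measure with any other category) are left blank in Table 6 even though the function above will return a number for them.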
Secondly, the estimates can be interpreted directly as log-odds ratios. As an example, the parameter for ATTRIBUTIVE is -0.48, so the log-odds ratio for ATTRIBUTIVE compared to POSSESSIVE is -0.48, and the odds-ratio is 0.61, which is close to 0.66 : that is, two chances for ATTRIBUTIVE to every three chances for POSSESSIVE, assuming the levels of STYLE and CATEGORY are held constant in the comparison. The estimates can therefore be used to order the levels of each factor in increasing likelihood of using the genitive, after controlling for the effects of the other factors.

6. Conclusions Drawn from the Analysis
In a sense, no conclusions need be drawn, since the statistical model is itself the final result of the analysis : a three-dimensional model of the gradient relating the genitive to the of-construction in English. It is useful, however, to pick out from this result some of the important features of the gradient, as revealed by the model. These are conclusions which could not have been arrived at by non-empirical means : the use of corpus data was a necessary means to this end.

Conclusion I : The model fits the data well, on the basis of the goodness of fit test.

Conclusion II : All three factors (a)-(c) are important factors in determining the choice between the genitive and the of-construction.

Conclusion III : The order of significance of the factors is :
    Factor (a) (Semantic class of X) - most significant
    Factor (c) (Style, or Text Type)
    Factor (b) (Relation of X to Y) - least significant

Conclusion IV : Within Factor (a), the ordering of levels in terms of the fitted probabilities of choosing a genitive in preference to an of-construction is as follows :
    1. X is human (H)
    2. X is a place (P)
    3. X is a human organization (O)
    4. X is animal (but not human) (A)
    5. X is abstract (apart from time) (B)
    6. X is concrete and inanimate (apart from place) (C)
The existence of a gradient between the two constructions is particularly clear from the characteristics of this factor. Level 1 (human X) makes a very strong contribution to the choice of the genitive, and thus confirms the stereotypic explanation of the genitive given in many grammar books. On the other hand, the fact that levels 2, 3 and 4 identify classes associated with quasi-human characteristics¹ is evidence for the genitive's being, in this respect, a "fuzzy-edged" category. At the bottom of the gradient, levels 5 and 6 (especially 6) make a negligible contribution to the choice of the genitive or, to put it differently, overwhelmingly favour the choice of the of-construction. Oversimplifying, we might see level 1 as the "hard core" of the genitive category, level 6 as the "hard core" of the of-construction category, and levels 2-5 as intermediate. (The level "time", where X is a temporal expression, is not included in this ordering, as it cannot be separated from the level "measure" of Factor (b). In fact, both "time" and "measure" are anomalous in this model, and appear to suggest the existence of an independent class of genitives. To investigate this, however, we need a larger sample.)

Conclusion V : Within Factor (b), the ordering of levels in terms of the fitted probabilities of the selection of the genitive in preference to the of-construction is as follows :
    1. Origin (ORIG)
    2. Subjective (SUBJ)
    3. Possessive (POSS)
    4. Attributive (ATTR)
    5. Partitive (PART)
    6. Objective (OBJE)
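The ordering in Conclusion V can be recovered mechanically by sorting the levels of a factor on their parameter estimates. The sketch below (ours, in Python ; the names are assumptions) does this for Factor (b), using the rounded RELATION estimates printed in Table 5 and leaving out Measure, which cannot be separated from Time. It also converts each estimate into an odds-ratio relative to the reference level, Possessive.

```python
import math

# Rounded RELATION estimates from Table 5; Possessive is the reference level.
# Measure is omitted because it cannot be separated from the category Time.
RELATION_ESTIMATES = {
    "POSS": 0.0,
    "SUBJ": 0.86,
    "OBJE": -10.85,
    "ORIG": 1.11,
    "ATTR": -0.48,
    "PART": -0.53,
}


def rank_levels(estimates):
    """Levels ordered from most to least favourable to the genitive."""
    return sorted(estimates, key=estimates.get, reverse=True)


def odds_ratio(estimate):
    """Odds-ratio of a level relative to the reference level (estimate 0)."""
    return math.exp(estimate)


ranking = rank_levels(RELATION_ESTIMATES)
for position, level in enumerate(ranking, start=1):
    print(f"{position}. {level}  estimate = {RELATION_ESTIMATES[level]:7.2f}  "
          f"odds-ratio vs POSS = {odds_ratio(RELATION_ESTIMATES[level]):.3g}")
```

The same two functions applied to the CATEGORY or STYLE estimates give the orderings stated for Factors (a) and (c).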
The position of the level of "Origin", rather than that of "Possessive", at the top of the scale is something of a surprise, but it is worth remembering that the stereotypic genitive category "possessive" is defined in a broad way here, and is by no means restricted to possession in the sense of ownership. Also, with Origin at the top of the scale, it is possible to interpret the above ordering in terms of a single notional gradient of "power/influence/dominance", clearly associated with the corresponding gradient of "humanness" we find in Factor (a).
1) Not only animal and organization nouns, but also geographical nouns often have quasi-human implications, as illustrated by the following examples. E.g. : Huddersfield's long history ; the Earth's development
That is, the presence of "Origin" at the top of the scale may have a connection with the observation that the originator of something (its creator, begetter, author, etc.) has the primary influence over it. The subject of a verb (associated with level 2) is also typically the agent : the participant in an action who wields power, in contrast to the powerlessness of the object (placed at the bottom of the scale). It is striking, in this context, that the verb-object relation is overwhelmingly (though not inevitably) associated with preference for an of-construction, just as the subject-verb relation is overwhelmingly (though not inevitably) associated with preference for the genitive.

Conclusion VI : Within Factor (c) STYLE, the ordering of levels with reference to the fitted probabilities of preferring a genitive to an of-construction is as follows :
    1. Fictional texts
    2. Journalistic texts
    3. Learned texts
As Table 6 shows, the remoteness of the learned text-type from the other two is very noticeable : further research over a wider range of text types might reveal a more general trend for the genitive to be favoured in more informal styles (influenced by the spoken language), as contrasted with more formal styles of writing (cf. Altenberg 1982 : 299).

Conclusion VII : From the data, the fitted probability of the genitive for any construct referring to a concrete object, or where the semantic relation between X and Y is objective, is very close to zero. We have observed no occurrences of a genitive in either of these two groups. It may be that occurrences of the genitive in these two groups are rare, or possibly we have stumbled on a discrete linguistic rule. Only analysis of further data will reveal the truth.

7. General Conclusion
The technique of logistic modelling, as a way of extracting a gradience model from corpus data, has been applied in this paper only in a limited, experimental way. However, the results are encouraging, and strengthen our intention to pursue this line of inquiry further.

One of the strengths of this technique is that it yields a precise way of measuring how far a model approximates to a theoretically optimal way of accounting for the data. Hence, if we take the present study as a starting point, we can experiment fruitfully with the addition of new factors (e.g. we are already working on Factor (d) mentioned in section 2 above), or with the refinement of the existing Factors (a)-(c), all of which have proved important. With every modification of the model, it is possible to find out exactly how far and in what respects that model improves the result, by achieving greater goodness of fit.

Another way of extending the analysis and testing the results further is to make use of fresh corpus data. By taking a new corpus sampled from the same text types, we could test out the validity of the model, by predicting its performance on the new data. Alternatively, we could extend the range of corpus data to new text types, and especially to samples of spoken English, which might show a rather different pattern of distribution from the written data so far analysed.

A third promising line of further research based on this model is the investigation of other assumed cases of gradience in language performance. These may belong to any of the strata of language. For example, on the syntactic level, we would like to investigate the choice of the zero relative pronoun in English, as contrasted with one of the pronouns who/whom/whose, which, or that. Here there is a multiple contrast rather than a simple binary one, and so a further stage of analysis would be to use the same method to investigate the choice (say) between who and whom, or between which and that. On the level of graphology, the choice between the use or non-use of a comma at certain syntactic positions (e.g. in coordination) could be investigated by the same means. On the lexical level, the choice between quasi-synonymous words (such as nearly and almost, or appear and seem) lends itself to the same treatment.

The suspicion is that this method of analysis will reveal patterns of linguistic choice in performance to be far from chaotic : that they will show considerable stability across different users, different contexts, and different time periods. The evidence of many corpus studies so far undertaken, often using simple quantitative measures, has pointed clearly in that direction¹. Such studies also demonstrate clearly the need for the non-discrete modelling of linguistic categories in performance.

Two questions of interest remain : (i) How widely will the technique illustrated here be applicable to natural language phenomena at various levels ? And (ii) how far will it now be possible, with the aid of corpora, to make precise predictive statements about linguistic phenomena which up to the present time have been studied only broadly and impressionistically ?
1) A recent bibliography of corpus-based publications (Altenberg 1991) lists many examples of such studies.
REFERENCES
Agresti, A. 1990. Categorical Data Analysis, New York : Wiley.
Aitkin, M.A. ; D.A. Anderson ; B.J. Francis and J.P. Hinde. 1989. Statistical Modelling in GLIM, Oxford : Oxford University Press.
Altenberg, B. 1982. The Genitive v. the Of-Construction : A Study of Syntactic Variation in 17th Century English, Lund : CWK.
Altenberg, B. 1991. A bibliography of publications relating to English computer corpora. In Johansson and Stenström, pp. 355-396.
Bolinger, D.L. 1961. Generality, Gradience and the All-or-None, The Hague : Mouton.
Collett, D. 1991. Modelling Binary Data, London : Chapman and Hall.
Francis, B.J. ; M. Green and C.P. Payne. 1992. The GLIM4 Manual, Oxford : Oxford University Press.
Jahr-Sorheim, M.C. 1980. The s-Genitive in Present-day English, Oslo : Department of English, Oslo University.
Johansson, S. ; G. Leech and H. Goodluck. 1978. Manual of Information to accompany the Lancaster-Oslo/Bergen Corpus of British English, for use with digital computers, Oslo : Department of English, Oslo University.
Johansson, S. and A.B. Stenström (eds.) 1991. English Computer Corpora : Selected Papers and Research Guide, Berlin & New York : Mouton de Gruyter.
Labov, W. 1973. The boundaries of words and their meanings. In C.J.N. Bailey and R.W. Shuy (eds.), New Ways of Analyzing Variation in English, Washington D.C. : Georgetown U.P., pp. 340-73.
Lakoff, G. 1987. Women, Fire, and Dangerous Things, Chicago : Chicago U.P.
Langacker, R.W. 1990. Foundations of Cognitive Grammar, vol. 1 : Theoretical Prerequisites, Palo Alto : Stanford U.P.
Leech, G. 1992. Corpora and theories of linguistic performance. In J. Svartvik (ed.), Directions in Corpus Linguistics, Berlin & New York : Mouton de Gruyter, pp. 125-148.
Leech, G. and S. Fligelstone. 1992. Computer and corpus analysis. In C.S. Butler (ed.), Computers and Written Texts, Oxford : Blackwell, pp. 115-140.
Meyer, C.F. 1987. A Linguistic Study of American Punctuation, Frankfurt am Main : Peter Lang.
Quirk, R. 1965. Descriptive statement and serial relationship, Language 41, pp. 205-217.
Rosch, E. 1978. Principles of categorization. In E. Rosch and B.B. Lloyd (eds.), Cognition and Categorization, Hillsdale : Erlbaum, pp. 27-48.
Zadeh, L.A. 1965. Fuzzy sets, Information and Control 8, pp. 338-353.
A "CONTINUOUS DEFINITION" OF POLYSEMOUS ITEMS : ITS BASIS, RESOURCES AND LIMITS JACQUELINE PICOCHE University of Amiens, France
I propose to begin with a demonstration (as hawkers say to boost their wares on street markets). I am going to build, under your eyes, a "continuous definition" of the French verb marcher ; after which we shall be able to start theorizing and deal, as promised, with the basis, the resources and the limits of this type of definition and, moreover, show its specificity as compared with other methods currently used by fellow linguists.

1. How to build a "continuous definition"
As a springboard I shall take the following article, photocopied by permission of the publisher, taken from a dictionary which, in other respects, I greatly admire and use daily : the Petit Robert (PR) (1987 edition).

MARCHER [maRʃe]. v.intr. (Marchier « piétiner », trans., XIIe ; frq. *markôn "marquer, imprimer le pas").
I. (XVe). ♦ 1° Se déplacer par mouvements et appuis successifs des jambes et des pieds sans quitter le sol (V. Marche, pas). Enfant qui commence à marcher. « Je ne puis méditer qu'en marchant » (ROUSS.). Manière de marcher : allure, démarche, marche. Marcher à petits pas rapides. V. Trotter, trottiner. Marcher d'un pas lent. Marcher bon train, vite. V. Presser (le pas). Marcher avec peine. V. Traîner (se). Marcher en boitant ; avec une canne, des béquilles. Marcher dans la rue, sur une route. V. Déambuler, promener (se) ; piéton. Marcher à reculons : reculer, rétrograder. Marcher droit* (1, II). - (Danse) Faire des pas ordinaires. Marcher sur la pointe des pieds. ♦ 2° Avancer (en parlant des êtres animés). Marcher à quatre pattes*. Acrobates qui marchent sur les mains. - (Animaux) Animaux qui marchent sur les doigts (V. Digitigrade), sur la plante des pieds (V. Plantigrade). ♦ 3° Aller à pied. Marcher vers la ville. V. Diriger (se), rendre (se). Fig. « Le monde avec lenteur marche vers la sagesse » (VOLT). - Marcher au supplice. Marcher sans but, à l'aventure. V. Errer, flâner. - Marcher sur qqn, aller vers lui avec violence, hostilité. Marcher devant, derrière qqn. - Fig. Marcher avec qqn, la main dans la main, comme un seul homme : être d'accord. ♦ 4° (Troupes). Faire mouvement. Marcher sur une ville, contre un adversaire. Marcher au combat. V. Monter. ♦ 5° Fig. et fam. (1852). Acquiescer, donner son adhésion (à qqch.). V. Accepter, consentir. Marcher dans la combine. « Non, monsieur ! je ne marche pas ! » (MALRAUX). - Croire naïvement quelque histoire. Il a marché dans mon histoire. Il ne marche pas, il court : il fait plus encore que marcher (Cf. Donner dans le panneau, se faire avoir). ◊ Faire marcher qqn, obtenir de lui ce qu'on veut (par la force, la menace, la persuasion, la ruse). Spécialt. Abuser en faisant prendre pour vrai ce qui ne l'est pas. V. Berner, tromper. « Le prince se moquait d'elle et la faisait marcher » (MADELIN). ♦ 6° S'avancer dans un véhicule, à cheval. Nous avons très bien marché au début, mais à Lyon la voiture est tombée en panne. ♦ 7° (Choses). Se mouvoir de manière continue. Automobile, train qui marche à 150 km à l'heure. V. Rouler. ♦ 8° Fonctionner (en parlant d'un mécanisme). Montre, pendule qui marche mal. Faire marcher une machine, une radio. ♦ 9° Fig. (1865). Produire l'effet souhaité. Ses affaires, ses études marchent bien (Cf. Ça carbure, ça ronfle). Ce procédé, cette ruse a marché. Marcher comme sur des roulettes*. V. Aller.
II. MARCHER SUR, DANS... ♦ 1° Mettre le pied (sur qqch.) tout en avançant. Défense de marcher sur les pelouses. - Loc. fig. Marcher sur les pas, sur les traces de qqn. V. Imiter. - Marcher sur les brisées* d'un rival. Marcher sur le corps, sur le ventre d'un concurrent. V. Passer. - Marcher sur des charbons* ardents, sur des oeufs*. ♦ 2° Poser le pied (sur qqch.), sans idée d'autre mouvement. Marcher dans une flaque d'eau. Il a marché en plein dedans. Marcher sur les pieds de qqn. ◊ Fig. Marcher sur ses principes. V. Fouler, piétiner.
◊ ANT. Arrêter (s'), stopper. - HOM. Marché.
You will notice that it is divided into two main parts, introduced by the Roman numerals I and II, section I comprising the greater part of the uses of the verb, and section II singling out two prepositional constructions in which the compiler considers the idea of a progression in space to be neutralized. On reading the article one is struck by one major discontinuity and several minor ones.
The major discontinuity occurs with 5. There is nothing to explain or even suggest what semantic continuity there could possibly be between the fact of "se déplacer par mouvements et appuis successifs des jambes et des pieds sans quitter le sol" ("to progress by lifting and setting down each foot in turn, never having both feet off the ground at once") and that of "acquiescer; consentir" ("to acquiesce"; "to consent"), termed "fam. and fig." without the "figure" in question being in the least clarified. This is all the more surprising as 6 and 7 once more introduce uses comprising progress in space, thus semantically close to 3 and 4. Had the precaution but been taken, for 4, of quoting the military command En avant, marche ! ("Forward, march!"), one might have been able to see, in the consenting person, the image of an obedient infantryman. However, I do not believe this interpretation to be the most coherent with the rest.
The minor discontinuities are to be found in 8 and 9, 8 "fonctionner" ("to work") having the possibility of being felt as a metaphor of 7 "se mouvoir de manière continue" ("to move in a continuous manner") and 9 "produire l'effet souhaité" ("to produce the desired effect") as a metaphor of 8, although there is nothing to specify this. The reference to "aller" ("to go") which appears in 9 is a comparison, and a comparison is not a reason.
Logically, if its definition is correct, the fifth acceptation should be treated as a homonym and appear as a separate entry. However, strangely enough, a particularly atomistic dictionary, the Dictionnaire du Français Contemporain (DFC), enters a single article, very close to that of the PR, which proves that the authors sensed a certain continuity but were unable or unwilling to make it explicit. On the other hand, the Lexis provides three homonymic entries for marcher: 1) "se mouvoir en déplaçant les pieds l'un après l'autre" ("to progress by moving the feet one after the other"); 2) "fonctionner" (of a mechanism: "to work"); "faire des progrès" (en parlant d'une activité quelconque) (of some activity: "to make progress"); 3) "donner son acceptation, croire naïvement" ("to give one's consent", "to believe naively").
To come back to the article from the Petit Robert, setting aside the case of "croire naïvement", which appears most inappropriately in 5th position, there remains a discontinuity between sections I and II. It is difficult to understand why "poser le pied sur" ("to place one's foot on") and "poser le pied dans" ("to place one's foot in") should be separated from acceptations I. 1, 2, 3, 4, 6, 7 by the non-spatial acceptations 8 and 9. The note "sans idée d'autre mouvement" ("without the idea of any other movement") for II.2 is questionable. One does not often tread in a puddle or on someone's foot without moving one foot after the other! Therefore, since the homonymic solution is rejected and a single article must be subdivided, my first suggestion would be to list the spatial acceptations as opposed to the non-spatial ones.
All the uses bearing reference to solid, terrestrial space (able to be measured in numbers of steps) and to animate beings moving upon it by means of their legs, setting one foot regularly in front of the other without ever completely leaving the ground (all of which terms are extremely long and complicated to define if one wishes to reach, by proceeding from one definition to the next, the layer of the "primitives") should be grouped together at the beginning of the article. But this reordering still fails to explain the major discontinuity in 5 and the minor ones in 8 and 9. 7 contains the adjective "continu" ("continuous"), which heads in the right direction but is, it seems to me, insufficient. This notion should have been explicitly introduced right from 1 in order to bring out the mode of action proper to the verb marcher, which is an "activity" and not a "state", an "achievement" or an "accomplishment", and shows the verbal process in the progressiveness of its development, which is at the base of all the apparently "discontinuous" sense effects.
It is the notion of continuous, regular and normal activity that implements the transition to 8 (concrete subject, machine, having, although inanimate, an internal activity, without progression in space), then to 9 (abstract subject, set of situations and human relationships which develop in time in a manner normal, regular and satisfactory to the person organizing them: a firm, business matters) and, to my mind, only in the very last place to 5, in which the subject is a human being who falls unwittingly into a plot and acts in a manner expected by his manipulator, to the latter's greatest satisfaction. These sense effects are the result of the use of this basic mode of action, to the exclusion of other "semes" (semantic components), and, consequently, it is best revealed by these. But it is unfortunate that the presence of the other semes should have prevented the author from perceiving it, or at least noting it, in the uses involving movement. This proves the necessity of "shuttles" between the different uses of the polysemous item if one does not wish to overlook certain important semantic features.
On the other hand, 1 includes the seme "without leaving the ground", which is not immediately obvious, and is obtained through comparison with other verbs of movement, notably courir ("to run") and sauter ("to jump"), other "activities" of which the mode of action is used in a quite different manner. Now if one compares these verbs not only in their spatial uses but in the whole of their polysemy, one notes that in French walking is a far less peculiar manner of getting about than running, trotting, galloping, jumping, creeping, swimming or flying. It is, par excellence, the normal way of moving, the least tiring, the most regular and the most efficient for the biped vertebrates that men are, a proof that language is anthropocentric, as one may have guessed.
Whoever fails to make this understood in his definition of the spatial uses of marcher deprives himself of the ability to demonstrate how they engender the non-spatial uses, and to give a "continuous definition" of the whole. I am thus led to suggest for the verb marcher the following article:
MARCHER v. intrans.
I. Un homme ou un animal terrestre, muni de pieds ou de pattes, marche lorsqu'il se déplace sur le sol sans le quitter, en mettant régulièrement un pied devant l'autre. Ce mouvement, et l'espace qu'il délimite, s'appelle un pas. Le sujet se déplace progressivement, pas à pas. A défaut de véhicule, c'est pour lui la manière normale d'avancer, la moins fatigante, donc la plus satisfaisante : Jean marche, son chien marche à côté de lui ; les fourmis marchent à la file.
Divers compléments circonstanciels facultatifs peuvent préciser :
1) la manière de marcher : vite, lentement, à pas de loup, à pas de géant.
2) la partie du pied sur laquelle on prend appui (prép. sur) : marcher sur la pointe des pieds. Les animaux plantigrades marchent sur la plante des pieds ; les animaux digitigrades sur les doigts de pieds ; éventuellement un homme peut utiliser aussi les mains pour marcher, c'est marcher à quatre pattes, ou même, exceptionnellement, seulement les mains : l'acrobate marche sur les mains.
3) la nature du sol sur lequel on prend appui :
a) s'il est dur (prép. sur) : Jean marche sur le trottoir, sur l'herbe, sur l'asphalte.
b) s'il est mou, si l'on y enfonce (prép. dans) : Jean marche dans le sable, dans la boue, dans l'eau (dans ce cas, ayant mis les pieds dans l'eau, on marche en fait sur le fond).
c) s'il s'y trouve un objet sur lequel le marcheur met le pied, soit par inadvertance, soit dans une intention destructrice ou méchante : Jean a écrasé un coquillage en marchant dessus ; il marche sur la queue du chat, sur les pieds de son voisin. Par métaphore, dans quelques locutions plus ou moins figées, marcher sur peut signifier "détruire tout ce qui fait obstacle à l'action du sujet" : Jean marche sur ses principes, sur les brisées de son rival, il lui marcherait sur le ventre, sur le corps (v. piétiner).
4) le lieu du déplacement : Jean marche à travers champs, dans sa chambre, le long de la rivière.
5) le terme du mouvement ; dans ce cas, marcher se substitue à aller, en produisant divers effets de sens tenant à sa modalité d'action progressive qui montre l'action dans son déroulement. Dans certains cas, le sème "à pied" peut être neutralisé au profit de ces effets de sens :
a) Jean marche vers son bureau ajoute à va à son bureau le sème "à pied", et aussi la notion d'activité progressive.
b) Les fourmis marchent vers la fourmilière : plus expressif que vont vers la fourmilière. La modalité d'action du verbe est utilisée pour exprimer la ténacité et la persévérance de ces insectes.
c) Garibaldi marche sur Rome : beaucoup plus expressif que Garibaldi va à Rome, ou même se dirige vers Rome ; la modalité d'action est utilisée pour exprimer la progression régulière et puissante d'une armée (et peu importe que ce soient ou non des fantassins).
d) Jean marche au feu, au combat, à la mort, à la gloire, au supplice : le complément de marcher à est généralement quelque chose d'exceptionnel et de dangereux ; la modalité d'action est utilisée pour exprimer un effort héroïque, dans son déroulement temporel.
II. Un véhicule à moteur marche
1) il se déplace régulièrement : En ce moment, la voiture marche à 120 km/heure. Le verbe marcher peut aussi s'employer avec des véhicules non terrestres : navire, avion, fusée, vaisseau spatial. Par métonymie, les passagers peuvent aussi dire qu'ils marchent : Entre Paris et Nantes, nous avons marché à 110 de moyenne ; nous avons bien marché.
2) il fonctionne (condition nécessaire pour qu'il se déplace) : cette voiture marche bien, le moteur vient d'être révisé : la modalité d'action est utilisée pour exprimer le caractère normal et régulier de ce fonctionnement.
III. Un sujet marche sans se déplacer
1) une machine fonctionne normalement, régulièrement, par un mouvement interne sur place (mouvement d'horlogerie, moteur qui tourne) ou une simple consommation d'énergie, de façon conforme aux plans de son constructeur et donc satisfaisante : ma montre, ma machine à laver, mon poste de télé marchent.
2) une combinaison quelconque produit l'effet attendu.
a) un ensemble de situations et de relations humaines évoluent dans le temps de manière normale, régulière et satisfaisante pour celui qui les organise : L'entreprise de Jean marche bien, ses affaires marchent [ici la substitution de aller n'est possible que pour un sujet abstrait, et encore, pas sans complément, mais pas pour un sujet animé : Comment vont les affaires de Jean ? – Elles vont bien ! Elles marchent bien ! elles marchent ! *elles vont ; mais Comment va Jean ? Il va bien ! *Il marche bien il marche].
b) (fam.) un être humain entre inconsciemment dans la machination de celui qui le manipule, à la satisfaction de celui-ci. Il croit naïvement ce qu'il lui dit, accepte ce qu'il lui propose : Paul raconte à Jean les histoires les plus invraisemblables, Jean marche toujours ! – Paul a fait une proposition à Jean ; Jean a marché dans la combine !
One may be surprised that this article includes not two but three main sections. This is because I am particularly partial to transitional uses, which confound atomistic lexicologists, and which I make a principle of highlighting. Now the motor vehicle, which presents the twofold characteristic of moving in space and of being a machine, provided me with an ideal transition between the spatial and the non-spatial uses of the verb marcher. I deliberately say "spatial" and not "concrete", "non-spatial" and not "abstract", so as not to be encumbered with concrete machines, abstract business matters and human beings concrete in some respects and abstract in others. But it is quite obvious that the uses of marcher in I, involving ground featuring various degrees of hardness, and limbed animals, are "concrete", and that the uses in III are much more abstract.
Further clarification could also be provided, at the risk of becoming somewhat ponderous, about the relation between the full meaning of the verb marcher and its subducted meaning "fonctionner" ("to work"), which, according to the PR, is defined as "accomplir une fonction" ("to carry out a function"), which function is defined as "action, rôle caractéristique d'un élément dans un ensemble" ("action, characteristic role of an element within a set"). To be quite clear, it might have been advisable to introduce in definition I.1 the finalistic notion that feet are made for walking and that the walking animal causes them to fulfill the role ordained for them by the Creator, in the same way (with subduction) as a machine "marche" ("works") or "fonctionne" ("functions") when it does that which its maker created it to do. This is at the risk of drawing protests from a few learned biologists (hard luck for them! our definition is a purely linguistic matter, and does not come under the philosophy of science).
Thus it is clear that, whenever possible, as in this case, I refuse the homonymic solution and adopt a polysemous one. This is for two reasons, one negative, one positive.
The negative reason is that this presents no drawbacks. I fail to see in what way a foreign user wishing to make sure that in such and such a context he can or must use the verb marcher should be any more confused by a single article divided into three sections than by three unconnected articles.
The positive reason is that the whole is prior to and superior to the part and that the definition of each acceptation is improved by considering the polysemy as a whole. This type of definition substitutes intelligibility for absurdity. Instead of saying to himself "How irritating all these homonyms are!" the foreign user will be able to see the shaping of a symbolism of the verb marcher proper to the French language and as a result he will be able to enter more deeply into the language he wishes to acquire.
It is now time to turn to the second part of this paper on "continuous definition", its basis, its resources, its limits and, I shall add, its specificity.
2. Basis, resources and limits of a "continuous definition"
2.1. Basis
The method used is inspired by Gustave Guillaume, from whom I have borrowed several theoretical concepts, of which it may be useful to clarify the terminology. From section I.1 to section III.2, there is a continuous movement of thought which I call "kinetism". This movement may be held up at various points of its development, these interceptions being named "prehensions". Acceptation I (to move by means of the feet) is semantically the richest; I qualify it as the "full prehension". Acceptation III.2, "to produce the desired effect", is semantically the poorest; I qualify it as a "subducted prehension". The semantic mechanism which comprises the full acceptation, the most subducted acceptation and the kinetism which unites them is the "potential significate", which then gives rise to the "actual significates", theoretically unlimited in number, corresponding to the various "prehensions" and to varying degrees of subduction.
This outlook is dynamic; it implies the existence of simple, deep-seated and unconscious mechanisms of tongue, which give rise to numerous and infinitely varied discourse effects; it implies, too, that the mechanisms of tongue can be reached by induction from the discourse effects. The diversity of these effects, the multiplicity of their contexts and of the synonyms which can be suggested for them, is not an objection to the theory of continuity. The validity of a proposition for a potential significate can be tested firstly by its ability to account for all the uses listed for the word and, if necessary, to integrate new ones, and secondly by its specificity and the fact that it is inapplicable to any other word in the language. My humble experience in this field confirms my intuition that this is possible: I have never come across two words having the same potential significate.
Consequently, when one discovers (or thinks one discovers) the potential significate of a polysemous item, one should take care to write the definitions of the different acceptations in such a way that the transition from one to the other becomes intelligible as much in terms of syntax as in the designation of the semes. After this the synonymous definitions may be brought in, enabling the acceptation under consideration to be situated within a paradigm of interchangeable words.
2.2. Resources
The mechanisms of subduction which ensure the continuity of numerous polysemous systems can proceed from richer concrete to poorer concrete, or from concrete to abstract, or from richer abstract to poorer abstract. One of the main assets of these mechanisms is to elucidate the transition from the concrete uses to the abstract ones. The former, denoting particular and specific realities, can be defined by semes themselves definable by a long chain of arborescent definitions before reaching the layer of semantic primitives; the latter, definable by a small number of semes, bordering the layer of semantic primitives and denoting general ideas, are like impoverished calques of the concrete uses.
Metaphor, whether a genuine living figure or lexicalized, can essentially be accounted for by this type of explanation.
As they take their place in the shuttle between full meaning and subducted meaning, numerous fixed expressions, in which the word under consideration cannot be replaced by any other, show up hidden semantic features, and take their logical place in the ordering of meanings.
Finally, the notion of kinetism fixes a compulsory order of acceptations, guaranteeing intelligibility, and this is not the least of its contributions.
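The idea of a kinetism running from a full prehension to progressively subducted ones can be illustrated with a small sketch. It is only a toy model and not part of the method itself: it assumes that each acceptation can be reduced to a flat set of semes, that subduction is simple deletion of semes, and the seme labels (`on_foot`, `regular_activity`, and so on) are hypothetical simplifications introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prehension:
    """One interception of the kinetism: a section of the article and its semes."""
    section: str
    gloss: str
    semes: frozenset


def is_kinetism(prehensions):
    """Return True if every prehension keeps only semes of the preceding one
    and drops at least one of them (strict subduction, in this toy model)."""
    return all(
        later.semes < earlier.semes  # proper-subset test
        for earlier, later in zip(prehensions, prehensions[1:])
    )


def shared_core(prehensions):
    """Semes that survive in every prehension, down to the most subducted one."""
    return frozenset.intersection(*(p.semes for p in prehensions))


# Hypothetical seme inventory, loosely following sections I to III.2
# of the proposed article for marcher.
MARCHER = [
    Prehension("I", "to walk", frozenset({
        "on_foot", "on_the_ground", "displacement_in_space",
        "movement", "regular_activity", "satisfactory_outcome"})),
    Prehension("II.1", "(of a vehicle) to travel", frozenset({
        "displacement_in_space", "movement",
        "regular_activity", "satisfactory_outcome"})),
    Prehension("III.1", "(of a machine) to work", frozenset({
        "movement", "regular_activity", "satisfactory_outcome"})),
    Prehension("III.2", "to produce the expected effect", frozenset({
        "regular_activity", "satisfactory_outcome"})),
]

if __name__ == "__main__":
    print(is_kinetism(MARCHER))
    print(sorted(shared_core(MARCHER)))
```

In this reduced form the "compulsory order" of acceptations appears as the fact that the list passes the check in one direction only; real semes are of course not independent labels, so the sketch says nothing about how the semes are chosen or named.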
The example of the verb marcher has been chosen for reasons of simplicity. But the continuity of the acceptations of a polysemous item functioning by subduction is not necessarily similar to a row of trees along a road, with lines of perspective flowing towards the horizon; it can also be compared to the rays of a star in relation to the centre, without any circumference linking them together. In other words, some items have several kinetisms proceeding towards very different subducted meanings. In this case, the ordering of the acceptations is freer than in the previous instance, but it remains essential to bring to light the parallelism between subducted meanings and full meanings through the wording of the definitions.
The full prehension is normally very rich, and may even never be completely realized in discourse. In this case I prefer to use the term "semantic archetypes". This is true in particular of important concrete words, denoting natural realities, parts of the human body, and certain animals.
2.3. Limits
Not all polysemous systems can be explained by subduction mechanisms. Some of them do not feature any degree of kinetism. This is the case in words of which the various acceptations have only a small number of semes in common (in extreme cases a single one), which feature either as "genus" or as "specific difference". In my Structures sémantiques du lexique français, I attempted to demonstrate what made up the unity, in synchrony, of the polysemous items hôtel ("hotel", "mansion", "public building") and bureau ("office", "desk", "study", "board"). In the first example, I suggest the genus "bâtiment ayant une certaine importance et une certaine notoriété dans la localité où il se trouve, et abritant des êtres humains" (building of some importance and of some degree of recognition in the locality where it is situated, and sheltering human beings); after which it must be specified 1) that it is used to accommodate travellers, or 2) to house various administrative services, or 3) that it is a historical monument. A vague genus is thus specialized and enriched with extra semes by various established contexts. In the second example, which has an even greater tendency towards homonymy, various genera (a sort of table, a place of work, a set of people) have in common one specific difference: that of being dedicated to a non-leisure activity, involving paperwork (or electronics) and organisation.
There are even chain-like polysemous systems where one acceptation will have one or several semes in common with another acceptation, but not with all the others.
Metonymy, whether a genuine living figure or lexicalized, can essentially be accounted for by this type of explanation.
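The three configurations just described (a shared genus, a shared specific difference, and a chain in which semes are shared only between neighbouring acceptations) can be told apart mechanically once acceptations are written as sets of semes. The following is a minimal sketch under that assumption; the seme labels for hôtel and bureau paraphrase the definitions above, while the `CHAIN` item and its labels `s1` to `s4` are entirely hypothetical.

```python
def common_semes(acceptations):
    """Semes present in every acceptation of the item."""
    return set.intersection(*acceptations.values())


def is_chain_like(acceptations):
    """True if no seme is common to all acceptations, yet every acceptation
    is linked to the others through successive pairwise seme sharing."""
    if common_semes(acceptations):
        return False
    names = list(acceptations)
    reached = {names[0]}
    frontier = [names[0]]
    while frontier:  # simple graph traversal over "shares a seme with"
        current = frontier.pop()
        for other in names:
            if other not in reached and acceptations[current] & acceptations[other]:
                reached.add(other)
                frontier.append(other)
    return len(reached) == len(names)


# Shared genus, distinct specific differences.
HOTEL = {
    "hotel": {"building", "locally_prominent", "shelters_people",
              "accommodates_travellers"},
    "public building": {"building", "locally_prominent", "shelters_people",
                        "houses_administration"},
    "mansion": {"building", "locally_prominent", "shelters_people",
                "historical_monument"},
}

# Distinct genera, one shared specific difference.
BUREAU = {
    "desk": {"table", "non_leisure_paperwork_activity"},
    "office": {"place_of_work", "non_leisure_paperwork_activity"},
    "board": {"set_of_people", "non_leisure_paperwork_activity"},
}

# Hypothetical chain: A shares with B, B with C, but A and C share nothing.
CHAIN = {
    "A": {"s1", "s2"},
    "B": {"s2", "s3"},
    "C": {"s3", "s4"},
}

if __name__ == "__main__":
    for name, item in (("hôtel", HOTEL), ("bureau", BUREAU), ("chain", CHAIN)):
        print(name, sorted(common_semes(item)), is_chain_like(item))
```

The sketch only detects which semes are shared; whether a shared seme functions as genus or as specific difference remains a judgment made in the wording of the definitions.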
In cases such as these the lexicologist is much freer as to the order to adopt (although it is always preferable to proceed from the whole to the part, from greater to smaller, from the commonest to the rarest...) and as to the wording of his definitions. This indicates that this type of polysemy is not as strongly cohesive as the other. Experience proves that to insist upon weaving into the text of the definition a thread, of questionable thickness, linking one acceptation to another sometimes gives strange results, more interesting from an etymological point of view than from a synchronic one. It can thus happen that the homonymic solution is inevitable, or that it is found preferable for various reasons.
The first type of polysemy (that of marcher) involves an experience of life seen through the mental structures of a certain linguistic community, one element of a greater "vision of the world", a spring of symbolism and poetry ever ready to burst out, or an original linking-up of concepts. In the second type, on the contrary (that of hôtel and bureau), the unifying principle is no more than a "common denominator", a sort of semic residue left behind by a disjunctive evolution.
2.4. Specificity
I shall attempt to situate this manner of defining polysemous items in relation to four others: that of Eleanor Rosch, that of R. Martin, that of Igor Mel'chuk and that of Charles Ruhl.
* Eleanor Rosch. To my definition of acceptation I.1 it could be objected that there are irregular, abnormal and painful ways of walking. To this I shall reply that they are not prototypical and that consequently I need not take this type of referent into account for a full definition. It is at the discretion of the speaker to produce in discourse sense effects at varying distances from the prototype, corresponding to "prehensions" at various points of "subduction". Am I therefore a follower of the theories of the American psycholinguist Eleanor Rosch, whose views have recently been expounded in an excellent book by G. Kleiber? No, in spite of a few points of convergence.
Her problem is that of categorisation. Which mental representations correspond, in a certain society, to the various categories of referents? How can one account for the mechanisms of discrimination and generalisation which prompt the human mind to categorize and call the flying being which has just alighted on a branch animal, oiseau or moineau ("animal", "bird" or "sparrow")? Her approach is based on socio-linguistic surveys. The result is less a definition of words than a description of their referents, of social and not encyclopaedic origin, rich in semantic features, none of which are necessary, so that counter-examples occurring due to strict categorisation become harmless.
It is a fact that my "full" definitions are "prototypical" and that my "prehensions", as they gradually eliminate different features, are not unlike Rosch's "degrees of typicality". But there are some major differences. My problem is not categorisation but polysemy.
The semantic features which I retain in my definitions are dictated to me not by a survey on a representative sample of speakers, but by the taking into consideration of habitual occurrences, syntactic and lexical, idiomatic or otherwise, which the French language has woven around this word. The number of semantic features retained, indefinite in the first theory, is limited by phraseology in the second.
The prototype theory is interesting essentially from a psychological point of view, and its linguistic repercussions are limited: this manner of processing meaning is referential and not differential, and fails to provide a universal model of lexical description. The method of the psychological test is suitable for little more than the intermediate categories of nouns with concrete referents: natural and man-made objects corresponding to what Aristotle called "species", but not for "genera" or for "sub-species". Abstract nouns and grammatical categories other than nouns are more or less resistant to it. It leaves no place for figures, for the fundamental linguistic phenomena of metaphor and metonymy. In short, it tends to take words for labels stuck onto the real world.
A dynamic vision of the sign, on the contrary, held to be relatively independent of its referents, enables one to suggest a "potential significate" for any word presenting a polysemy, regardless of its degree of abstraction or its place in a taxonomy, and can make explicit figurative meanings, the relation between concrete and abstract uses and the transition from one to the other. Hence a wider field of application and a more specifically linguistic approach: definitions of words, not of things.
* Robert Martin treats the question of polysemy in Pour une logique du sens. Theorizing the habitual practice and terminology of dictionaries, in particular the metalinguistic indicators "p.ext., p.rest., p.meton., p.anal., au fig.", and combining the criteria of "addition of semes", of "deletion of semes", of "modification" or "similarity of construction", and of "restriction" or "extension" affecting the subject or the objects of the verbs, he obtains the two large categories of "polysemy of meaning" and "polysemy of acceptation", subdivided into numerous sub-categories, in such a way that according to him there are six different types of polysemy for the noun.
This conception, well designed to satisfy logical minds, is based on a practice which has proved its worth over the past few centuries. To me, however, it appears deficient in two respects: firstly, it assumes the "semes" used to define the words, and their denomination, to be incontestable facts.
Now, as we know, the number of distinctive semes which oppose one lexical unit to another of adjacent meaning depends on the number of these units and, the longer one makes the list (an open list in principle) of parasynonyms and antonyms, the more semes one discovers to integrate into the "sememe(s)" in question of the polysemous item under consideration. Moreover, a mine of "semantic features" hitherto generally unexploited can be found in the other "sememes" of the same polysemous item, as you have just seen with the example of marcher. Do they deserve to be called "semes", or should this term be reserved for the distinctive differential features which oppose parasynonyms? This point of terminology seems to me of little importance, and I think it would be convenient to call any element of meaning entering into a definition a "seme".
Finally, semes are not expressed in a heaven-sent metalanguage, but with the words of natural language, which generally offers us, for a certain notion, a whole range of lexical possibilities, and the choice of one or the other of these possibilities is not without effect upon the coherence of the lexicographical treatment of a polysemous item.
Here thus are the two problems which the above theory has failed to solve.
First problem: how, from amongst all the possible semes, is one to select the "right" semes, and how is one to denominate them in such a way as to elucidate the principle of unity one has detected (or thinks one has detected) through the various strategies of the analysis of meaning?
Second problem: the order in which one sets out the different sememes of a polysemous item is not without significance: there is a "right end" which one must hold if one wishes the skein to unwind correctly. Now the two metalinguistic indicators "par ext., par rest." in particular, as well as the operations of "subtraction" or "addition" of semes, seem to be interchangeable in theory, whereas they certainly are not in reality. Semes are more like cells in a living organism than blocks in a building game! Suppose you are responsible for the word terre ("earth"). I challenge you to construct an intelligible article starting with terre labourable ("arable soil") and terre à potier ("potter's clay") and, "par ext." and by "addition of semes", ending with the planet terre ("earth"). If one adopts the opposite order, everything becomes clear. How, then, among the different orders possible, is one to select the most intelligible? The theory in question fails to say or, more exactly, seems to consider the question resolved in advance by lexicographical tradition, and the order in which the different examples chosen are set out, indisputable.
On these two points, I think I advance at least the beginning of a solution.
* Igor Mel'chuk in his Dictionnaire explicatif et combinatoire du français contemporain, opposes "lexical items" : lexical units taken in a single specific acceptation, idiomatic or quasi-idiomatic, like lumière, coup de foudre, voie ferrée (light, stroke of lightning, railway), to "vocables" : set of all the "lexical items" of which the signifiers are identical and the significates directly or indirecly linked. The lexical item is the basic unit of description in the DECFC and the conditions of its use are examined with the greatest care and in the greatest possible detail. Grouping these together into "vocables" by means of "semantic bridges" is the very least of the author's worries, and he only does so when he notices the exis tence of semes common to two lexical items. But how has he chosen and denomi nated his semes ? This brings us right back to the problem referred to above. If he defines the parts with no consideration for the whole, the reconstructionof the whole will be largely determined by chance, and the parts stand a good chance of remaining "membra disjecta", which is in fact often the case, as one finds, for instance, in the DECFC, two separate lexical items, sur pied with no semantic
bridge to the vocable pied, and one lexical item à la tête with no semantic bridge to the vocable tête, etc.
One of his assistants, G. Dostie, gave a paper, admirable of its kind, at the congress of the Société de Linguistique Romane (1992) on the two lexical items je comprends ! and penses-tu !. She demonstrated their "discursive" as opposed to "descriptive" value by studying their relation both to the situation of utterance and to the speaker, whose psychological state they express when he is faced with an element of the verbal or situational context: a reaction to circumstances about which he begins to think, quite comparable to that of zut !, voyons !, tiens !, ma parole !, écoute ! or, better, to oui and non, since these are illocutionary acts of agreement or disagreement, of acceptance or refusal. After studying their syntactic status, putting them to various tests, and observing the principle that a definition must be substitutable for the item defined, she finally gave the following definitions:
Je comprends ! : "J'exprime la croyance que la proposition, que je dégage de ton énoncé précédent, et que j'ai déjà évaluée, est vraie" (I express the belief that the proposition, which I infer from your previous utterance, and which I have already evaluated, is true).
Penses-tu ! : "J'exprime la croyance que la proposition X, que je dégage de ton énoncé précédent, et que tu présentes comme possible, est fausse" (I express the belief that proposition X, which I infer from your previous utterance, and which you present as possible, is false).
In the course of a discussion, I made her admit that she had no intention of establishing the least "semantic bridge" between these "lexical items" and the "vocables" penser and comprendre, because the speaker has not the slightest feeling of a semantic kinship between these two units. After a night's reflection I had a paper sent to her, written along these lines:
Examples:
(1) Ecoute, Jean, il faut partir !
(1') (Jean ne réagit pas) Mais enfin, écoute-moi, écoute ce que je te dis, il faut partir !
(2) Jean va réussir à son concours – Tu penses vraiment ce que tu dis ?
(2') Jean va réussir son concours – Tu penses ! – Penses-tu !
(3) Jean a un mérite fou ! – Je comprends et je partage ton admiration
(3') Jean a un mérite fou ! – Je comprends !
(4) Tu vois, Jean est dans le vrai pour telle raison – Non, je ne vois pas !
(4') Voyons, réfléchis un peu ! ça se voit à l'oeil nu !
Questions:
I. Would you dare to say that (1) (2) (3) (4) and (1') (2') (3') (4') contain different and disconnected verbs, between which no semantic bridge is possible?
II. When a syntactician tells you "verbs whose subject is non-agentive cannot be passivized", as I have just heard, you accept this, although the notion of "non-agentive subject" is completely unconscious for the speaker who is not a linguist. Why, from me, do you not accept an explanation based on his linguistic subconscious, and why do you insist on his feeling the semantic link between two lexical items, which is moreover something highly subjective and variable from one subject to another?
If I receive an answer, I may have the opportunity of communicating it to you in another paper.
* Charles Ruhl. About this last author, who is exploring a path very rarely followed these days, I shall speak only with caution and reserve, because I know his book On monosemy, a study in linguistic semantics only through a long review by R. Landheer, and because between him and myself there is a major terminological obstacle, since he calls "monosemy" what I call "polysemy". Having said this, I think I understand what he means, because I went through that stage before reaching the theory I am expounding today. He postulates that words have a single meaning, general and highly abstract, which must be carefully distinguished from the contextual factors which give rise to the multiplicity of meanings listed in dictionaries. Here I recognize the opposition between the "potential significate", a fact of tongue, simple, deep-seated and unconscious, and the discourse effects, infinitely numerous. The trouble is that in his analysis of the verb bear, this single meaning, which is supposed to unite all the uses in discourse, is "so highly abstract that it becomes impossible to paraphrase it adequately".
This reminds me of a paper that I gave at the beginning of my research on the verb voir ("to see"), at Robert Martin's seminar. To find the link between De ma fenêtre je vois le port (from my window I see the port) and L'histoire et la chimie ça n'a rien à voir ensemble (history and chemistry have nothing to do with each other), I found nothing better than the "highly abstract" notion of "relation", which earned me a few well-deserved ironical remarks. Which transitive verb does not express the idea of "relation"? And when several lexical items, likewise highly abstract, are found associated in the same sentence, one wonders what the context can contribute to each of them. It seems to me that the weak point of this type of explanation is that it treats the first of our two types of polysemous systems as if it were the second: for the powerful semantic machines that large polysemous items are, it endeavours to establish a principle of static unity, a lowest common denominator, while in the case of
these, kinetism, proceeding from the most complex to the most evanescent, is far more enlightening. Is this principle of dynamic unity an illusion or, as I feel it to be, a truly fruitful explanatory principle? This is what we shall see when I have finished exploring the meaning of the particularly polysemous lexical words which are among the first thousand on the list of decreasing frequencies in the TLF. This is what I am currently working on, with a single collaborator, with a view to producing a sort of dictionary, worded as simply as possible, intended as a basis for the systematic teaching of vocabulary at all levels, starting from primary level, abroad and in the various countries of the French-speaking community, in particular in Africa.
REFERENCES
Dostie, G. La description sémantique de quelques expressions discursives en lexicographie : "je comprends !" et "penses-tu !" dans le dictionnaire explicatif et combinatoire du français contemporain. To appear in the Actes du XXe Congrès international de la Société de Linguistique Romane, Université de Zürich, 6-11 April 1992.
Douay, C. and D. Roulland. Les mots de Gustave Guillaume, vocabulaire technique de la psychomécanique du langage. Rennes: Presses Universitaires, 217 p.
Dubois, J. et al. 1966. Dictionnaire du français contemporain (DFC). Paris: Larousse, 1224 p.
Dubois, J. et al. 1975. Lexis, dictionnaire de la langue française. Paris: Larousse, 1950 p.
Kleiber, G. 1990. La sémantique du prototype. Catégories et sens lexical. Paris: PUF, 199 p.
Martin, R. 1983. Pour une logique du sens. Paris: PUF, pp. 63-83.
Mel'chuk, I. 1984. Dictionnaire explicatif et combinatoire du français contemporain, recherches sémantiques (DECFC), vol. I. Les Presses de l'Université de Montréal, 172 p.; vol. II, ibid., 1988, 332 p.
Picoche, J. 1986. Structures sémantiques du lexique français. Paris: Nathan, 154 p.
Robert, P., A. Rey and J. Rey-Debove. 1987. Le Petit Robert, dictionnaire alphabétique et analogique de la langue française (PR). Paris: Le Robert, 2173 p.
Ruhl, C. 1989. On monosemy, a study in linguistic semantics. Albany, NY: State University of New York Press, 299 p. Reviewed by R. Landheer, Journal of Pragmatics 15 (February 1991), pp. 210-215.
THE CHALLENGES OF CONTINUITY FOR A LINGUISTIC APPROACH TO SEMANTICS
CATHERINE FUCHS
CNRS (URA 1234, University of Caen), France
0. Introduction
For several decades linguistics has been marked by the predominance of discrete approaches. I refer in particular to hard-line structuralism, to the generative and transformational trends, and to the various ways of resorting to "classical" logical formalisms. In these approaches, continuity has no place: one works with classes of disjoint, mutually exclusive units or values, the characterisation of which rests upon sets of necessary and sufficient conditions; any attempt at flexibility is seen as blurring the phenomena and leading to a loss of efficiency and scientific rigour.
However, on the fringe of these mainstream trends, there have always been "dissidents", denouncing the "reductionism" of discrete processing methods (in particular in their hardest version, that of binary oppositions), advocating more flexible approaches to linguistic phenomena, and invoking more or less explicitly the notion of continuity. In Europe (where these trends are probably not as well known on the international scene as the American schools), I shall cite in particular G. Guillaume's psychomechanics of language (with the notion of "kinetism"); I shall also mention, in the domain of enunciative operations, the works of A. Culioli on systems of non-binary values (the so-called "cam" model) and on the topology of "notional domains"; I shall also refer, in the field of the typology of languages, to the work of H. Seiler's team at Cologne (the "UNITYP" theory).
In the last few years, this marginal situation appears to have begun to change: we are currently witnessing the emergence of explicitly continuous methods of approach, particularly in the field of "cognitive linguistics (or semantics)" developing in the U.S.A. at a tangent from generative-transformational grammar.
These various linguistic trends in favour of continuity are themselves subject to cross-influences from theories which come from other horizons: in particular
from logic (L. Zadeh's theory of fuzzy sets) and psychology (E. Rosch's theory of typicality; Gestalt theory).
The fact remains that the notion of continuity is currently being put forward in the works of a variety of linguists, and is being invoked on all levels of language analysis (I refer here to written language, without prejudice to works on oral speech, in particular on intonation). This is manifest in semantics proper, both lexical (cf. J. Picoche 1986 on the French lexicon, and herewith on the example of the verb marcher) and grammatical (cf. A. Culioli on the modality of pouvoir and H. Seiler on the relations of "participation"). But it is also manifest on other levels, linked with semantics: the morpho-semantic level (when dealing with compositional mechanisms and the freezing of expressions, one speaks of smaller or greater degrees of lexicalisation); the syntactic-semantic level (series of elementary conditions enable one to define the degrees of probability for occurrences of certain structures: cf. herewith G. Leech on the English structures "X's Y" and "the Y of X"; see also G. Legendre 1989 and G. Legendre et al. 1990 on the unaccusativity of intransitive verbs); the pragmatic-semantic level (cf. G. Leech 1983 ch. 9, in which speech acts are treated as "scalar units", or C. Kerbrat-Orecchioni 1991, who invokes a "continuum" between acts of question and assertion; likewise for the processing of figures of speech and non-literal meanings: see for example G. Lakoff & M. Johnson 1980 on metaphors).
As always when a trend becomes a fashion, this new situation does not fail to bring with it some danger of fetishism.
1. The linguistic phenomena
Let us note firstly that the phenomena which lead linguists to invoke continuity are generally located on the level of a semantic approach to language, and not on the level of reference to extra-linguistic facts. Indeed there seems to be a kind of intrinsic continuity in the world itself and in the perception of it, which is translated in language productions by phenomena of reference to the extra-linguistic, like:
- the so-called "generality" phenomena: the naming of a referent, whether it be an object or a situation in the world, corresponds to varying degrees of precision (progressive precision being achievable by paradigmatic means, for example the passage from genus to species: a tree → a linden, or by syntactic means: a linden → a silver linden → a small-leafed silver linden → etc.); cf. the example given herewith by R. Langacker: Something happened → a person saw an animal → a woman examined a snake → a tall young woman carefully scrutinized a small garter-snake; see also B. Pottier 1987 p. 307: a knife → an ivory flick-knife studded with rubies.
- the so-called "referential vagueness" phenomena: for a given referent a certain denomination or qualification is more or less adequate (thus the word siège "seat" will be quite acceptable for a chair or an armchair easily identifiable as such, less acceptable for a circular surface placed on a stand by way of a stool, and far less acceptable for a tree-trunk lying on the ground); see C. Fuchs 1986.
- the indetermination phenomena (the unsaid, the implicit): the semantic information communicated about a given event is more or less determinate (thus for example an utterance like Il est bien tôt "It is rather early" remains indeterminate as to the temporal reference point against which this earliness is to be measured: too early to get up (→ to get up on a day off → to get up on a day off when the weather is bad → etc.)? too early to leave work (→ to leave work today → to leave work today just when there is an urgent job to finish → etc.)?); see R. Martin 1985 p. 149.
The active factor here is the infinite scale of communication situations and conceptualizations: this type of continuity (if indeed there is continuity) is not in language; quite on the contrary, language resorts to discrete processes to construct "reference points" on this continuity: cf. R. Langacker herewith.
In contrast with that extra-linguistic continuity, the continuity invoked as specifically linguistic is that which is involved in conceptualizing the world via language, i.e. which is located on the level of meaning (and not on the level of reference) and which concerns the link between signifiers and significates, between markers and underlying linguistic-cognitive operations. This level is generally considered to include:
- so-called "semantic vagueness" phenomena: the significate of a sign (discrete by definition) is subject to varying degrees of "amplitude" (to use the expression of B.
Pottier 1987), whether the sign be in antithesis to another (like tall / short) or whether it be a polymorphous sign, the meaning of which will be shaped by the context (like the preposition de in French).
- polysemy phenomena (distinct as such from cases of alternative ambiguity between two mutually exclusive significates): the meaning of a single sign can be described in terms of a plurality of non-disjoint significates, whether it be a lexical morpheme like marcher (cf. J. Picoche herewith) or cut (cf. D. Touretzky herewith), or a grammatical morpheme like the French adverb encore (cf. B. Victorri & C. Fuchs 1992) or the English preposition over (cf. G. Lakoff 1987 pp. 416-461). Polysemy phenomena also include shifts of meaning in the synchronic or diachronic usage of a sign, and figurative uses of a sign.
- phenomena of semantic continuity within grammatical categories (categories of time, aspect, modality, quantification-determination, predication):
the values corresponding to the different markers of these categories are not disjoint (cf. herewith H. Seiler on predication) and it is possible to describe in topological terms the effects of continuity and discontinuity which characterize them (cf. herewith A. Culioli on pouvoir).
Having made this distinction, it is nonetheless necessary to add that the boundary between the different phenomena mentioned above is not fixed: as J.M. Sadock 1986 notes, the boundary between generality, vagueness, polysemy and non-literal meaning is itself "fuzzy", that is, liable to variations; this variability, which depends on the metalinguistic demands one makes on the description, can be explained in terms of the different levels of interpretive depth (further details to follow).
Let us add that linguistic continuity mainly concerns the link between signifier and significate, but it can also concern the identification and categorization of the signs themselves (cf. herewith P. Le Goffic).
2. Underlying problematics
For a number of linguists using the term, "continuity" does not have a precise mathematical meaning; as D. Kayser notes herewith, it often indicates a kind of dissatisfaction with the more rigid classical discrete approaches. Should linguistic science remain at this stage of intuitive, not to say magical, incantations, it would run the risk of offering literature of a merely allusive type: metaphorical, non-falsifiable, i.e. ultimately unscientific (however perceptive the insight may be); this risk seems very real to me, as much for psychomechanics as for cognitive semantics, and in this respect the reaction of R. Langacker in favour of the discrete is very significant.
It is therefore necessary that the status of these references to continuity should be clarified from a formal point of view. However, before being in a position to specify technically what mathematical continuity (or continuities) linguists are referring to, and within what type of modelling (continuous or not) the different phenomena they describe should be placed (these questions I shall leave to specialists), it seems to me that there is some necessity to clarify the problematics that underlie this appeal to continuity. They appear to be of two major types, namely problematics of graduality and problematics of movement.
The first order of problems concerns facts of graduality: given two fixed points A and B represented as mutually exclusive locators, linguists are confronted with the existence of intermediate points, which lead them to think that there is not a total discontinuity between A and B but a gradual transition from one to the other, by a series of small successive jumps.
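One discrete way of picturing such a gradual transition can be sketched as follows. This is only an illustration under invented assumptions: the feature inventories FEATURES_A and FEATURES_B, the item names and the scoring rule are all hypothetical and are not taken from any of the authors cited. Each item is placed between the two poles according to how many features of A and of B it possesses.

```python
# Hypothetical feature inventories for the two poles A and B (illustrative only).
FEATURES_A = frozenset({"a1", "a2", "a3"})
FEATURES_B = frozenset({"b1", "b2", "b3"})


def gradient_position(item_features):
    """Place an item on a scale running from pole A (0.0) to pole B (1.0).

    The position is the share of B-features among all the A- and B-features
    the item possesses. Returns None if the item has none of these features.
    """
    n_a = len(item_features & FEATURES_A)
    n_b = len(item_features & FEATURES_B)
    if n_a + n_b == 0:
        return None
    return n_b / (n_a + n_b)


def is_mixed(item_features):
    """True when the item partakes of both A and B (an intermediate point)."""
    return bool(item_features & FEATURES_A) and bool(item_features & FEATURES_B)


# Four invented items, from one pole to the other.
items = {
    "pure_A": {"a1", "a2", "a3"},
    "mostly_A": {"a1", "a2", "b1"},
    "mostly_B": {"a1", "b1", "b2"},
    "pure_B": {"b1", "b2", "b3"},
}
positions = {name: gradient_position(feats) for name, feats in items.items()}
```

With finite feature inventories the scale can only take a finite number of values, so a sketch of this kind yields small successive jumps between A and B without any appeal to mathematical continuity.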
The status of the intermediate points differs depending on the domains and the authors. In some cases they can be strict intermediaries, i.e. points distinct from A and B (here the term "gradient" is used, as by D. Bolinger 1961, to refer to a "scale of degrees" between A and B conceived as the two extreme "poles" of this "continuity"); in other instances they can be undecidable cases, partaking both of A and of B (here the term "fuzzy" boundary is used, as by L. Zadeh, between A and B, setting up a sort of mixed zone). The question of whether the intermediate stages are finite or infinite, countable or uncountable, is rarely brought up as such; in practice, more often than not a finite number of intermediate stages is deemed satisfactory.
In actual fact, the problematic of graduality is nevertheless very often dealt with in discrete terms: A and B are generally each broken down into a set of more elementary features (particles or parameters), which makes it possible to define the relative similarities and differences between the different points of the gradient, the "fuzzy" cases being characterised by the smaller or greater number of characteristic features of A and / or B possessed by each of the intermediate points. As I shall attempt to show later, the problematic of graduality does not, in itself, compel the invocation of continuity, unless one wishes to contrast cases of graduality with cases of discontinuity. Let us note furthermore that facts of graduality can be processed in terms of probability (cf. for example G. Leech herewith).
Continuity is also invoked in the name of a second type of problematic, which is that of a movement conceived as imprinted in the very essence of language: it is considered that, in tongue, units intrinsically possess the property of dynamically scanning a trajectory linking certain points, and that it is this very trajectory that constitutes their meaning.
What is then challenged in the classical discrete approaches is the static character of the semantic representations which they impose.
This idea of dynamics can be invoked to characterize the meanings of units of tongue (lexical or grammatical markers, syntactic constructions) both in synchrony (to describe the field of their semantic values) and in diachrony (to account for the historical shifts from one semantic value to another).
Particularly illustrative in this respect is Gustave Guillaume's theory of psychomechanics. The basic notions of this theory are that of "kinetism" ("oriented dynamics underlying every linguistic phenomenon (...) the semiology of tongue in fact represents not states but movements which, intercepted by the momentary aim of speech, then determine particular sense effects", C. Douay & D. Roulland 1990, pp. 45-46) (our translation, C. Fuchs), that of the "radical binary tensor" ("the form of language consists in the setting up of an abstract relation between two adverse terms; the dynamic character of the systems, which corresponds to a two-way scanning of this binary relation, stems from the fact that the systems are responsible for representing thought in movement (...) the term of tensor refers to the operative nature of these movements as they shift successively from tension to distension", ibid. p. 186) and that of "operative time" ("thought, in order to institute into language its own processes, and thus avoid having to perpetually improvise its means of expression (must) mark and signify the stages of its progression. Operative time in system represents the patterned scanning carried out by thought and, in accordance with this marking, undergoes signifying interceptions which are its articulations (...) operative time is held to be the dynamic vector of 'genesis'", ibid. pp. 184-185). In a perspective strongly inspired by psychomechanics, let us also mention B. Pottier (1980, re-ed. 1992, pp. 35-44), who classifies the semantic categorizations of natural languages into five main dynamic models, two of which are qualified as "continuous" (respectively "continuous binary" and "continuous ternary").
One notices that in this dynamic approach to semantics the questions of graduality bear a certain relevance, insofar as "prehensions" (or "interceptive cuts") are seen to operate in speech (i.e. when effective values are constructed in context): these cuts may in principle be located at any point of the movement, thus defining a theoretically infinite number of intermediate stages between the two extreme points (or "maximised" positions) of the tensor. In many instances G. Guillaume insists upon the countlessness of the cuts, "the number of which, great, and historically greater and greater, tends theoretically towards the infinite, in the absence of any assignable limit" (1964, re-ed. 1969, p. 205) (our translation, C. Fuchs): "Representations in tongue, limited in number, precede acts of expression, unlimited in number, which they make possible and condition" (ibid. pp.
149-150) (our translation, C. Fuchs). He notes in particular that these cuts can correspond to referentially opposite effects (cf. his famous example: L'instant d'après, le train déraillait "A moment later the train derailed", which can be interpreted as il aurait déraillé, si... (et donc il n'a pas déraillé) "it would have derailed, if... (and therefore it did not derail)", or as il a effectivement déraillé "it actually derailed").
Another theory which is representative of this problematic of movement in language (and which aims to reach a much more operational level) is A. Culioli's theory of enunciative operations (cf. 1990 and herewith). The author describes with particular emphasis the "intuitive geometry" concealed in language, both on the level of categories (i.e. sets of markers) which, like aspect or modality for example, define a genuine "semantic topology" (cf. the notions of the open or closed "interval", of the "boundary" and of the "last point") and on the level of the set of values which can be associated with each marker (which can be qualified
as a "distortion area", producing effects of contiguity and effects of disconnection).
3. Arguments: the example of polysemy
In themselves, the two problematics which have just been referred to, both that of graduality and that of movement, do not by any means necessitate continuous processing: discrete processes of linguistic analysis are perfectly appropriate, as are models based on discrete formalisations (much less "costly", in many respects, than formalisms based on continuity).
Setting aside at this stage the question of formalisation, I would like to compare, on strictly linguistic grounds, the discrete perspective with the continuous, and put forward a few arguments in favour of the latter. As an illustration, I shall take the example of the processing of polysemy.
To describe the semantic behaviour of a polysemous item, linguistic analysis begins classically by "marking out the meanings", which consists in the systematic classification of the different meanings of the item: the finite list of separate meanings theoretically observable in actual contextual function thus becomes the set of potentialities of tongue (for a criticism of this process of backward investment of the tongue significate through the discourse effects, see M. Launay 1986).
On the basis of such a marking out of meanings there are two possible ways of conducting linguistic analysis while remaining on discrete grounds, which I shall call respectively "homonymic reduction" and "static polysemy".
Homonymic reduction consists in treating the different meanings (or the different syntactic-semantic behaviour patterns) of the item as so many distinct units: for instance the verb marcher (as in the Lexis dictionary quoted herewith by J. Picoche) is treated as having a first unit marcher 1 (meaning "to progress by moving the feet one after the other"), a second unit marcher 2 (meaning "to work" or "to make progress") and a third unit marcher 3 (meaning "to give one's acceptance"), or again the verb devoir (cf. H.
Huot 1974) as having a first unit devoir 1 (behaving as a full verb and signifying obligation) and a second unit devoir 2 (behaving as an auxiliary and signifying probability). This process of reducing polysemy to homonymy is all the easier to adopt when the item lends itself in addition to an effect of referential dissociation, i.e. allows distinct referents to be pinpointed (this is the case in particular for nouns referring to totally dissociated "concrete" referents, like hôtel and bureau quoted herewith by J. Picoche). Let us say that differences (referential or syntactic, depending on the case) tend to prevail over semantic similarities (i.e. over the kinship of the significates): the unity of the sign is negated and split up, and itemisation is at a maximum; this attitude is currently prevalent in a number of formal works, notably in computational
lexicography, where "homonymic degrouping" is systematically practised (cf. G. Gorcy 1990).
Static polysemy, on the other hand, attempts to describe the kinship between the different meanings of the item, to clarify the relations which link them and to map out a kind of route, an itinerary, which makes the transition possible from one point (i.e. from one meaning) to another. This is done in componential semantics, for example, in terms of addition and / or deletion of elementary semantic features known as "semes" (cf. R. Martin 1972 and 1979 or F. Rastier 1987). This process, motivated by the desire to uphold the unity of the linguistic sign, does make use of a certain notion of movement (a transition is implemented from one meaning to another), which nevertheless remains within a discrete conception, i.e. a succession of jumps between a finite number of states.
The difficulties met with by these discrete approaches to polysemy concern on the one hand the number of meanings retained, and on the other the order in which one chooses to present them. Just like dictionaries, linguistic studies give, for the same items, descriptions varying as to the number and order of the meanings.
It is a well-known fact that the finer the analysis becomes (i.e. the higher the level of the interpretation demand is raised), the greater the number of meanings (or senses within meanings) becomes, and the more difficult it becomes to determine whether the new shades of meaning detected constitute distinct meanings; this brings a risk of atomisation which, if not unbounded, is at least very far-reaching, and particularly detrimental in the case of homonymic reduction (where this atomisation leads to a proliferation of the units themselves).
Moreover, the setting out of a "map of meanings" in static polysemy imposes very strict constraints: the choice of a starting point (the meaning considered as the "first" or "basic meaning"), the laying down of a single oriented itinerary linking all the points, and the impossibility of intermediate stops between two points.
The acknowledgment of these difficulties implies that the variability of the stopping points and of the itineraries (i.e. of the meanings and of the links between them), far from revealing a deficiency in linguistic analysis, is on the contrary an integral part of the object under description: it manifests the pliability, the semantic malleability inherent in linguistic units and, correspondingly, the diversity of the interpretation demand thresholds: "As the observer shows deeper insight and sagacity, the discourse categories which he believes he can discern proliferate, without any end to this proliferation being assignable" (G. Guillaume op. cit. p. 206) (our translation, C. Fuchs).
One possible way to integrate this variability into semantic analysis itself is to adopt a continuous outlook, in this case what I shall call dynamic polysemy.
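The static, componential treatment described above (meanings linked by addition and deletion of semes along a single itinerary) can be given a minimal sketch. Everything in it is an assumption made for illustration: the meanings m1, m2, m3, the seme labels s1 to s4 and the idea of scoring an itinerary by counting added and deleted semes are invented here and do not come from the componential analyses cited.

```python
from itertools import permutations

# Hypothetical meanings of one item, each described as a set of semes.
# The labels are invented for illustration.
meanings = {
    "m1": frozenset({"s1", "s2", "s3"}),
    "m2": frozenset({"s1", "s2"}),
    "m3": frozenset({"s1", "s4"}),
}


def transition(source, target):
    """Return (added, deleted) semes when passing from source to target."""
    return target - source, source - target


def itinerary_cost(order):
    """Total number of semes added or deleted along an ordered itinerary."""
    total = 0
    for a, b in zip(order, order[1:]):
        added, deleted = transition(meanings[a], meanings[b])
        total += len(added) + len(deleted)
    return total


# Score every possible ordering of the meanings.
costs = {order: itinerary_cost(order) for order in permutations(meanings)}
cheapest = min(costs.values())
best_orders = sorted(order for order, cost in costs.items() if cost == cheapest)
```

Here the itineraries m1, m2, m3 and m3, m2, m1 tie for the lowest count, since the count is the same in both directions. The sketch therefore says nothing about which end to start from, and it allows no stop between two listed meanings, which are two of the constraints just discussed.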
Let me emphasise the fact that the transition to a continuous dynamic representation of polysemy constitutes, with regard to observable data, a jump or a kind of "theoretical bet". As several linguists note herewith (H. Seiler, G. Leech, J. Picoche), continuity cannot be observed in linguistic data, nor even deduced from them: what is visible on the macroscopic level of observables are discrete reference points. The transition to continuity is tantamount to proceeding to a microscopic, not directly observable level, on which a theoretical structure is proposed, the aim of which is to account in a more precise way for certain observed behaviour patterns.
In the field of polysemy, a continuous dynamic approach (like, for example, that proposed by the Laboratoire ELSAP: cf. C. Fuchs & P. Le Goffic 1983 / 1985, C. Fuchs & B. Victorri (eds.) 1988, B. Victorri & C. Fuchs 1992) amounts to treating the movement as primary and the stops on this movement as secondary and variable, a concept which, as can be seen, concurs with G. Guillaume's intuition concerning "kinetism" in tongue and the possible diversity of the "cuts" in discourse. The important point is probably not so much the countable or countless number of stopping points (i.e. of meanings) as their smaller or greater extension: the meaning of a polysemous item can be conceived not as a finite list of inter-linked meanings but as a "potential" of meanings in context, which can be represented as a multidimensional "semantic space" in which the meanings in context cover smaller or greater regions; the interpretation process can then be described as the movement which, relative to the contextual elements, constructs a dynamic upon this space (cf. the model suggested by B. Victorri 1988).
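The idea of a dynamic constructed on a semantic space under the influence of context can be illustrated by a deliberately small sketch. It is not the model of B. Victorri 1988: the one-dimensional space, the toy potential V(x) = x**4/4 - x**2/2 - context*x, the parameter values and the labels "A" and "B" are all assumptions chosen for readability. The two valleys of the potential stand for two regions of meaning, and a single number stands for the contextual pressure towards B.

```python
def slope(x, context):
    """Derivative of the toy potential V(x) = x**4/4 - x**2/2 - context*x."""
    return x**3 - x - context


def interpret(context, start=0.0, step=0.01, n_steps=20000):
    """Follow the downhill movement on the potential until it settles."""
    x = start
    for _ in range(n_steps):
        x -= step * slope(x, context)
    return x


def label(x):
    """Name the region of the (assumed) semantic space reached by the movement."""
    return "A" if x < 0 else "B"


# A weak contextual pressure towards B, starting from a neutral position.
weak_from_neutral = interpret(0.2)
# The same weak pressure, starting inside region A: the movement stays in A.
weak_from_A = interpret(0.2, start=-1.0)
# A stronger pressure removes the valley of A: the movement ends in B.
strong_from_A = interpret(0.5, start=-1.0)

results = [label(weak_from_neutral), label(weak_from_A), label(strong_from_A)]
```

In this toy setting the stopping points are secondary to the movement: they shift with the contextual parameter, and beyond a certain pressure one of the two regions simply ceases to be a possible resting place.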
From the point of view of linguistic theorization, the advantage of such a conception of polysemy (based on dynamic continuity) is that it enables cases of graduality to be opposed to cases of non-graduality. Linguistic observation of attested textual corpora (the only means of escaping the narrow framework of artificially manufactured examples) enables the following theoretical situation to be formalized. Two semantic values A and B may function, in certain contexts, as mutually exclusive, thus giving rise to alternative-ambiguities, and yet function in other contexts as compatible on the interpretation level : depending on the case, they are neutralized into a neutral value, blended into a resulting mixed value, or left undistinguished in an under-determined value (cf. P. Le Goffic 1981, C. Fuchs 1988, C. Fuchs 1991a). Contextual factors thus exist, some of which cause the interpretation to "tilt", in such a way that a discontinuity (a "catastrophe point") is introduced between A and B, while others, on the contrary, lead to a continuity between the same values A and B.
Let us take as an example the two values of the French adverb encore known respectively as "repetitive" (e.g. Tu as encore fait une bêtise. "You've been up to something again") and "durative" (e.g. Quand je suis entré, il était encore endormi. "When I came in he was still asleep"). These two values function as mutually exclusive in ambiguous contexts like Je suis triste car elle a encore un petit linge autour de la cheville. "I feel sad because she (still ?) has a bandage around her ankle (again ?)" (the English translations "again" and "still" convey respectively the two above-mentioned values, so the ambiguity disappears in translation), or like Il a encore les quatre as. "He (still ?) has the four aces (again ?)" (is a fresh card game being played in which he was dealt the four aces again, or is the same game being continued, in the course of which he still has the four aces he had initially ?). But in other contexts these same two values are not opposed to each other, so that it is not necessary to choose between them when interpreting the utterance. Thus the two values are neutralized in Quelques averses se produiront encore, plutôt près des côtes "A few showers will (still) occur (again) in coastal areas" (whether one understands "Fresh showers will occur" or "Showers will continue to occur" makes no difference, owing to the indefinite quantification of the subject). Over-determination (i.e. a blending of the two values) occurs in C'est une foule de la Chine, je la revois encore dans les images de la prospérité de maintenant. "It is a crowd from China, I can (still ?) see it (again ?) in the images of present prosperity" (it is at once and indissociably "I can see it again" and "I continue to see it"). Under-determination occurs in La petite chienne, frémissante et extasiée : "Encore ! Encore ! Oh ! que j'ai peur !" "The little dog, trembling and ecstatic : Again ! Again ! (More ! More ! ?) Oh ! how afraid I am !" (is the process repeated or continued ? In this case, the reference point is below the interpretation threshold at which the question can be raised).
One notes incidentally that effects of over-determination are reminiscent of the blending processes at work in puns, portmanteau words and metaphors.
Systematic linguistic analysis of contextual factors shows that the discontinuities which polarize two values A and B cannot be described in strictly quantitative terms within a merely compositional outlook. Indeed, although a discontinuity can be created by the interplay of several converging factors which "strongly" induce a change of meaning, it also happens that a very slight modification bearing upon a single factor is sufficient to induce such a change : this is the case, for example, in the transition from Tu t'es encore levé à 3h du matin ! "You got up at 3am again !" (repetitive) to Tu es encore levé à 3h du matin ! "You are still up at 3am !" (durative). It also happens that, in a conflict between diverging factors, two "weak" factors override a "strong" competitor and thus cause the interpretation to shift. Let us consider, to illustrate this point, the following examples. In Il s'est encore plaint "He complained again", the "passé composé" constitutes a strong factor in favour of the repetitive value. In Il s'est encore plaint après son départ "He complained again after his departure", this strong factor overrides the adverbial après son départ, which nonetheless tends (weakly) to induce a durative value. In Il s'est encore plaint pendant une heure "He (again ?) complained for an (other ?) hour", the adverbial pendant une heure constitutes a slightly stronger factor than the preceding adverbial in favour of the durative value, which however remains less plausible than the repetitive one (the influence of the "passé composé" remaining dominant). Finally, in Il s'est encore plaint pendant une heure après son départ "He complained for another hour after his departure", the combination of two weak factors in favour of the durative value overrides the strong opposing factor constituted by the "passé composé".
Let us note that the choice of continuity has points in common with certain conclusions both in the theory of typicality, in questions of categorization, and in Gestalt theory. The meanings marked out at the outset as relatively stable fixed reference points turn out in fact to be maximum accumulation points corresponding in some way to particularly salient typical values (cf. C. Fuchs 1991b). J. Lyons notes on this subject that componential analysis is not "a technique for the representation of all and only the meaning of lexemes" but "a way of formalizing that part of their prototypical, or focal, meaning which they share with other lexemes" (1981, p. 84).
These typical values are constructed, through the action of contextual factors, as "well-formed patterns" (Gestalten), which show semantic stability owing to the convergence of factors (all the factors "draw" the interpretation in the same direction). Thus, for example, in On me fit encore le coup cinq ou six fois "The trick was again played on me five or six times", the lexical verb type (terminative punctual process), the tense (simple past with aoristic value) and the adverbial (which conveys a plurality of discrete occurrences) are three converging factors in favour of the repetitive value of encore. Likewise, in Or tandis que la transformation s'opérait, on ignorait encore son existence "Now, while the transformation was taking place, its existence was still unknown", the lexical verb type (stative non-terminative process), the tense (imperfect with non-accomplished value) and the adverbial of time (marking a non-bounded reference interval) constitute three converging factors in favour of the durative interpretation of encore. By contrast, intermediate meanings, which are variably difficult to characterize and to delimit, illustrate various types of semantic "blurring" corresponding to non-typical and unstable semantic values, constructed by a set of conflicting factors which draw the interpretation in different directions (cf. examples above).
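The contrast between converging and conflicting factors can be written down as a small qualitative check. The sketch below is only an illustration under stated assumptions : the names Factor and characterise are hypothetical, each factor is reduced to the value it favours, and no attempt is made to weigh factors against each other, since the text argues that such shifts cannot be captured by a merely quantitative, compositional sum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Factor:
    """One contextual factor and the value of 'encore' it favours."""
    name: str
    favours: str  # "repetitive" or "durative"


def characterise(factors):
    """Classify a context qualitatively (hypothetical helper).

    Returns ("typical", value) when every factor favours the same value,
    and ("unstable", None) when the factors diverge. Conflicts are only
    flagged, never resolved.
    """
    favoured = {factor.favours for factor in factors}
    if len(favoured) == 1:
        return ("typical", next(iter(favoured)))
    return ("unstable", None)


# Factors listed in the text for "On me fit encore le coup cinq ou six fois".
coup = [
    Factor("verb type: terminative punctual", "repetitive"),
    Factor("tense: simple past (aoristic)", "repetitive"),
    Factor("adverbial: plurality of occurrences", "repetitive"),
]

# Factors listed in the text for "... on ignorait encore son existence".
ignorait = [
    Factor("verb type: stative non-terminative", "durative"),
    Factor("tense: imperfect (non-accomplished)", "durative"),
    Factor("adverbial: non-bounded interval", "durative"),
]

# "Il s'est encore plaint après son départ": the two factors diverge.
plaint = [
    Factor("tense: passé composé", "repetitive"),
    Factor("adverbial: après son départ", "durative"),
]

if __name__ == "__main__":
    for label, context in (("coup", coup), ("ignorait", ignorait), ("plaint", plaint)):
        print(label, characterise(context))
```

The first two contexts come out as typical values ; the third is merely flagged as unstable, which is where the finer analysis of the text begins.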
Conclusion
The issues raised by continuity in linguistic semantics interface ultimately with the great issues of general linguistics.
In synchrony, the Saussurian opposition between "tongue" and "speech" (or "discourse" in Guillaume's terms) appears in the problematics of continuity and discreteness. On the one hand, the continuous movement in tongue is contrasted with discrete cuts in speech ; on the other hand, the finiteness of potentials in tongue is contrasted with the infinitude of possible effects in speech. The use of the system of signs, which is largely discrete (but which, as we have seen, bears continuity within itself), by subjects in enunciative situations and in a given speech context, is itself carried out within a continuous area of interplay, shifting and distortion. What is at stake here is the communicational dimension of language, and subject-to-subject adjustment.
In diachrony, continuity appears as the actual condition for the evolution of the system as it undergoes the influence of various "antagonistic tensions" (in the words of P. Le Goffic 1988 and in this volume). Here, clearly, continuity manifests the living and evolutionary character of the language system, the items of which can undergo genuine processes of "semantic drifting" which cannot be explained in a discrete perspective.
Finally, it should be recalled that another question central to the debate is that of the arbitrary nature of the sign : champions of continuity in linguistics tend to emphasise the relativity of the theory of the arbitrary nature of the sign. Thus, one of the supporters of cognitive semantics, C. Vandeloise (1991, pp. 93-94), defends the theory of the "motivation of grammatical categories". Re-examining L. Bloomfield's analysis (1933) of the opposition (arbitrary in the latter's eyes) between wheat (singular) and oats (plural), he invokes the "continuum which proceeds from mass bodies like sand, made up of an infinity of small particles impossible to distinguish from each other, to entities like spinach (Fr. épinards, plural), composed of easily discernible identical parts which, however, taken separately, do not have (...) sufficient interest to be mentioned", and emphasises that "when a distinction within a continuous domain must divide it into two separate domains (...) the motivation is clear at the extremities of the continuum, (and) becomes more doubtful at its centre". Situated at the centre of the continuum, the two borderline cases wheat and oats refer to very similar cereals, which can nevertheless be distinguished in that "it is wheat, with its elements more compact and less distinct, that is represented by a singular word, while oats, with its grains more scattered, is represented in English by a plural word" (my translation, C. Fuchs). For another stand in favour of a certain motivation of the sign, see also B. Pottier (1980, re-ed. 1992, p. 45).
To end, we shall insist, like many authors of the present volume, upon the fact that continuity and discreteness constitute not so much two contradictory options as two modes of description which articulate and delimit each other, on the understanding, as R. Thom recalls in this volume, that it is always possible to materialize discreteness upon a background of continuity, but not vice versa. In this respect, the emergence of continuity in linguistic semantics is a sign that a certain recognition is being given to "the necessity of introducing, behind an apparently fixed structure, 'hidden' parameters of a kinetic order which explain its stability" (in the terms of R. Thom 1974).
REFERENCES
Bloomfield, L. 1933. Language, New York : Holt, Rinehart & Winston.
Bolinger, D. 1961. Generality, gradience and the all-or-none, La Haye : Mouton.
Culioli, A. 1990. Pour une linguistique de l'énonciation : opérations et représentations, Paris : Ophrys.
Douay, C. & D. Roulland. 1990. Les mots de Gustave Guillaume : Vocabulaire technique de la psychomécanique du langage, Université de Rennes 2 : Presses Universitaires.
Fuchs, C. 1986. Le vague et l'ambigu : deux frères ennemis. Quaderni di Semantica, Bologna : Il Mulino, VII : 2, pp. 235-245.
Fuchs, C. 1988. Représentation linguistique de la polysémie grammaticale. T.A. Informations 29, Paris : Klincksieck, pp. 7-20.
Fuchs, C. 1991a. L'hétérogénéité interprétative. In H. Parret (ed.), Hétérogénéités : ruptures, silences, ellipses, Paris : CNRS, pp. 107-120.
Fuchs, C. 1991b. Polysémie, interprétation et typicalité : l'exemple de "pouvoir". In D. Dubois (ed.), Sémantique et cognition : catégories, prototypes, typicalité, Paris : CNRS, pp. 161-170.
Fuchs, C. & P. Le Goffic. 1983/1985. Ambiguïté, paraphrase et interprétation. Modèles Linguistiques, Lille : Presses Universitaires, V : 2, pp. 109-136 / VII : 2, pp. 27-51.
Fuchs, C. & B. Victorri (eds.). 1988. Vers un traitement automatique de la polysémie grammaticale. T.A. Informations, Paris : Klincksieck, 29.
Gorcy, G. 1990. La polysémie verbale, ou le traitement de la polysémie de sens : discussion à partir des normes rédactionnelles du TLF. Cahiers de Lexicologie, XXX, 56 : 1/2, pp. 109-122.
Guillaume, G. 1964, rééd. 1969. Langage et science du langage, Paris : Nizet, et Québec : Presses de l'Université de Laval.
Huot, H. 1974. Le verbe "devoir" : étude synchronique et diachronique, Paris : Klincksieck.
Kerbrat-Orecchioni, C. 1991. L'acte de question et l'acte d'assertion : opposition discrète ou continuum ? In C. Kerbrat-Orecchioni (ed.), La Question, Lyon : Presses Universitaires, pp. 87-111.
Lakoff, G. 1987. Women, fire and dangerous things, University of Chicago Press.
Lakoff, G. & M. Johnson. 1980. Metaphors we live by, University of Chicago Press.
Launay, M. 1986. Effet de sens, produit de quoi ? Langages 82, Paris : Larousse, pp. 13-39.
Leech, G. 1983. Principles of pragmatics, London, New York : Longman.
Legendre, G. 1989. Unaccusativity in French. Lingua 79, pp. 95-164.
Legendre, G., Y. Miyata, & P. Smolensky. 1990. Can connectionism contribute to syntax ? Harmonic grammar with an application. Chicago Linguistic Society 261, pp. 1-15.
Le Goffic, P. 1981. Ambiguïté linguistique et activité de langage. Thèse de Doctorat d'Etat de l'Université Paris VII.
Le Goffic, P. 1988. Tensions antagonistes sur les systèmes : les rapports entre synchronie et diachronie. In A. Joly (ed.), La linguistique génétique : histoire et théories, Lille : Presses Universitaires, pp. 333-341.
Lyons, J. 1981. Language, meaning and context, Fontana Paperbacks.
Martin, R. 1972. Esquisse d'une analyse formelle de la polysémie. Travaux de Linguistique et de Littérature 10, Strasbourg, pp. 125-136.
Martin, R. 1979. La polysémie verbale, esquisse d'une typologie formelle. Travaux de Linguistique et de Littérature 17, Strasbourg, pp. 251-261.
Martin, R. 1985. Ambiguïté, indécidabilité et non-dit. In C. Fuchs (ed.), Aspects de l'ambiguïté et de la paraphrase dans les langues naturelles, Bern : Lang, pp. 143-165.
Picoche, J. 1986. Structures sémantiques du lexique français, Paris : Nathan.
Pottier, B. 1980, rééd. 1992. Théorie et analyse en linguistique, Paris : Hachette.
Pottier, B. 1987. Seconde intervention à la table-ronde sur le vague. Quaderni di Semantica, Bologna : Il Mulino, VIII : 2, pp. 306-310.
Rastier, F. 1987. Sémantique interprétative, Paris : Presses Universitaires de France.
Sadock, J.M. 1986. Vagueness as a vague concept. Quaderni di Semantica, Bologna : Il Mulino, VII : 2.
Thom, R. 1974. Modèles mathématiques de la morphogenèse, Paris : Bourgois.
Vandeloise, C. 1991. Autonomie du langage et cognition. Communications 53, Paris, pp. 69-101.
Victorri, B. 1988. Modéliser la polysémie. In C. Fuchs & B. Victorri (eds.), pp. 21-42.
Victorri, B. & C. Fuchs. 1992. Construire un espace sémantique pour représenter la polysémie d'un marqueur grammatical : l'exemple de "encore". Linguisticae Investigationes, Amsterdam : Benjamins, XVI : 1.
PART II MODELLING ISSUES
WHAT KIND OF MODELS DO WE NEED FOR THE SIMULATION OF UNDERSTANDING ?
DANIEL KAYSER
University of Paris XIII (LIPN, URA 1507 CNRS), France
After having sketched the framework within which I wish to confine the debate, I would like :
- to show that the choice between discrete and continuous models for the representation of semantic information is not a matter of necessity as far as Natural Language is concerned ;
- to explain that most of the arguments in favour of continuous models are actually arguments in favour of an (a priori unbounded) multiplicity of models, which by no means implies continuity ;
- to stress some advantages that discrete models have over their continuous counterparts.
1. The Framework
Natural Language Understanding is one of the most salient manifestations of Intelligence, and therefore, simulating this activity is a major challenge for Artificial Intelligence. However, understanding is not a process yielding a well-defined product ; nor is its completion a clear-cut event. A better goal, even if still not defined with absolute exactness, is to try to replicate the set of conclusions which are regarded as unproblematic by humans belonging to a given cultural group, when exposed to a short text. Psychological experiments could (and probably already do) confirm that for commonplace texts, there exists a large and stable set of conclusions shared by every bona fide reader. The synthesis of a function : text → set of conclusions obviously requires intermediate steps. One of the usual steps is the translation of the text into some internal representation(s) to which an inference operator can be applied. The question is whether this (these) representation(s) should be sought in a discrete form (e.g. formulas in a language) or in a continuous form (e.g. functions operating on real numbers).
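The two-step scheme just described (translation into an internal representation, then application of an inference operator) can be sketched in its discrete variant as follows. Everything in the sketch is an assumption made for illustration : the toy sentence, the lookup table standing in for a real analyser, the single rule, and the names translate, dogs_are_animals and conclusions.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Formula = Tuple[str, str]  # (predicate, argument), a discrete formula
Rule = Callable[[FrozenSet[Formula]], Iterable[Formula]]

# Hypothetical translation table standing in for a real analyser.
TRANSLATIONS = {
    "rex is a dog": frozenset({("dog", "Rex")}),
}


def translate(text: str) -> FrozenSet[Formula]:
    """Step 1: text -> internal (here discrete) representation."""
    key = text.strip().rstrip(".").lower()
    return TRANSLATIONS.get(key, frozenset())


def dogs_are_animals(facts: FrozenSet[Formula]) -> Iterable[Formula]:
    """Hypothetical inference rule: dog(x) entails animal(x)."""
    return [("animal", arg) for pred, arg in facts if pred == "dog"]


def conclusions(text: str, rules: Iterable[Rule]) -> FrozenSet[Formula]:
    """Step 2: apply the rules until no new formula is produced."""
    rules = list(rules)
    facts = translate(text)
    while True:
        derived = set(facts)
        for rule in rules:
            derived.update(rule(facts))
        if derived == facts:
            return facts
        facts = frozenset(derived)


if __name__ == "__main__":
    print(sorted(conclusions("Rex is a dog.", [dogs_are_animals])))
```

A continuous variant would keep the same overall signature and replace the set of formulas by, say, a vector of real numbers ; the question raised in the text is which of the two intermediate forms serves the function better.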
2. Against Arguments of Necessity
We shall first argue that the issue is basically contingent, that is : some criteria have to be defined, and experiments must tell which of the discrete or continuous models proposed so far (or which might be proposed in the foreseeable future) better satisfy these criteria. The discussion in this section will essentially be negative, i.e. we attempt to refute some of the most typical reasons which have been given in favour of a necessary choice, whether for discreteness or for continuity.
a) Ontological Arguments for Continuity
The argument goes more or less like the following pseudo-syllogism :
The world is continuous
Language is about the world
Semantic models must be continuous
I believe there are at least five weaknesses in this way of reasoning :
i) The world is continuous : a metaphysical commitment is hidden in this statement ; to be continuous is a property of a model, not of a physical entity ; unless we identify the world with a model, the statement is ill-formed.
ii) Language is about the world : at best, language is about how we perceive the world, not about the world as it actually is (or, with respect to the above, as the current models of the world describe it). Therefore, our syllogism should read :
We perceive the world as continuous
Language is about the way we perceive the world
Semantic models must be continuous
iii) We perceive the world as continuous : this sentence is rather ambiguous ; does it mean that our perception organs provide us with continuous inputs (the partial answers given so far by neuroscientific evidence seem negative), or that our consciousness uses these inputs, whatever their nature, to obtain a phenomenal experience that has the features of continuity ? I am not certain that the latter reading really makes sense. If it does, it clearly belongs to the realm of Psychology, where it might be backed up, e.g. by the experiments on mental rotation (Kosslyn 1980) ; but we must be very careful here not to confuse continuity with (possibly unbounded) multiplicity (this distinction will be presented in more detail below in § 3). True continuity would suppose the ability, at least in principle, to recognize as distinct an infinite number of phenomenal experiences.
iv) Language is about the way we perceive the world : this view is, at the very least, oversimplistic. Language is a convention, and therefore, even if we talk about what we perceive, this requires a transformation which might well affect the very nature of what is eventually talked about. The so-called Sapir-Whorf thesis went as far as to assert that linguistic conventions somehow regiment our perceptions. Even if the thesis sounds too extreme, it stresses the fact that the relation between perception and language is far from being a one-way mapping. To be more precise, I may use a sentence which clearly admits a continuous model, e.g. :
(1) the car is getting nearer
without having experienced any feeling of continuity : it may be the case that I merely had two glimpses, from which I decided that (1) was the best way to reflect my perceptions. Conversely, even if there is such a thing as an experience of true continuity, I may choose to express it by a statement involving no "progressive" predicate (e.g. "it is brighter outside than inside").
v) Ergo, semantic models must be continuous : this reminds me of an argument given by Aaron Sloman some years ago : tornados are wet ; hence computer models of tornados should be wet too ! Obviously, a model must take into account some aspects of the phenomenon it is the model of, but it does not have to replicate every aspect of that phenomenon ; otherwise, the only possible model would be identity, and the model would be totally worthless. Therefore, even if it were agreed that language reflects something continuous, this would not necessarily entail that the model of this thing should respect the feature of continuity.
Another point is timely at this stage of the discussion : even continuous models are described in discrete terms ! The function f(x) = cos x, where x ∈ R, is continuous, although it is easier to represent it by a (discrete) sequence of symbols than by a (continuous) two-dimensional curve.
b) Ontological Arguments for Discreteness
The argument here goes like this :
Language is discrete
The desired set of conclusions is discrete
The intermediate steps must be discrete
The premises can raise some objections : prosody is an important part of language, and it is not discrete ; membership in the set of conclusions is hard to decide by yes or no : a (possibly continuous) degree of confidence might better reflect human attitudes.
However, the objections are not crucial : in most situations, the set of conclusions will not change whether the text is given in oral or in written form ; therefore the continuous aspects of the input hardly play a role in the phenomenon under investigation. Similarly, even if some reluctance can be observed from some subjects towards some conclusions of the set, there seems to be an overall agreement on a sufficiently large subset for the graded membership problem to be safely ignored.
What is however obviously wrong in the above reasoning is the jump to the conclusion. It is often the case, in Operations Research, that in order to solve a problem on a discrete set, the solution of the same problem on a continuous set provides invaluable help. Similarly, as it were, in order to go from the literal input "two thousand and eighty-eight plus three hundred and thirteen" to the literal output "two thousand four hundred and one", the conversion into Arabic numerals might be appropriate ! The only lesson to draw from this discussion is that, if in order to go from the discrete to the discrete, a conversion into the continuous proves necessary, there must exist strong arguments to justify this qualitative detour.
c) Computational Arguments against Discreteness
Let us briefly break down a last argument :
Discrete models may receive a logical interpretation
The kind of Logic required by the task is inherently incomplete
Discrete models should be discarded
Here again, there is nothing wrong in the premises. The inferences that we wish to perform could possibly be regarded as the deductive closure in some logical system. Clearly, Natural Language Semantics requires higher order Logics, which, since Gödel, are known to be inherently incomplete. The error is in the conclusion : no human being has ever been considered as "complete", and obtaining the set of conclusions generally agreed upon by humans does not require a complete theorem prover. Therefore, the incompleteness of the Logic has no impact on the feasibility of the task.
3. Continuity versus Multiplicity
The discussion so far shows that no necessity forces the choice of the Semanticist towards either continuous or discrete models. The matter is therefore purely contingent. Which of these models (or which blend of them) will give the best results ? As will be discussed shortly, "best" can be understood in various ways. But let us first consider the more important issue of why continuity appears to so many scholars as an interesting candidate.
We believe the appeal that continuity exerts on Semanticists to be more the result of a repulsion inspired by the philosophical truth-based theories than of a genuine attraction to continuous functions. Now it is possible to reject these theories without embracing ipso facto a continuist point of view. In other words, there is more in discrete approaches than what the tradition has put into them. By "tradition", we henceforth mean the trend which extends, despite countless variants, from Montague and Davidson back to Russell, Wittgenstein's Tractatus, Frege ... and Aristotle ; this trend is described in various textbooks, e.g. (Chierchia & McConnell-Ginet 1990).
Of course, if the position is taken that Semantics must describe "meaning" as, say, Botany describes plants, it is difficult not to concede that there must be some semantic support (the term semantic space is already biased in favour of continuity) where meanings are "located", and that meanings behave in this support either in a discrete way (points, atomistic constructions, or whatever) or in a continuous manner (blobs with crisp or fuzzy boundaries, fields, etc.). There seems to be no third option.
But, as every attempt so far to represent meaning yields paradoxes, we may question the basic postulate, i.e. wonder whether the construction of a semantic theory requires as a prerequisite that "meaning" be taken as an ontologically well-defined object. I have developed elsewhere (see, e.g. Kayser 1991) various arguments concluding that meaning certainly is a convenient appellation in ordinary speech, but could not be a scientifically viable notion. Briefly summarized, the idea is that the requirements which it seems normal to expect of the meaning of a text can only be met by the text itself. A metaphor, which must not be taken too seriously, consists in comparing meaning with fire. The early ideas that fire was an element (phlogiston) have had to be abandoned, and we now speak of the physico-chemical process of combustion ... but we still (rightly) use the word "fire" in our everyday life as if its reference was an object-like entity.
Similarly, an object-like representation of "meaning" might work fairly well in coarse semantic theories, but we had better forsake this view if we want to account for the process of comprehension.
Notice that the use of representations is not at stake here. The arguments developed in section 5 below can furthermore be understood as so many supports for a "representational" view of intelligence, thereby contradicting approaches such as (Brooks 1991) which promote the idea of intelligence without representations. What is at stake is whether any of the representations used for simulating the comprehension of a text deserves to be called the meaning of the text. Our negative answer allows us to open a third option in the debate, namely that using several representations, none of them claiming to be a representation of the meaning, could avoid the paradoxes both of discreteness and of continuity in a semantic space. This explains the title of the present section : the repellent effect of the philosophical "tradition" in Semantics might lead, instead of choosing a continuous semantic support, to adopting a multiplicity of representations.
Before comparing the merits of this view with the other options, let us delve a little more into its consequences. The problem that we tackle, viz. to go from a text to a set of consequences, can be solved if we find the right level at which to factor out the inferential behaviour. There are obviously regularities in the problem, hence storing with each text its corresponding set of consequences would, beyond its practical impossibility, provide no enlightenment on the question. The classical method (translate the text into some logical form where the inferential behaviour is fully defined), while certainly a better idea, still harbours serious deficiencies. A consequence of the multiple-representation approach is to allow several levels at which partial inferential behaviours can be described.
Another metaphor seems appropriate here. Imagine a situation where an excursion has to be planned. No geographical map gives an accurate account of reality (we already observed that identity, i.e. a map at scale one, is not a representation : it would take as long to "travel" on the map as to travel in the world !). It must then be decided up to which level the details can be ignored, and there is no uniform answer : if a prior analysis shows that motorways, local roads and footpaths will successively be taken, the planning activity will use maps of different areas at various scales, none of which is the exact counterpart of reality. Unexpected situations during the excursion itself might require, for a given area, a different map from the one anticipated (e.g. if a highway is blocked, with no diversion signposted).
Returning to the field of language comprehension, this metaphor suggests, for instance, that crude lexical information can be provided for each word. If the text is easy, this may prove sufficient. However, if the style is ornate, or if some unexpected problem arises (e.g. a difficult anaphora), access must be possible to a more elaborate set of lexical information. The important point is that none of the sets of lexical information is said to represent the full meaning of the word. An inadequacy detected in a (discrete) representation can be remedied by calling into play another, still incomplete (discrete) representation. A similar comment could be made about the access to grammatical information ; on this view, there is no need to provide at any moment an allegedly complete set of grammatical information for the full language.
This development was intended to show that many arguments against discrete Semantics are in reality directed against a discrete space of meanings, but not against discrete representations, if a multiplicity of representations is allowed. In any case, they are not arguments for continuity.
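The idea of calling a more elaborate lexical representation into play only when a cruder one proves inadequate can be sketched as a layered lookup. This is a minimal illustration under stated assumptions : the word "bank", its two hypothetical entries, and the names LEXICON_LEVELS, lookup and is_adequate are invented for the example, and the adequacy test is left to the caller because the text does not say how inadequacy is detected.

```python
# Hypothetical lexical entries, ordered from crude to more elaborate.
# Neither level is meant to be "the meaning" of the word.
LEXICON_LEVELS = [
    {"bank": {"financial institution"}},
    {"bank": {"financial institution", "edge of a river", "building"}},
]


def lookup(word, is_adequate):
    """Return (level, entry) for the first level whose entry is adequate.

    is_adequate is a caller-supplied test standing in for whatever
    detects an inadequacy during comprehension. Returns None when no
    available level passes the test.
    """
    for level, lexicon in enumerate(LEXICON_LEVELS):
        entry = lexicon.get(word, set())
        if is_adequate(entry):
            return level, entry
    return None


if __name__ == "__main__":
    # Easy text: the crude entry is enough.
    print(lookup("bank", lambda entry: "financial institution" in entry))
    # Harder text: the crude entry fails, so the finer one is consulted.
    print(lookup("bank", lambda entry: "edge of a river" in entry))
```

Both levels are discrete and both are incomplete ; adding a third level would not require any of them to become a space of meanings.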
4. About criteria
We began to describe an approach, and we now need to compare it with other perspectives on the same problem, namely on the one hand what we called the "traditional" approach in Semantics, and on the other hand the "continuist" point of view (which, by the way, though less traditional, should not be considered as new : for instance, [Lyons 1978] presents the theory of semantic fields of the 1920s and 30s as having been influenced by Humboldt and even by Herder, 1744-1803). The rules of the game should thus first be stated.
As in other scientific domains, the comparison of theories follows a rather simple scheme : accuracy of the agreement with experimental evidence, range of the domain covered, ontological simplicity (Occam's razor). In theoretical Physics, the quest for unified theories has moreover put forward the idea of minimizing the number of arbitrary constants required.
We add here one more criterion, which should not be regarded as of mere pragmatic significance, but is as important as the previous ones : the computational complexity of the theory. As a matter of fact, recent work in complexity theory shows that computations belong intrinsically to some class of complexity, i.e. the class is independent of the material details of the (electronic or whatever) device on which the computation is performed. As our goal is to synthesize a function, the theory in which the function happens to be of lower complexity should be preferred.
The introduction of this criterion has the important consequence of "neutralizing", as it were, possible equivalence theorems. As a matter of fact, it might be conjectured that for any continuous model (possibly taken in some restricted class, but sufficiently broad to be of interest for Semantics), there exists a discrete model yielding the same behaviour, and vice versa.
Even if such a conjecture is proven in the future, this will not settle the issue, unless the model and its translation belong to the same complexity class.

Notice that we leave learnability out of the picture. It is obviously an important criterion for the cognitive plausibility of a semantic model (and continuous models are certainly ahead at present in this respect), but this criterion is irrelevant for the task to which we have restricted the debate.

5. Advantages of (a multiplicity of) Discrete Models over Continuous Models
a) Labelling facts and Meta-Reasoning
Let us first elaborate on an argument once given by Marvin Minsky (Minsky 1975) : symbolic models are superior, because symbolic labels coupled with facts
provide an indication on how each fact has been obtained, while numeric labels lack that possibility. This argument looks rather technical and superficial at first sight, but I think it is nevertheless an important one.

Most inferential behaviour depends not only on the propositional content of the data to which it applies, but also on what we might call the "modality" of the data. All the well-known arguments showing the weaknesses of extensional semantics bear witness to the fact that two sentences having the same propositional content may nevertheless license different sets of consequences. Modal and intensional logics have been developed precisely to solve otherwise intractable puzzles of Natural Language Semantics. Very crudely, what they amount to is providing formal sentences with labels or indices (usual modalities include, e.g., the labels L, read "necessary", and M, read "possible").

Of course, instead of symbolic modalities, one could think of numerical labels, the numbers being taken in a continuous range (generally, the real interval [0,1]). Continuity seems to add more flexibility to the use of modalities (for instance, a distance between modalities can be defined, and this opens the possibility of defining a distinct modality arbitrarily close to some modality). However, this advantage (if it is really an advantage, see below) is offset by two serious problems.

First, there is a theorem, first proven in 1940 by J. Dugundji (see Hughes & Cresswell 1968), which states that no finite multi-valued system can have the same theorems as any of the classical modal systems. This theoretical problem might be of little practical significance if only short deductions are performed (a reasonable assumption when we try to simulate human performance). The second argument is precisely Minsky's statement.
Some inferences take into account not only the factual content of the premises but also their status, which is more or less equivalent to the way the content has been obtained. And, as Minsky points out, labelling a fact with a mere number erases the possibility of tracing back its origin, while a symbolic modality leaves more readable imprints. Obviously, there is an equivalence theorem here, and a numeric code can always be found which provides exactly the same information as the symbolic one, but this is a case where introducing numbers would be rather artificial, and would probably yield code-deciphering algorithms of higher complexity.

To sum up, finding the set of acceptable conclusions from a text is a task which is likely to require not only object-level reasoning (reasoning based on the content) but also meta-level inferences (which are grounded in the way the content has been obtained). We claim here that meta-level inferences are easier if the facts are equipped with symbolic labels (modalities) rather than with numeric ones.

This argument concerns the input of the meta-reasoning, but does not say anything about the continuous vs. discrete nature of the meta-reasoning itself. It
can only be noticed here that even connectionist methods take their inputs as a list of discrete features.

b) Qualitative Physics
Let us turn now to the "object level" of the reasoning. Language often refers to domains where continuous physical models are available, and these models are then known to provide the most accurate answers. For example, spatio-temporal inferences are necessary in virtually every text comprehension task, and the best physical models of both time and space are continuous. As said earlier, this does not entail that continuous models are necessary to represent these domains, but it might be a hint that these models could prove empirically more effective. This is the assumption that we now challenge. Our argument goes along the following lines :
- It is impossible to extract from a text the value of every parameter needed to run the equations of a continuous model ; the text only sets constraints on these parameters ; consequently, the equations should in principle be solved for the (infinite) set of parameter values satisfying the constraints. Therefore, even if, for otherwise good reasons, the continuous model were preferable, it would yield intractable calculations. What is needed, however, in order to get the conclusions of the text, is not the result of the equations for each run, but a characterization of the set of results thus obtained. The situation is somewhat similar in various engineering fields, where the desired result is not any definite value, but either an order of magnitude, or the direction of a variation when the input parameters vary within a given range. In order to solve such problems, a body of methods has been developed under the name of qualitative physics (e.g. DeKleer & Brown 1986 ; Raiman 1989). The interesting thing here is that they use discrete tools and provide, more economically than the usual differential equation solvers, qualitatively accurate solutions for a range of situations.
Admittedly, there are situations where they give no answer, but this is due to too coarse a sampling of the parameters. Having, as discussed in section 3, several qualitative models at various degrees of granularity (Falkenhainer & Forbus 1988) might increase the number of situations where an interesting result is found by this kind of model.
- But there is a more fundamental issue here. Obviously, when humans understand a text, they do not solve differential equations. This does not count as an argument as long as we do not care about Cognitive Modelling, i.e. as long as what we want is the same conclusions as humans, but not necessarily obtained in the way humans get them. However, the difference is not only in the methods : there are also discrepancies in the results. Of course, in ordinary situations, adults do not infer conclusions which differ blatantly from Physics. But language clearly refers more to pre-theoretical constructs than to contemporary physics. Several works in A.I.
have tried to represent naive physics (Hayes 1978 ; Hobbs & Moore 1985) rather than qualitative physics. And once more, it can be observed that these naive physics have been developed in a discrete framework. In any case, no continuous theory is available as far as naive physics is concerned.

To sum up, simulating the function which provides a set of conclusions from a text seems easier to achieve at the level(s) of granularity which correspond(s) to the categories present in the language itself. Forcing the reasoning to take place at finer levels is possible, but requires a huge amount of computation, which can be short-cut by the use of qualitative (hence discrete) techniques. Moreover, it is not the result of these computations which is desired, since there is no evidence that humans, reasoning about the physical world, do so according to the actual laws of Physics.

c) Vagueness
Furthermore, language often refers to entities which have no well-defined reference in the physical world. Without mentioning fairy tales or other fictitious worlds, it is easy to see that most concepts do not admit a precise definition in terms of physical measurements. Such concepts are sometimes said to be vague. Although the term might not be felicitous (the vagueness comes only from a normative view of Physics), let us discuss the solutions which have been proposed to cope with this vagueness of language.

One of the best-known attempts is the use of fuzzy sets. Zadeh (1975) considers for instance words such as young. Assume that, in a given cultural area, a person younger than 25 is definitely considered as young, and a person above 50 is definitely considered as not young. Zadeh would provide a statement such as "person x is young" with a truth value in the [0,1] interval, this truth value being 1 for age(x) < 25, 0 for age(x) > 50, and subject to a continuous variation when age(x) varies in the interval [25,50].
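As an illustration only, the truth-value assignment just described can be sketched in a few lines of Python. The function name young_truth, the parameter names and the linear interpolation between 25 and 50 are assumptions made for this sketch ; the description above only requires that the variation be continuous on that interval.

```python
def young_truth(age, lower=25.0, upper=50.0):
    """Truth value in [0, 1] of the statement 'person x is young'.

    Assumption of this sketch: the value decreases linearly from 1 at
    `lower` to 0 at `upper`; any other continuous curve would also fit
    the informal description.
    """
    if age <= lower:
        return 1.0  # definitely young
    if age >= upper:
        return 0.0  # definitely not young
    # Linear interpolation on the intermediate interval.
    return (upper - age) / (upper - lower)


if __name__ == "__main__":
    for age in (20, 25, 37.5, 50, 60):
        print(age, young_truth(age))
```

The two thresholds are kept as parameters because they depend on the "cultural area" considered ; nothing in the sketch says how they should be chosen.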
This solution raises several difficulties :
- what is the form of the continuous variation ? The simplest form, linearity, implies a discontinuity in the derivative ; sigmoid functions avoid this defect, and Zadeh therefore chooses them, but they require the introduction of more arbitrary parameters ;
- what is the meaning of a statement such as, say, "'John is young' is .736 true" ? That if a large number of people, in the cultural area concerned, are asked to scale the statement "John is young", the average (or the mode ?) will be .736 ? Or that, if they are asked to answer by yes or no, the percentage of positive answers will be 73.6% ?
- how can this approach be extended to those predicates for which no measurable variable exists ? (it is rather safe to assume that youth depends on
the measurable variable age, but what about beauty ? Zadeh nevertheless provides a similar analysis for beautiful as for young) ;
- having a truth value for a statement naturally leads one to derive new truth values for logically related statements. For instance, it is reasonable to assume that the truth values of "John is young" and of "John is not young" are not independent, e.g. that they sum to unity ; but paradoxes are then known to creep in (Osherson & Smith 1981).

These problems have been listed as a reminder that, even if "vagueness" has intuitive connections with continuity, no obvious continuous treatment of it is to be expected. More qualitative approaches (such as supervaluation, or Wright's modal treatment of the related sorites paradox [Wright 1991]) are not flawless either.

In our view, the real issue concerns the articulation between various levels of granularity : as a matter of fact, young is not vague in a rough model where people are ontologically either young or old. On the other hand, the usefulness of an imprecise predicate, such as young, tends to disappear in models where the age of every person is known. An account of youth is however useful if we link together several models, with for instance inferences in one model depending only on the fact that people are young, while inferences in another one use the more accurate notion of age. We therefore consider that vagueness is not a case for continuity, but for articulating a multiplicity of heterogeneous models.

d) Thresholds and Trade-offs
One of the most attractive features of continuous models is their ability to "weight" various arguments. In other words, if separately neither argument A nor B is enough to accept conclusion C, their conjunction can nevertheless be viewed as a sufficient support for the conclusion. Conversely, if A alone is a sufficient reason for C, this might no longer hold if B is present.
While the first situation has an obvious translation in ordinary logic, $([A \wedge B] \rightarrow C)$, the second contradicts a basic principle of logic, viz. monotonicity, i.e. the ability to derive at least as many consequences from a superset of a set S as from S itself. In any case, the phenomenon under investigation, the derivation of a set of conclusions from a text, is inherently non-monotonic, because it is easy to add a sentence to the text which forces one to retract an otherwise admissible conclusion.

Now, the second situation is easily representable in non-monotonic logics. We adopt here R. Reiter's "logic for default reasoning" (Reiter 1980). The representation is :

$$\frac{A : \neg B \wedge C}{C}$$
which reads : "if A is believed and if it is consistent to believe $\neg B \wedge C$, then C is believed". As soon as a proof for B exists, the reasoning yielding C is no longer warranted, and its conclusion, C, must be withdrawn.

This solution can easily be generalized to an arbitrary number of arguments and counter-arguments, where only a given threshold allows for some conclusion. But the situation is reversed, compared to a) above : the computations are definitely easier on the numerical side than on their logical counterpart. The empirical issue is whether the threshold mechanisms are not merely, in several cases, convenient approximations, and whether a more careful analysis would not reveal a more stratified nature of the decision process.

By "stratified" we mean that arguments can be ordered in layers, and that the presence of an argument belonging to a higher layer is enough to counteract the effect of arguments of lower layers. This is a rather strong conjecture which, if taken absolutely, may be reminiscent of the most dogmatic approaches to Linguistics. But recall that we allow for exceptions at any level, i.e. that every conclusion can be retracted in the presence of new evidence. Moreover, the alternative is to accept that adding a sufficiently large number of "low-level" arguments can always reverse any decision, and this alternative does not sound really more palatable.

It can be objected here that our discussion shifted without warning from models to decision processes. This objection would have some force if we had granted the models some ontological relation with the world ; but as we earlier rejected the idea that models could possibly represent meaning, their only purpose is to enable decision making, and there is thus nothing wrong in arguing for representing (discrete) strata in models on the basis of a stratified decision process.

e) Consistency
The last point by which we contrast discrete and continuous models concerns the notion of consistency.
An advantage of using a multiplicity of symbolic models is that in each of these models, consistency can be maintained. This feature does not seem easy to achieve in a continuous model : if some configuration of the system is eventually to be interpreted as the presence of p in the set of conclusions, while some other configuration is interpreted as the presence of $\neg p$, is there any general mechanism which guarantees that both configurations will not be simultaneously present in some possible state of the system ? This need not bother us as long as it concerns contradictions which are not perceptible by humans. After all, humans are certainly not consistent, but they tend to be, at least in "Western cultures", locally consistent, say ; that is, they feel embarrassed if they arrive at two conspicuously incompatible assertions.
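A minimal sketch, in Python, of the kind of symbolic bookkeeping at stake : it detects the simultaneous presence of p and of its negation in a set of conclusions, and uses this test to apply once the default rule of subsection d) (from A, conclude C unless B is established). The string encoding of literals ('~p' for the negation of p), the function names, and the reduction of "consistent" to "no complementary pair of literals" are all assumptions of this sketch ; Reiter's logic relies on a full consistency check over formulas, which is not implemented here.

```python
def negate(literal):
    """Return the complementary literal: 'p' <-> '~p' (hypothetical encoding)."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def conspicuous_conflicts(literals):
    """Atoms occurring both positively and negatively in a set of literals."""
    return {
        lit
        for lit in literals
        if not lit.startswith("~") and negate(lit) in literals
    }


def apply_default(beliefs, prerequisite="A",
                  justification=("~B", "C"), consequent="C"):
    """Apply the default  A : ~B & C / C  once to a set of believed literals.

    Assumption: 'it is consistent to believe ~B & C' is approximated by
    'adding ~B and C creates no complementary pair of literals'.
    """
    beliefs = set(beliefs)
    if prerequisite not in beliefs:
        return beliefs  # the rule does not apply
    if conspicuous_conflicts(beliefs | set(justification)):
        return beliefs  # justification blocked: C is not concluded
    return beliefs | {consequent}


if __name__ == "__main__":
    print(sorted(apply_default({"A"})))       # A alone
    print(sorted(apply_default({"A", "B"})))  # A together with B
```

Calling apply_default first on {"A"} and then on {"A", "B"} shows the non-monotonic behaviour discussed in d) : the conclusion C obtained from A alone is no longer obtained once B is added.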
Our view of multiple symbolic models tends to be consonant with this phenomenon. We believe that, while expressing him/herself, a speaker invites his/her hearer to build models, and more precisely, that the syntagms he/she uses refer to entities of a model. But, in opposition to the "traditional" view of Semantics, we do not believe in the uniqueness of the model : many examples can be found where the speaker deliberately plays simultaneously on several models. Each model respects some notion of consistency.

Now, the computational price to pay for maintaining consistency is known to be high (consistency-checking is an NP-complete problem, i.e. it is likely to require inherently a computation time which grows exponentially with the length of the formulas), and we have already accepted this kind of cost as a strong argument against a theory. However, Fahlman (1979) has found very quick algorithms (they run mostly in constant time) to detect obvious inconsistencies in taxonomies. We have tried to extend these algorithms to more complex semantic representations (see e.g. Coupey 1989), and we conjecture that reasonably rapid procedures can detect what we called above the "conspicuous incompatibilities", the bad theoretical results for a complete check notwithstanding.

6. Conclusion
We dismiss the idea that semantic models have, by necessity, to be of a given kind, and we therefore set the debate on empirical ground : which kind of model happens to be accurate, to have a broad scope of applicability, and to be economical at one and the same time in terms of its ontology, of the constants it needs, and of the computations it involves ?

As far as accuracy is concerned, we side with the proponents of continuity in adopting a critical point of view against what we called "traditional" Semantics (see e.g. [Kayser 1990] for further discussion). We attempted here to show that this criticism does not, however, provide positive arguments for continuous models, and we explained that multiple discrete models could be at least as good candidates as continuous ones to overcome the inadequacies pointed out. Finally, we argued that "at least as good" was in fact "better".

Of course, this last part contains no indisputable argument. Nor does the list a)-e) exhaust the defence of discrete models. Exhibiting a discrete model which satisfies all our expectations would be a much stronger move, but anybody who has worked on the problem realizes how foolish it would be to hope for such an achievement in the foreseeable future.

An unprejudiced assessment of the currently available models is next to impossible. From the continuist point of view, the discrete models have hit their ceiling, and the purported recent advances are not significant. From the discrete point of view, only the very low starting point of continuous models explains the appearance of quick progress, and the state reached so far is still globally far below the level of the disparaged tradition, even if some local results are impressive. With some perfidy, the proponents of discrete models might add that, although analog computers have long existed, these results have been obtained on regular (symbolic) computers, because of their greater versatility, of the existence of better dialog utilities, debugging aids, etc. Does this count as an unintentional homage to the superiority of discrete devices ?

As a more equitable conclusion, we would like to state that neither discrete nor continuous models are likely to solve all the puzzles raised by Semantics. As we insisted on regarding the issue as empirical, nothing forbids syncretism, at least as a provisional escape, until either an improbable decisive victory of one side over the other, or, more likely, the gradual discovery by each side of its (possibly empty) ecological niche.
Acknowledgements
The author gratefully acknowledges fruitful discussions with Françoise GAYRAL and François LÉVY.
REFERENCES
Brooks, Rodney A. 1991. Intelligence without representation. Artificial Intelligence vol. 47 nos 1-3, pp. 139-159.
Chierchia, Gennaro & Sally McConnell-Ginet. 1990. Meaning and Grammar. An Introduction to Semantics, M.I.T. Press.
Coupey, Pascal. 1989. Étude d'un réseau sémantique avec gestion des exceptions. Interprétation logique et implantation informatique. Thèse d'Informatique. Université Paris-Nord.
DeKleer, Johan & John Seely Brown. 1986. A Qualitative Physics based on Confluences. Artificial Intelligence vol. 24 nos 1-3, pp. 7-83.
Fahlman, Scott E. 1979. NETL - A System for Representing and Using Real-World Knowledge, M.I.T. Press.
Falkenhainer, Brian & Kenneth D. Forbus. 1988. Setting up Large-Scale Qualitative Models. Proceedings AAAI-88.
Hayes, Patrick J. 1978. The naive physics manifesto. In D. Michie (ed.), Expert systems in the micro-electronic age, Edinburgh University Press (see also The Second Naive Physics Manifesto in Hobbs & Moore 1985, pp. 1-36).
Hobbs, Jerry R. & Robert C. Moore (eds.). 1985. Formal Theories of the Commonsense World, Ablex.
Hughes, George Edward & Maxwell J. Cresswell. 1968. An Introduction to Modal Logic, Methuen.
Kayser, Daniel. 1990. Truth and the Interpretation of Natural Language : a Nonmonotonic Variable-depth Approach. Proc. E.C.A.I.-90, Stockholm, pp. 392-397.
Kayser, Daniel. 1991. Meaning Representation vs. Knowledge Representation. In N. Cooper and P. Engel (eds.), New Inquiries into Meaning and Truth, Simon & Schuster, pp. 163-186.
Kosslyn, Stephen M. 1980. Image and Mind, Harvard University Press.
Lyons, John. 1978. Eléments de sémantique, Larousse.
Minsky, Marvin. 1975. A Framework for Representing Knowledge. In P.H. Winston (ed.), The Psychology of Computer Vision, McGraw Hill, pp. 211-277.
Osherson, Daniel N. & Edward E. Smith. 1981. On the adequacy of prototype theory as a theory of concepts. Cognition vol. 9, pp. 35-58.
Raiman, Olivier. 1989. Le raisonnement sur les ordres de grandeur. Revue d'Intelligence Artificielle vol. 3 n° 4, pp. 55-67.
Reiter, Raymond. 1980. A Logic for Default Reasoning. Artificial Intelligence vol. 13 nos 1-2, pp. 81-132.
Wright, Crispin. 1989. The Sorites Paradox and its Significance for the Interpretation of Semantic Theory. In N. Cooper and P. Engel (eds.), New Inquiries into Meaning and Truth, Simon & Schuster 1991, pp. 135-162.
Zadeh, Lotfi A. 1975. Fuzzy Logic and Approximate Reasoning. Synthese vol. 30, pp. 406-425.
CONTINUUM, COGNITION AND LINGUISTICS
JEAN-MICHEL SALANSKIS
CNRS (UMR 17, Paris), France

In this study, we consider continuum primarily as being the significate of a particular language activity which is privileged in many respects, viz., mathematical activity. This activity which, today, is carried out within the framework of formal deontology, assumes the more specialized name of real or complex analysis when it is directed towards continuum. Over a recent period, it has produced an immense amount of knowledge, of which one section, considerable in itself, comes under the heading of differential geometry. Many aspects of this knowledge are invested in twentieth-century mathematical physics, to the utmost glory and greatest success of the latter : the alliance between differential geometry and physics, contracted at the end of the XVIIth century with the rise of infinitesimal calculus, is still the major fact of modern science, including quantum theory, contrary to what has been imagined by people overimpressed by the discrete character of the energy levels of the hydrogen atom.

Not only linguistics, but more generally the various disciplines of the structuralist constellation, or, closer to us, the cognitive galaxy, did not at first follow the pattern of physics as regards the relation to continuum. The path of "formalization" was adopted, but in the process discrete mathematics was systematically given pre-eminence. If such has been the case, it is undoubtedly for good reasons, having to do with one's intuition of the object under examination. However, nowadays, various attempts at modelling or description show that this first option was perhaps not irreversible, and the present symposium is proof that this re-examination affects linguistics itself.
We say "linguistics itself" because language appears, on first examination, to show virtually impregnable discreteness, and a continuistic theorisation of it seems a priori to be highly improbable. Having said this, we are aware of the circumstances in which continuum succeeds nevertheless in affecting linguistics : the capital part here is played by cognitive problematics and by the fact that linguistic research is becoming increasingly involved in the more general framework of the "objective" study of cognition. We shall thus attempt to ascertain how cognitive problematics succeeds
in bringing continuum into the field of linguistics, distinguishing in this respect several different modes. The method followed is philosophical, i.e. we attempt in each case to provide a philosophical characterisation of the outlook within which cognitive theory calls upon continuum, and of the extent to which this mobilization concerns the specific level of linguistics.

In this study we shall thus first address three points, corresponding to the three possible types of legitimization for bringing the "grammar of continuum" of real analysis into cognitive areas, i.e., legitimization through the "perceived" and perception, legitimization through the dynamistic approach to cognitive activity, and legitimization by taking into account the "continuum of meaning".

But we shall also, secondly, approach the problem in the reverse direction. From linguistic description there emerges something like a "grammar of continuum" inhabiting the everyday use of language, which in some way competes with the logical-mathematical apparatus implicitly referred to throughout the above study of continuum. We shall thus attempt to analyse the relation between the two available levels on which continuum is elaborated, which occur in this case as two levels for the elaboration of spatiality. We shall then take advantage of the fact that the question of the relation between cognitive space and geometrical space has already been raised in the philosophical field, in connection with the interpretation of Kant's doctrine of the transcendental aesthetic.

1. Various legitimacies of continuum related to linguistics
1.1. The perception input
One fundamental property of cognitive activity is that it establishes relations between itself and the world. It is thus possible to introduce continuum into the field of linguistics by emphasising the dependence of language upon perception : perception (or at least the "perceived") has, to our mind, everything to do with continuum. This is exactly what Pylyshyn says when he refers to transduction : for him, a transduction function is a function which sends "certain classes of physical states of the environment into computationally relevant states of a device"1. But the specific difficulty of transduction (what causes transduction not to be a symbolic calculation) is, as Pylyshyn points out in no uncertain terms, the fact that the parameters of physical description are incommensurable with the symbolic register :
1) Pylyshyn (1984), p. 152.
"Physical devices respond to physical magnitudes (that is, the basic dimension of physics — force, time, length). The development of technological tools for mechanical or electrical tasks is intimately related to the development of physical theory. What makes speech seem highly variable is that we lack physical dimensions that correspond to phonetic similarity ; in other words, we lack a description, in physical terms, that will group sounds into perceptually similar classes."1
Thus formulated, the problem appears as linked to the nature of the physical parameters of force, time or length : it proceeds, more precisely, from the fact that these parameters are associated, in physical theorization, with mathematical domains into which their values must be inscribed, these domains all being constructed upon R, the mathematical model of the linear continuum. The transition from what is thus first referred to a collection of real numbers (most often interpreted as a point within a differentiable manifold) to symbolic information, via a mathematical criterion assimilating whatever has, symbolically, the same value, is the basis of the problem known as transduction, a type of processing which Pylyshyn illustrates with the example of the work of Marr2. This problem has its attractions, its difficulties and its language, but what we wish to emphasize here is that continuum, up to this point, is only mentioned because it generally appears in physical modelling ; there is no specifically linguistic reason to bring it to the fore. The idea is simply that every linguistic cognitive activity is prepared within transduction, and that transduction begins in a universe which is conventionally modelled by continuum.

Admittedly, there is a second level on which perception is relevant to language : since Pylyshyn first posed the problem of the input of the cognitive system in terms of transduction, others have been finding, concurrently, that the transition from the physical to the symbolic, and the symbolic operation itself, are essentially relative, blurred and variable. Let us quote in this connection the recent article of Chalmers-French-Hofstadter :

"Recently both Pylyshyn (1980) and Fodor (1983) have argued against the existence of top-down influences in perception, claiming that perceptual processes are "cognitively impenetrable" or "informationally encapsulated".
These arguments are highly controversial, but in any case they apply mostly to relatively low-level sensory perception. Few would dispute that at the higher, conceptual level of perception, top-down and contextual influences play a large role."3
1) Pylyshyn (1984), p. 150.
2) But Petitot's morphodynamical approach to the problem of recognition of external sensible forms, proposed in Petitot (1991), can also be quoted, likewise the study presented in Petitot (1985) of the problem of categorial perception.
3) Chalmers-French-Hofstadter (1991), p. 4.
These considerations trigger a debate, which necessarily has repercussions on the specifically linguistic level : the fact that language is the medium of recognition, as Chalmers-French-Hofstadter point out1, compels a certain commensurability between language operations and the transductive function : perception is completed in recognition. As Chalmers-French-Hofstadter develop the Kantian character of this discussion, it may be simple and enlightening to say that this problem, in the Kantian lexicon, would be that of schematism, whereas the first would be that of sensibility and pure intuition. Whatever the case, once this has been said, the fact remains that the continuum which thus really concerns the linguistic area, on the level of its schematizing function, is the continuum of physics, the continuum imputed by physics to the spatio-temporal being of nature.

Our assimilation of this problem to that of schematization, however, is basically erroneous, if not misleading. The reading of cognitive literature shows that there are not one but two problems :
- The problem of what one could call the transductive preparation of acts of categorization or recognition carried out in and by language : the mathematical strategy and the network implementation studied by Petitot in Petitot (1991) seem to come under this heading to the exact extent that the "cognitive archetypes", here, are the unconsulted termini ad quem of the functioning of the network.
- The problem of schematism in a more precise sense of the term, which is basically the problem of the degree to which language spontaneously transposes itself into the continuous register of the "perceived" : this would involve, for example, the problem of understanding how language can prescribe the continuous meaning (if there is one) of Langacker's diagrams.

The trans of trans-duction and trans-position, in each case, does not head in the same direction.
The first problematic is that of the physical preparation of the symbolic (in Pylyshyn's terms), the second is that of the schematizing power of language. Only the second problem is specifically linguistic : it implies an attempt to understand, at the very level on which language and thought are linked, a sketch of the continuum in which referents are immersed.

We feel that this problem of linguistic schematization can only be approached within the framework of the study of a priori spatialization in language : it is
1) "High-level perception begins at that level of processing where concepts begin to play an important role. Processes of high-level perception may be subdivided again into a spectrum from the concrete to the abstract. At the most concrete end of the spectrum, we have object recognition, exemplified by the ability to recognize an apple on a table, or a farmer in a wheatfield." Chalmers-French-Hofstadter (1991).
coherent that the manner in which language turns to a pre-established spatiality of things should necessarily be considered on the basis of a reflection about the space which language, so to speak, produces itself. But this refers to the second part of this article.
1.2. Cognitive dynamism
Philosophy has long held that the discrepancy between the world and the mind is a discrepancy in terms of space and time. The soul, as we know, was already designated by Aristotle as the essential partner of time; for Kant, time is the form ascribed to inner sense; Hegel ultimately recognizes the concept as the same thing as time; for Husserl, the intimate flow of time is the basic medium of constitution upon which rests the whole of phenomenology as an unfolding of the field of consciousness. Such a tradition may lead us to think of the essence of thought in terms of time: in this perspective the characteristic of thought is to be a process, to be a certain rhythm within time.
But on the basis of this there naturally emerges a philosophical hypothesis on the relevance of continuum with regard to the cognitive domain: can one not maintain that time, in cognitive sciences, plays a similar part to that played jointly by space and time in classical mechanics? And that the common interpretation of time in terms of the unidimensional mathematical continuum R becomes an a priori imperative for cognitive research? If one wishes to express this theory in the language of transcendental criticism, from which it obviously draws its inspiration, one may say that time is the framework within which "cognitive phenomena" occur, and that the attempt to objectivize cognition in a scientific manner is consequently dependent upon an "aesthetic" temporal continuum, in the Kantian sense of the word. It then appears that such a hypothesis may be notably supported by taking into consideration recent efforts to reconstruct all of the cognitive sciences upon a dynamistic paradigm.
These efforts, as we know, take several forms. The most obvious, the most famous, is the development of what is known as "connexionism", or sometimes neo-connexionism. Often, the expounding of these models depends very closely upon the obvious neuro-biological inspiration which motivated them, at least in their beginnings: i.e. the likelihood of the hypothesis that thought emerges from "neural networks". It is, nevertheless, possible to look at things in a totally different way, and consider as essential not the fact that these models bring into focus the dynamical system associated with the reiterated and indefinite updating of activation values of neurons within a network, but simply the plain fact that they bring into focus a dynamical system. We even have every reason to do so if we
take into consideration René Thom's anticipation of connexionist ideas on modelling nearly twenty years ago, in terms in which reference to neurons did not play a great part: what led him to dynamical modelling was above all the idea he had of cognitive activity being essentially event-related and temporal.
If we follow the language of D. Amit in his treatise Modeling Brain Function1, it is not difficult to realize the importance of time in the presentation he gives of ANN (attractor neural networks). To begin with, the principle of representing contents of thought by means of attractors within a dynamical system proves to be a temporalizing principle in two respects: firstly, thought appears as a process, and secondly, the "grasping" of a content, in the sense of the putting into effect of its meaning, is interpreted as the arrival of the dynamical system at the attractor:
"The arrival of a trajectory, initialized by a given stimulus, at the attractor is the realization of retrieval and at the same time it is the assignment of meaning, ..."2
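The retrieval picture in this quotation can be made concrete with a toy example. The following is a minimal sketch, assuming a Hopfield-style network with ±1 units and Hebbian connection strengths, which is one standard textbook instance of an attractor network and is not claimed to reproduce Amit's own models. The function names `train_hebbian` and `run_async`, the two stored patterns, and the choice to update units one at a time in random order are all illustrative assumptions.

```python
import numpy as np


def train_hebbian(patterns):
    """Build symmetric connection strengths from +/-1 patterns (Hebb rule)."""
    p = np.asarray(patterns, dtype=float)
    n_units = p.shape[1]
    w = p.T @ p / n_units
    np.fill_diagonal(w, 0.0)  # no self-connections
    return w


def run_async(w, state, rng, max_sweeps=50):
    """Update one unit at a time, in random order, on a discrete clock.

    Stops as soon as a full sweep changes nothing, i.e. when the
    trajectory has arrived at a fixed point of the dynamics.
    """
    s = np.array(state, dtype=int)
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(len(s)):
            new_value = 1 if w[i] @ s >= 0 else -1
            if new_value != s[i]:
                s[i] = new_value
                changed = True
        if not changed:
            break
    return s


# Two hypothetical stored "contents" (orthogonal +/-1 patterns).
patterns = np.array([
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
])
w = train_hebbian(patterns)

# Stimulus: the first pattern with one unit corrupted.
stimulus = patterns[0].copy()
stimulus[0] *= -1

rng = np.random.default_rng(0)
retrieved = run_async(w, stimulus, rng)
print(np.array_equal(retrieved, patterns[0]))
```

Here the stimulus is the first stored pattern with one unit flipped. The trajectory ends on the stored pattern, which then no longer changes under further updates, and it is this stability that qualifies the pattern as an attractor in the sketch.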
This great interpretative option concerning the essence of "representations" takes shape in the temporal register: for representation to be really actualized in the system, the dynamic must remain at the attractor for a sufficiently long period of time, whereas the time of arrival is a comparatively short period. On the basis of such considerations, Amit distinguishes recognition from memory as such (the former is the immediate arrival of the system at an attractor activated by the stimulus and which is "in memory", the latter corresponds to the case of a "process by which a detailed item of information, specific to the particular attractor which has been reached, is propagated in the wider system to generate a response based on the specific detailed memory."3).
The selection of an attractor, moreover, is required to depend solely on the connexion strengths of the network, and not on the temporal mode of updating neural activations: two extreme possibilities are taken into consideration in this respect, one of them biologically unlikely but mathematically convenient, according to which they are all updated together, during successive discrete stages of time, the second, more realistic, according to which these updatings are asynchronous: for each stage of the same discrete time prescribed by the biological clock, there is updating of a neuron taken at random. Thus the concept of stability introduced has everything to do with the temporal character of cognitive activity.
Finally, in the fifth chapter of the treatise, Amit recognizes that the model such as he has considered it up to this point equates thought to memory, and,
1) Amit (1989).
2) Amit (1989), p. 84.
3) Amit (1989), p. 85.
further, to singular events of memorization: one must inevitably, he says in substance, attempt to model thought as a sequential process, i.e. as a process of processes, a chain of events taking place according to a long time-unit, being in themselves progressions, punctuated by a shorter unit. Amit goes on to show that a system can be made to pass from one attractor to another, a finite number of times, on condition that one adds to the matrix of connexion strengths an adequate asymmetrical term, which will in fact be partly calculated on the basis of the activations of the system τ units of time before (the dynamical system is such that the state of the system at instant t+1 is governed not only by its state at instant t but also by its state at instant t-τ): in such conditions the transition from one attractor to the next in a finite list of attractors will not take place from one instant (determined by the neural system's microscopic clock) to the next, but will occur once every τ units of time, giving a slowness which allows the system to really actualize the representations associated with the attractors.
Thus it is possible to a certain extent to reapprehend dynamical modellings as governed by a "rhythmic" idea of thought: the fact of thought is linked to a qualitative differentiation of time, and to the emergence of several time-scales. The question now arises as to whether these remarks nevertheless substantiate the more precise theory, that of an "aesthetic-transcendental" function of time, expounded above. We see a certain number of reasons to reply in the negative, which we shall develop in the following series of remarks:
- Modelling by means of attractors, by principle, does not only involve the time dimension: there would be no attractor if the dynamical system under consideration did not have a substrate space.
This substrate space is supplied to connexionism practically ready-made by the biological analogy (it is the space ascribed to the states of the neuronal network); it is postulated by Thom's point of view as a "manifold of the internal dynamic" (and it is therefore a continuous hypercube, "of Hilbert", [0, 1]). The coherence of modelling thought as a rhythmic modulation with several time-scales implies an actualization space for this modulation.
- This first remark relates the problem of the temporal consideration of thought-cognition both to old and to recent discussions. On the one hand we know that Kant regarded a mathematical approach to psychology as impossible precisely because of the unidimensionality of time to which it would necessarily be confined: he found it impossible to conceive that the objectivizing categorial concepts that this science would need should be given an empirical sense on the "stage of time", for the simple reason that there is no such stage, as time is always absent, the meeting of time with itself being always denied by the passage of time. In some respects dynamical modelling recognizes the difficulty just as it overcomes it, since it "adds" to time an actualization space which allows the conceptual and schematic-mathematical identification of a notion such as that of an attractor. On the other hand, we know that some modern specialists, advocating extreme dynamicism, recommend abandoning any idea of storage as a possible support for the thought process (we refer to the ideas of Rosenfield, expounded by Clancey1): does their view (one fails to see what sort of modelling it could lead to) also challenge the idea of an actualization space, or are actualization and storage the names of two completely distinct functions? We feel that the case in point holds interesting matter for reflection.
- Whatever the case, it must be specified that the authors of connexionist modelling themselves did not, in general, argue their appeal to continuum in terms of reference to time: this is the point emphasised in Salanskis (1992).
- A major difficulty ultimately remains: even such as we present it here, cognitive science still does not appear to us as a temporal mechanics of cognitive phenomena. Cognitive science is not mechanical because its object, thought, is not passively a synthesis of the cognitive phenomena to which it refers; time is not simply its framework, it is its substance; thought is the very animation of that which it is construed to be the synthesis of2. Here we once more come across the old philosophical dilemma concerning the soul and time: the soul is not only in time, it makes time. J.T. Desanti has provided a very profound explanation, in his book on Husserl, of the extent to which the latter had increased this difficulty in his meticulous analysis of the constitution of time, which precedes all constitution3: this is tantamount to saying that the aporia of the duplicity of time is perpetuated up to our own times, and in the works of the same author, in whom, precisely, one can discern a forerunner of cognitive research.
1) Cf. Clancey (1991).
2) In Kahn (1991), it is shown how Freud, on the basis of a cognitive view ahead of his time regarding the ψυχή, eventually reaches such a conception.
3) Cf. Desanti (1976), pp. 63-97.
But as far as this paper is concerned, the main problem is that of the way in which the entry of the time factor in cognitive theory affects linguistics proper: does the dynamical theory of thought-cognition introduce continuum into linguistics, and, if so, how?
It can first be observed that the time factor in dynamical models does not in an obvious way engender the constraint of continuum: the models, for the greater part, make use of a discrete time-lapse; Amit even discusses the discreteness of time in his model in the light of biological facts (he comes to affirm that the idea of a basic cycle in the cognitive system is biologically plausible, while that of the synchronous functioning of the neural network is not). Here, we have an avatar of the profoundly non-transcendental character of the recourse which connexionism has to continuum. The fact remains that Thom and Grossberg can still be said to call upon continuous time (a necessary condition for having true differential equations and true dynamical systems in the classical sense of the term).
Of course, one could try to appreciate the impact of the dynamistic approach on the linguistic level by looking closely at the linguistic applications of connexionism, of the "harmonic grammar"1 type. It seems simpler to characterize this impact from a greater distance, observing that the continuum in question comes into play mainly in describing our way of having language rather than in describing meaning. For example, Smolensky points out that connexionism is better able to explain the fact that a child learning the forms of conjugated verbs will begin by knowing some irregular verbs, then, having acquired the standard rule of suffixation with -ed, will erroneously apply this pattern to verbs which he already knew, prior to reaching normal competence2. However much this view may legitimize connexionism, and however interesting it may be in itself, it is not, properly speaking, an elucidation of linguistic meaning.
If perceptive continuum has been seen to remain outside the symbolic field of language for essential reasons, as a continuum of that which confronts language, of what language refers to, dynamical continuum remains in a sense exterior to the same degree, to the extent that it affects the inside of our possession of meaning or the way in which it is mobilized or actualized, and not the actual level on which meaning becomes manifest, the linguistic level.
However, as in the first case, one must allow for the possibility of a purely linguistic analysis of those phenomena which, in linguistic performance, where meaning becomes linguistically manifest, would occur as a counterpart to the event described by the dynamical theory: in this case it would be the occurrence of a second level of linguistic schematism, which would no longer be the schematism governing the way linguistic expressions project configurations into external continuum, but that which governs the way they encompass and achieve temporal figures (which would in all probability be connected with those which preside over the cognitive activity leading to the manifestation of these expressions).
1) Smolensky-Legendre-Miyata (1990). 2) Smolensky (1988), p. 14.
Some time ago, we heard Jean Petitot glossing the meaning of the sophisticated French sémème hainamoration1 in terms of cusp geometry and conflict between the actants haine and amour. This gloss, we think, was actually intended to reveal a temporal schematism of this sort, and it illustrates at first glance the contention we have just suggested. Does there not remain, however, a strictly linguistic level for the meaning of hainamoration in respect of which such a gloss would not be necessary? And does meaning not, in a sense, remain aloof from continuum even in this very instance where continuum seems to suit it so well?
Such clarifications and the formulation of such questions bring into focus one area as being the possible scene for an essential alliance between continuum and linguistics: that of the manifestation of meaning. We shall now proceed briefly to this point.
1.3. A continuum of meaning?
It seems possible to put forward the radical hypothesis that meaning may be a dimension in itself, being on this assumption absolutely distinct from the dimensions of space or time for example, and which should be recognized as being by rights open to a continuous diversity of degrees or instantiations. The main argument in favour of this theory would be the taking into consideration of the numerous ways in which language manages to modulate a meaning, with such diversity of means, some of which lend themselves to reiteration, that one is tempted to conclude that all meaning can be referred to a continuous variation. We have heard Bernard Victorri uphold a position of this type2.
The problem raised by a theory such as this seems to us essentially that of the primitivity of the scales of meaning.
Ronald Langacker, for example, also holds that any "predication" refers to a "domain" within which it outlines a "profile", this operation being staged in a topological-geometrical mode, which to us suggests continuum, rather than in the discrete mode in customary use in linguistics. However, he concedes from the outset that the spatial "domain" remains prototypical3, in such a way that it seems that any "continuous" character of the numerous dimensions of meaning4 would still be inherited from spatial continuum.
1) Which refers, roughly speaking, to that state where love (amour) and hate (haine) are mixed up together.
2) In a paper presented in June 1990 at the symposium Continu dans les sciences cognitives (The Continuum in Cognitive Sciences) organized by P.Y. Raccah and P. Bourgine.
3) Cf. Langacker (1987), p. 147: "Physical motion in the spatial domain is regarded as a special (though prototypical) manifestation of more abstract conceptions with great linguistic significance.", and Langacker (1991), pp. 13-19.
4) But certain dimensions can be discrete: for example the qualitative axis of colours.
In his 1990 paper, Bernard Victorri was confronted with a similar problem as he put forward a capability located on the level of linguistic "competence" for discerning elementary differences in meaning: the question arises, concerning each example one could take, as to whether the difference is not always thought of firstly in a connoted spatial register.
It seems to us that in order to reflect in a satisfactory manner on the theory of a continuum of meaning, it is first necessary to clarify the nature of space in terms of what is phenomenologically proper to it, and the relation of this proper nature to cognitive space on the one hand and mathematical space on the other: we shall come to this in the second part of this article.
However, there is another way to discredit the finite image of meaning: in a literary rather than a phenomenological perspective. In this case one will not claim directly that meaning is always framed within a continuum, but more simply that semantic complexity and the wealth of language resources surpass without a shadow of doubt any finite measure. This theory itself seems to us to have some privileged modes of entry to the field of linguistics:
- Firstly, on the strictly "intra-linguistic" level of meaning, one can put forward the necessity to appeal to the encyclopaedic totality of knowledge to account for the least occurrence of meaning. This is to a certain extent Langacker's position in Langacker (1987)1; it is also Rastier's in Rastier (1987)2: on this point, it seems that "cognitive semantics" and "interpretive semantics" naturally converge, both refusing the finitist generativism emblematically expressed by Chomsky, which was the tendentious orientation of the logico-structuralist period.
- Secondly, one can invoke the infinite "pragmatical" variability of meaning; in this case the strictly linguistic level will only be affected with a character of infinitude if the pragmatical register is reintegrated into the linguistic. Now this is exactly the theory expounded by both Langacker and Rastier in their respective 1987 treatises: for Langacker, each discourse situation is understood as giving rise to the "sanction" of a singular conceptualisation (featuring the whole "pragmatical" import) by a "unit", i.e. a conventional conceptualisation3. And this sanctioning relationship, under the more general name of a categorisation
1) Cf. Langacker (1987), p. 63.
2) "Bref, nous préférons éviter ce genre de distinction, en rappelant que n'importe quelle connaissance lexicale encyclopédique peut être l'interprétant d'une relation sémique". "In short, we prefer to avoid this kind of distinction, recalling that any encyclopaedic lexical knowledge can be the interpreting agent of a semic relation." (our translation) Rastier (1987), p. 251.
3) Cf. Langacker (1987), pp. 65-73; the "pragmatic" aspects of sentence meaning and their continuity with "semantic" ones are expounded in Langacker (1991), pp. 494-498.
relationship, is nothing other than the basic relationship constituting the "network" of cognitive grammar. For Rastier, the notion of the "sème afférent" reintegrates into the sphere of semantics dimensions of communication generally considered as coming under pragmatics1. Both of these authors underline the fact that such a point of view allows semantic innovation to be integrated into the field of linguistics, and taking a challenge like this into consideration seems to us an essential step towards the infinitizing view on language.
Having said this, one could continue the above discussion with a remark of a general philosophical nature: whether the reference be to encyclopaedia or to the "entour pragmatique" (to use Rastier's terms), the infinitizing view on language basically works upon the opening of the situation, taking the term this time in its phenomenological sense. The important point is not so much the quantity of encyclopaedia actually accumulated, nor the listed plurality of enunciation circumstances, but the fact that these "sets" are essentially open, and that language is profoundly relative to a situation in which, in each instance, its boundary shifts, no finite horizon being able to contain in advance the scope of the shift. In other words, the encyclopaedia is the situation as belonging to the past, as being a sedimentation which is beyond us (labyrinthine); the here and now circumstance is the situation as the future which takes hold of us (presents itself). The problem which arises is this: what type of discursive approach is appropriate for this kind of opening and infinitization originating in the situation?
In principle, it is not difficult to suggest an answer, provided one is aware of what can be drawn from philosophy: the opening-up of meaning through situation is the theme, the challenge, the leading concern of hermeneutics.
Hermeneutics is that method or that attitude which tends to replace scientific method as soon as one focuses on the fact of understanding meaning rather than explaining it (Dilthey); but to understand is nothing other than to revert to the situation, to its opening, and to its excess over the finite reduction (Heidegger, Gadamer).
François Rastier, at the end of his Sémantique interprétative, quite naturally meets the level of hermeneutics: the art of drawing the map of the "isotopies", the linguistic level having been generalized and enlarged, as the book intends from the outset, seems likely to be nothing but a modality or a new assumption of hermeneutics. But the difficulty is that, until proof of the contrary, hermeneutics is an alternative to science, and Rastier desires nonetheless, at least in his 1987 writings, a theorization of linguistic phenomena other than a verstehen à la Dilthey-Heidegger.
As far as we know, Ronald Langacker does not see this difficulty. It is tempting to say that he simply thinks that there is a descriptive science
1) Cf. Rastier (1987), pp. 42-55, especially p. 55.
concerning everything he brings into play in cognitive grammar (he says so in so many words in connection, for example, with the description of our cognitive faculties, reiterating in a positivist mood the conviction which was Husserl's at the beginning of the century1). We nevertheless have one indication at least that the problem arises within the framework of "cognitive grammar", this indication having all the more value in our eyes as it is at the same time a manifestation of the depth and originality of Langacker's work: the "revolutionary" methodological principle of the non-productivity of rule-schemas, the principle by which universal forms are stored as units of cognitive grammar together with the particular forms which substantiate them2, is not without relation to hermeneutics: does the simple fact that the universal thus ceases to be the constraining path towards the particular not give to the particular, or even to the singular, a role in the theory such that the latter is at least reflective in the Kantian sense, and perhaps even hermeneutical? For one of the crucial properties of hermeneutics is that the universal which is ceaselessly declared is always declared as being the universal of its singular (of what is given in the "situation").
We shall conclude with a simple and general remark about the relation which this problem of the infinitization of meaning may entertain with the one we first took into consideration, that of the relevance of a "continuum of meaning". There is a well-known technical connection between problems of infinity and those of continuum, regarded as mathematical problems of the foundational type: all modern syntheses of continuum borrow from a concept of infinity the decisive resource for elaborating what we might call the continuous effect. It would thus be tempting to see here a relation, if not a possible passage from one question to the other.
We rather wish, however, to emphasise the difference between the two problems, and the reasons for which we do not expect the technical link between continuum and infinity ever to afford something like a positive passage from the elucidation of the hermeneutical character of the usage of linguistic meaning to the convincing qualification of an intrinsic continuum of meaning. We feel that the problem is situated precisely at the following point: the usage of meaning is "rendered" infinitary by the opening of the situation, which is essentially singular; conversely, the idea of a "continuum of meaning", such as has been expounded above, cannot but be the idea by which the meaning effect is originally
1) "I believe that mental experience is real, that it is susceptible to empirical investigation and principled description, and that it constitutes the natural subject matter of semantics." Langacker (1987) p. 99. 2) Cf. Langacker (1987), pp. 45-47.
submitted to a multiple manifesting a modality of continuum. Furthermore, continuum of meaning should be actual (to receive all the modulations), while hermeneutical infinity of meaning is essentially potential. There is thus every reason to separate the two problems.
2. Cognitive space and transcendental space
As we explained at the beginning of this article, we shall now, so to speak, enter into the question of the relation between continuum and the cognitive domain in general and the linguistic level in particular in the opposite direction to that which has been followed up to this point: instead of studying by what right continuum, via the cognitive approach to language, comes to concern linguistics, we shall now inquire into what cognitive information natural language brings concerning continuum. This question, in fact, can be narrowed down to a question of the space which is given with language, this spatiality revealed and presupposed by our use of natural language. But it is impossible to treat this question without confronting cognitive space given by language with geometrical space.
This problem of the comparison between pre-comprehension of space and the "scientific" comprehension of it is not, to my mind, adequately dealt with if one fails to recognize the difference of principle between what we call cognitive spatiality, i.e. pre-comprehension of space as psychologically attested, and what one could call transcendental spatiality, which we shall describe as pre-comprehension of a different type. To a certain extent, the elaboration of this distinction is to be the focus of the reflections below, which means that we must admit it is not self-evident. But, on the principle of what is called in good philosophical terminology a hermeneutical circle, we stand no chance of grasping this distinction unless we first commit ourselves to an initial understanding of it.
2.1. The position of the problem
What we call cognitive spatiality can in principle be identified with hard-wired ("cabled") spatiality, to which a priori no truth value can be ascribed: this is factual space comprehension, which has to be brought to light by empirical study (establishing psychological facts, possibly involving neuro-physiological, ethnological or sociological factors).
This spatiality is, so to speak, an unformulated conception of space and spatial relations between things, with which the human subject is factually equipped, and from which he must emancipate himself in order to produce a scientific theory of space having a guiding value for a general investigation of natural being.
Carnap, at the beginning of the century (although he was not alone: it appears that Helmholtz was the father of this attitude, which was prevalent at the time in German circles1), judged that the transcendental space described by Kant in the Critique of Pure Reason was a cognitive space. In Carnap (1924), he argued against Kant, attempting to show that in an honest psychological analysis of man the principles of Euclidean geometry did not come to light.
This interpretation of the meaning of Kantian transcendental aesthetics however does not seem defensible to us: Neo-Kantians from the Marburg school, at the same period, upheld a more plausible interpretation, affirming that Kant's space and time, "a priori forms of sensibility", were to be understood as the very factors which presided over scientific theorization of nature; but they inferred, we believe erroneously, that these a priori forms were not intuitions at all but pure constructions of thought in the essentially active sense that this word has for Kant2. To which Heidegger replied in an endeavour to defend the intuitive character of Kantian intuition, but with total disregard for the link between this intuition and science: for him, pure intuition of space is no longer that by which, through the intermediary of mathematical geometry, science is ruled3.
We have taken up this old discussion in an attempt to reach a conception of the Kantian message which we feel to be the only one compatible at once with the philosophical character of Kant's writings, and with his obvious intentions (such as bringing to light the metaphysical apparatus which made Newton's physics possible).
Our conclusion, basically, is that what is called an a priori form of space, in Kant's work, is the content of a thought experience which is inevitably content from the moment that we wish to take in an external manifold that presents itself to us : the constraint according to which the manifold must be spatialized is metaphysical, we take knowledge of it by our attempt to elaborate for ourselves a representation of what such a thing as the presentation of an external manifold means to us. Kant's idea is that in this effort to represent to ourselves a priori the external manifold, we actually go a little further than meeting space itself as a frame, we anticipate a structure of that space ; we start to reflect upon the structure of space, and tend to resolve it into terms of some kind of geometry. There is experience, because everything is brought to light in an attempt to elaborate an a priori representation which is situated (it is that of someone who takes part in the adventure of philosophy, mathematics and physics ; it is invalid unless we lend it our personal reflective energy), but it is a thought experience, nothing of what is established comes to be so through other means than the
1) Cf. Chevalley (1991), pp. 422-442, esp. pp. 433-35. 2) Cf. Natorp (1921), Cohen (1917). 3) Cf. Heidegger (1928,1929).
decisional-responsible mode of thought in its active sense. The transcendental fact is that natural science relies incessantly upon the results of this millennial thought experience, which thus begins in metaphysics, continues into mathematics and reaches completion in physics1.
If then we understand the difference there is between the idea of cognitive space and that of transcendental space in the way we have just expounded, the problem which immediately becomes central is to situate linguistic space in relation to these two spaces: if it is true, as recent research by the Californian school upholds, that there is pre-comprehension of space in language (and if, furthermore, this pre-comprehension is fundamental to every semantic system), is linguistic pre-comprehension the reflection of psychologically hard-wired cognitive space, or is it the starting point of transcendental pre-comprehension, which is Kant's pure intuition?
2.2. The cognitive point of view: Talmy and Poincaré
To provide an answer, let us look at what Talmy says. In Talmy (1983), he analyses in detail the conditions of preposition usage in English. He brings to light the fact that localization through language generally calls upon a reference object (and often two such objects), with regard to which the localization is accomplished. He expounds the theory that each preposition brings a schema in terms of which the spatiality of scenes is constructed in language. The application of these schemas is not without "idealization" (when I utter "from Mars", I idealize the planet Mars as a point to which the schema of from can be applied) nor without "abstraction" (in the same example, I make abstraction of all that is irrelevant to my idealization: the matter of which the planet is composed, its departures from perfect sphericity, etc.)2.
This type of observation, combined with a few others, converges fairly naturally towards the idea that if there is a manner of geometry rooted in language, it would be more of a general topology than a metrical geometry : « This sort of further abstraction is characteristic of the spatial relations defined within the mathematical field of topology. It is metric spaces, such as classical Euclidean geometry, that observe distinctions of shape, size, angle and distance. Distinctions of this sort are mostly indicated in language by full lexical elements — square, straight, equal, plus the numerals. But at the fine structural level of conceptual organization, language shows greater affinity with topology. (One might further postulate that it was this level — and its counterparts in other cognitive systems — that gave rise to intuitions from which the field of topology was developed). »³
1) Cf. Salanskis (1991, 1994a, 1994b). 2) Abstraction and idealization were already basic stages in the transition to the geometrical in Husserl's work, cf. Husserl (1936), pp. 209-212. 3) Talmy (1983), p. 262.
It may seem strange indeed that the pre-comprehension geometry brought to light by Talmy should be in a position to claim greater scientific value than the geometry associated for all eternity with Kantian transcendental aesthetics : general topology has over Euclidean geometry the advantage of modernity and prestige¹. It is disconcerting, moreover, to note that the theory of a topological pre-comprehension was already put forward by Poincaré, based on a point of view which in this case owed nothing to language analysis. In Poincaré (1912), the author affirms that genuine intuitive geometry is the analysis situs, because it states what is valid in spite of the imperfection of our material representations of the figures : « It has often been said that geometry is the art of reasoning well about badly-drawn figures. (...) But what is a badly-drawn figure ? it is what can be produced by the clumsy artist we mentioned above ; he alters the proportions to smaller or greater approximation ; his straight lines show alarming zigzags ; his circles describe ugly bumps ; all this is of no consequence, it will in no way disconcert the geometrician, and will not stop him from reasoning correctly. But the inexperienced artist must not represent a closed curve by an open one, three lines intersecting at the same point by three lines having no common point, a surface with a hole by a surface without one. His figure would then be useless and reasoning would be impossible. (...) This very simple observation shows us the real role of geometric intuition ; it is to favour this intuition that the geometrician needs to draw figures, or at least to represent them mentally. Now, if he makes light of the metric or projective properties of these figures, if he dwells exclusively on their purely qualitative properties, it is because this is the only area where geometric intuition really enters into play. »²
1) In the conclusion of Talmy (1985), by contrast, Talmy stresses, more conventionally, the poverty of pre-comprehension compared to what modern physics teaches about reality (cf. Talmy [1985], pp. 37-41).
2) Poincaré (1912), pp. 134-135 (our translation). "On a dit souvent que la géométrie est l'art de bien raisonner sur des figures mal faites. (...) Mais qu'est-ce qu'une figure mal faite ? c'est celle que peut exécuter le dessinateur maladroit dont nous parlions tout à l'heure ; il altère les proportions plus ou moins grossièrement ; ses lignes droites ont des zigzags inquiétants ; ses cercles présentent des bosses disgracieuses ; tout cela ne fait rien, cela ne troublera nullement le géomètre, cela ne l'empêchera pas de bien raisonner. Mais il ne faut pas que l'artiste inexpérimenté représente une courbe fermée par une courbe ouverte, trois lignes qui se coupent en un même point par trois lignes qui n'auraient aucun point commun, une surface trouée par une surface sans trou. Alors on ne pourrait plus se servir de sa figure et le raisonnement deviendrait impossible. (...) Cette observation très simple nous montre le véritable rôle de l'intuition géométrique ; c'est pour favoriser cette intuition que le géomètre a besoin de dessiner des figures, ou tout au moins de se les représenter mentalement. Or, s'il fait bon marché des propriétés métriques ou projectives de ces figures, s'il s'attache seulement à leurs propriétés purement qualitatives, c'est que c'est là seulement que l'intuition géométrique intervient véritablement."
In the ensuing part of this famous article, as we know, Poincaré discusses the tridimensionality of space as its fundamental property in the genuinely intuitive perspective of the analysis situs, and formulates a conception of the origin of this tridimensionality which is highly cognitive in modern terms, since everything can be reduced to the examination of external changes (deduced from evidence provided by sensory chains) which we have learnt to correct by internal change (a motor act). Poincaré's reasoning has some common points with Talmy's : it refers to the anthropological fact that metrical relations are indifferent. But Talmy's anthropological fact is located in the natural usage of language, Poincaré's in the geometrical habitus. A first argumentative reaction towards what Talmy and Poincaré jointly put forward would be to emphasize the essential difference that exists between the identification of the analysis situs as forgetting determinations (metrical, projective, and to a certain point morphological) and its foundation as mathematical discourse. The specific theme of the analysis situs as mathematical discourse is, ultimately, the study of topological spaces, of continuous mappings and of the behaviour of topological properties under the effect of these. Now, as we know, the definition of a topological space involves an actual (possibly infinite) underlying set of points, and a privileged family of subsets of this set (i.e. an object which, in the perspective of a theory of types, is of the type ((0)), if 0 is the type of the individuals of the underlying set). This amounts to saying that when Talmy suggests that the level of spatial pre-comprehension in language which he has brought to light "gave rise to intuitions from which the field of topology was developed", his assertion can only be accepted if one notes at the same time the difference in perspective introduced by topology.
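The set-theoretic definition just recalled (an underlying set of points together with a privileged family of subsets) can be made concrete on a finite example. What follows is only a minimal sketch in Python, under the assumption that the underlying set is finite, so that closure under pairwise unions and intersections is enough; the names `is_topology` and `is_neighbourhood`, and the sample set `X` with its family `T`, are illustrative and do not come from the text.

```python
from itertools import combinations


def is_topology(points, opens):
    """Check the open-set axioms for a family `opens` of subsets of a finite set `points`."""
    points = frozenset(points)
    opens = {frozenset(o) for o in opens}
    # Every member of the family must be a subset of the underlying set.
    if any(not o.issubset(points) for o in opens):
        return False
    # The empty set and the whole space must both be open.
    if frozenset() not in opens or points not in opens:
        return False
    # For a finite family, closure under pairwise unions and
    # pairwise intersections implies closure under all of them.
    for a, b in combinations(opens, 2):
        if (a | b) not in opens or (a & b) not in opens:
            return False
    return True


def is_neighbourhood(v, x, opens):
    """V is a neighbourhood of x if V contains an open set of which x is an element."""
    v = frozenset(v)
    return any(x in o and frozenset(o).issubset(v) for o in opens)


# Hypothetical example: three points and a nested family of open sets.
X = {1, 2, 3}
T = [set(), {1}, {1, 2}, {1, 2, 3}]
```

With these definitions, `is_topology(X, T)` checks the family as a whole (an object of type ((0)) in the terms used above), whereas `is_neighbourhood(V, x, T)` relates a point to a set of points, that is, two individuals of different types.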
This difference in perspective consists, as we have said, primarily in the entering of speech into the infinitarian-typal frame of set theory. While the level explored by Talmy is only concerned with spatially composite extended objects and relations between these objects expressed by prepositions, and the "referential" function subsequently always assumed by a privileged object (rather than by something which is not an object but an actualized synthesis of all the ideal "punctualities" covered by the objects), general topology thinks in terms of points and sets, as we have said, and ultimately organizes the whole of its thought concerning proximity according to fundamental "proximity assessments" which are not relational, in the sense that they do not bring two configurations or two points into relation : the assessment is of the type "O is a proximity of frame X" (O is an open set of topological space X), or, in an alternative formulation also close to the secondary intuition of the topologist "V is a proximity for point x" (V is a neighbourhood of x, i.e. V contains an open set of which x is an element), which is a relational
assessment but between individuals of different types (0) and ((0)), and which in fact expresses the frame encompassing point x.
Although Poincaré, once again, does not address the same level of habitus as Talmy, the same observation can be made about what he claims : if the practical interest of the geometer in what remains of the spatial after abstraction of the metrical, and of the projective structure, can function as an indication towards an intuitive value of topology, it does not follow that topology can by any means be considered as given with its specific point of view with this indication¹. Did Talmy not recognize this from the outset when he simply said that the domain of topology may possibly have been developed on the basis of the pre-comprehension level he had brought to light ? The risk, here, seems to be that of elaborating a homogeneous and genetic conception of this development.
2.3. Confrontation between cognitive-linguistic space and transcendental/geometrical space
In fact, what we have briefly mentioned by way of situating the originality of topology as compared to "Talmian" spatialization can be re-stated and stressed if one distinguishes the level of language pre-comprehension from that of the mathematical-metaphysical question of spatiality, and if one clarifies the hermeneutical function which is operative within each level as well as from one level to another. The language pre-comprehension of space which Talmy, among others, brings to light, is directed to objects and their relationships ; it is unaware, as it seems, of the spatial frame as such, or of the point as the ultimate individuality of this frame. It is by nature semantic (in the logical-foundational sense) : in support of what he puts forward, and as a confirmation of the limits he detects to the validity of the uses, Talmy quotes examples of unacceptable or hardly acceptable utterances (preceded by the symbol *), like
I crawled in the window / *into the window²
testifying that in can signify the passage "through an opening in an enclosure's wall" whereas into cannot ; this kind of localization of boundaries is something quite different from the expounding of a set of axioms with which English prepositions would comply, being prospective bearers (in the implicit mode) of their meanings. Those approaches oriented towards obtaining axiomatics belong, on the contrary, to what cognitive linguistics turns away from ; moreover, the proponents of classical artificial intelligence are well aware that
1) It is a fact that, as far as we know, Poincaré, to whatever extent he was aware of it, was not in perfect agreement with this set-oriented topology as yet in limbo at the time of his work. 2) Talmy (1983), p. 240.
spatial relations resist coding through "meaning postulates"¹. The project of Talmy and of others working along similar lines, is very different : it is to grasp descriptively the semantic content of English prepositions with the presupposition of Euclidean geometry : their work tends towards what one might call technically an interpretation of the geometry of language pre-comprehension in bi- and tridimensional Euclidean geometry (this aspect is particularly obvious in a paper like that of Anette Herskovits²). What we have just said focuses the hermeneutical mode specific to theoreticians concerned with language pre-comprehension of space. But if they are to be believed, as both Talmy and Anette Herskovits for example explain fairly clearly, there is incidentally a hermeneutical aspect at work within this pre-comprehension itself : the choice of marking in language such a type of relation or configuration rather than such other, or, similarly, the choice of conceptualizing such a situation by such preposition rather than by such other, taking such or such a viewpoint, such or such scanning of given data, etc. is something which the speaker assumes each time in a singular manner. All the authors insist on the fact that there is in this case an unconstrained dimension, a treatment which is decided in the final analysis by the situation of the speaker³ (his involvements, interests, etc.). The strategy of interpreting ordinary spatial significations in the "neutral" referential frame of Euclidean geometry ultimately serves this purpose : to reveal the non-neutrality of language pre-comprehension "in situation". Geometry as a branch of mathematics has since its origins been subjected to the question of "What is space?", and has been striving, mathematically, to clarify the understanding it has always already had on the subject. This long experience testifies to the fact that geometrical pre-comprehension of space involves awareness of frame (space) and point.
We may say that since the beginning geometry
1) Cf. Johnson-Laird (1980), pp. 86-88. 2) Herskovits (1987). 3) Talmy mentions "preselections" between alternative schematizations. Herskovits rejects the "computational" view whereby "given a description of a scene, or an environment, in terms of the shape and location of the objects it contains (the canonical description), one could generate appropriate locative constructions, using more or less complex spatial relations as meanings of the prepositions."— notably observing that objects are themselves constructions, and she concludes "In summary, language is thoroughly context-dependent and pervaded with vagueness"(Herskovits [1987] pp. 292-293). She moreover indicates that she is aware that she refers to the hermeneutical circle and to the situation as she quotes Winograd and Flores ("Another way to express the same thing is to say that every utterance takes its meaning from a background of assumptions and beliefs that cannot all be made explicit. Winograd and Flores [1985] express a similar idea" Herskovits [1987] p. 295).
has been working on the basis of the space-point pair, which exposes in a radical and demanding way the problem of locality. The question "What is space ?", the question of the frame, has for example already been raised in Aristotle's physics¹, and Euclidean construction, as we know, begins by setting out the fundamentality and abstraction of the point. Therefore objects and their relations are not primary concerns : geometry begins with the thought experience which empties space, the thought experience which is the same thing as the assumption of the question "What is space ?" such as Kant took it up in transcendental aesthetics². Moreover, for this geometry at grips with the space-point pair (even though it does not yet enter into a compositional, set-oriented view of space) infinity and continuum are problematic from the outset. Today, this interest in the point, in infinity and continuum gives rise to formulations in the set-oriented frame, and it appears that much of what concerns infinity and space continuity requires to be expressed at the "second order" level, and has recourse to a viewpoint in which the power set is posed together with the set itself. On the technical level, now, geometrical hermeneutics, having no external location on which to project what it anticipates, cannot have recourse to the semantical method like cognitive linguistics ; it cannot interpret what it focuses and thinks in a richer pre-existing language. This explains why in modern times the predominant hermeneutical path is the axiomatic one : the meaning which I anticipate may be expressed by noting the prescriptions governing the type of use which it profiles, or by specifying a list of axioms, in which the geometrical sense will be implicit. Only the syntactical regime, then, favours the development of geometrical hermeneutics.
The link between the two levels remains to be clarified. We see it as twofold :
- Firstly, it is clear that the "phenomenon" of pre-comprehension springs from language, and more specifically from the habitus of a subject whose situation is basically determined in terms of language (the human subject : we subscribe to the fundamental theses of hermeneutical anthropology expounded by Gadamer). Geometrical pre-comprehension of space, then, cannot be fundamentally alien to ordinary-language pre-comprehension : no doubt the basic primitive experience,
1) Physics IV 1-5, in Traduction Carteron, tome 1, pp. 123-135, Paris, 1926, Les Belles Lettres. 2) This is how we understand the famous statement «We can never represent to ourselves the absence of space, though we can quite well think it as empty of objects.» Kant (1781-87), Trad. N. Kemp Smith, Macmillan and co. Ltd.,1929. p. 68 ; "On ne peut jamais se représenter qu'il n'y ait pas d'espace, quoique l'on puisse bien penser qu'il n'y ait pas d'objets dans l'espace." Kant (1781-87), p. 56.
according to which our language "structures space" (and does so only in compliance with a decision of the scene in the hermeneutical situation of the Being-in-the-world) is at the bottom of geometrical perspective itself, in the sense that it is on the basis of this primitive experience that the thought experience of emptying space is initiated, thus raising the question "What is space ?" In other words, this question, although not thematically addressed, already queries the habitus which configures the scene of being, each time : the non-prescribed character of the organization of the scene points to the unvarying background upon which the various possible configurations of the objects are projected. This is why it is hardly surprising that all the elaborations of geometry, in one way or another, reintroduce the primitive situation in relation to the ideal correlative universe to which they address the geometrician : figures, and later, open sets, or at least compact neighbourhoods, take over from ordinary objects, and the basic situation is relived in relation to the ideal world, which, among other things, was built to receive it. Topology can be conceived as the poorest theoretical frame within which the primitive situation can be restituted, within which the habitus structuring the scene of the objects can be retrieved, in a similar way to what happens in ordinary experience, i.e. without the intervention of Euclidean localization. This first aspect of the link is a relation of subordination and reiteration : geometrical experience is necessarily subordinated to ordinary experience, it necessarily draws from it the problematizing faculty which is nonetheless its own distinctive feature ; subsequently this first experience never ceases to be available at the stage of geometrical hermeneutics.
- It seems to us, having said this, that there is a second link, which would appertain to methodical simulation.
The very fact that there is linguistic precomprehension of space shows a remarkable power of language : language provides intuition. From the moment that this has been tested, how then can one fail to understand the dominant modality of modern geometrical hermeneutics, that of axiomatics as a repetition of the original situation where language gives, in a primitive way, a view of the scene of being ? As one exhaustively articulates a new language, expressly instituted as a support for the expression of the supplement of spatial meaning which one has in view, and by marking out by the laying down of rules an artificial habitus (that of the idiom of set-theoretical geometrical language, for example that of topological idiom), one relies once more on that remarkable power of language from which stems the very mystery of space. We expect from the use towards which we have applied ourselves, new intuition, new familiarity, and a new ability to raise questions about that by which, in the element of the first use, we felt ourselves solicited. Even if, as we were saying above, and as we have learnt from cognitive sciences, there is no original axiomatization to account for spatial pre-comprehension (no doubt because it lacks
a place to circumscribe itself), and if then we can only understand the geometrical meaning of ordinary linguistic usage by interpreting into "superior" languages of geometry, it remains established that the global fact of language is that of a tremendous originating implicitation, invested in relational profusion : modern hermeneutical strategy can thus be understood as an effort to recommence and repeat the implicitation of meaning through the simulation involved in the play with formal languages and the specification of axioms.
3. Conclusion : the dispute of continuum
Does our effort to provide a minute philosophical analysis of the various ways continuum becomes relevant for linguistic study on the one hand, and of the possible contribution of a linguistic anthropology of pre-comprehension to the question of the intuition of spatial continuum on the other hand, hold any interest for the debate which is now open in the domain of cognitive sciences in general, and in the linguistic domain in particular, on the appropriateness of the recourse to continuous models ? It is tempting to answer in the affirmative, but we must not expect, from philosophical clarification, anything other than what it can give : far from bringing a means of settling the debate, our considerations have no other aim than to clarify what enters into the alternative, and especially, to combat any illusion whereby it would be possible to draw a temporary moral based on a criterion of simple efficiency, and to pacify oneself as to the substance, with the blessing of a rational "wait and see". The question of continuous modelling, in our view, does not solicit the same answer according to whether one considers, on the one hand, the problematic of perceptive continuum and cognitive dynamism, or, on the other, that of the "continuum of meaning" (or related problematics). As regards perceptive continuum and cognitive dynamism, expounded in sections 1.1. and 1.2. of the article, we feel it is simple to say that the import of continuum does not cause any problem, that it obeys, approximately, the "ontological" logic of physics, the reference science. There is however a distinction to be introduced in the case of the mobilization of continuum in view of theorizing cognitive dynamism, as we have seen, both because time is probably not simply a frame here, and because models also bring in an actualization space whose value is quite different, but we feel at all events that the atmosphere is sufficiently close to that of physics for the actors of this research, whose desire was primarily to share the scientific situation and the mathematical powers of physicists, to refuse to abandon the continuistic point of view in years to come.
Only, in both these cases, we are not sure whether continuous cognitive modelling has repercussions for linguistics. It seems that the cognitive approach can only distinguish itself in the form of a kind of proto-linguistics of transduction or of the thought event, and that all the relevance of a problematic of spatial and temporal schematism of language, working in the "other direction", is subordinated to a purely semantic study of the temporalization and spatialization brought by language, a study which would owe nothing to continuous modelling, towards which it was supposed to head. The really contentious issue, we feel, is the possibility of a "continuum of meaning", mentioned in section 1.3. On this subject, we would like to make a few remarks, with a view to eliminating possible misunderstandings. We argued that it did not seem possible to base continuum of meaning as an intrinsic continuum, having its pure intuition, to which would be correlated a "semiometry"¹. But we are well aware that continuistic models for certain dimensions of meaning do exist, and far from "condemning" them, we totally "support", for example, geometrical theorization of the actantial structuration of the sentence by Wildgen and Petitot² (following Thom), or the catastrophist modelling of polysemy by Victorri and the ELSAP laboratory³. Such modellings exist and are welcome. There is no law stating that continuum may only be mobilized in scientific research as it was by Newtonian mechanics. And it is quite appropriate to note that the models of physics themselves have integrated both into the configuration space and into the phase space more and more elements whose phenomenologico-intuitive content was highly problematic, or have on the contrary withdrawn elements, with no regard for the representative comfort of the subject⁴. So let us applaud these models, whose audacity causes us to reflect, and surely makes us wiser and more perspicacious. Still, we feel it may be useful to distinguish, as we have attempted here, between what has "phenomenological" or "intuitive" legitimacy, what can be associated with a mathematical thought experience of presentation, and what cannot legitimately be submitted to such a connection. We see two reasons for this :
1) And what would play the part of the revolution of non-Euclidean geometries ? Dada=treatise of non-Aristotelian semiometry ? 2) Wildgen (1982), Petitot (1992). 3) Victorri (1988). 4) The major points of this adventure of aesthetic increase and decrease are expounded in Salanskis (1994a).
- This type of critical feedback may provide an opportunity to assess what has been put into the model, the choices one has made, giving predominance to such and such a parameter, when laying down the configuration spaces. If we do use continuum in what we may call a "trans-aesthetic" way, it is surely better to be aware of it. Along the same lines, it seems useful not to forget that the point of view of the "discrete", in the linguistic-semantic concern, draws its legitimacy from genuine "pure intuition", according to which meaning is only tested in the sentence, and can thus only be apprehended in terms of the tested structure of the sentence, which is discrete. Structuralism, in its assertion of the dependence of meaning on the network of semantic relations, and in its methodological appeal to competence to proceed to the description of the structures of language, in fact relied on this intuition of the sphere of meaning as being self-enclosed. The "discrete-based" approach to linguistic facts has intuition on its side, unlike what occurs in physics, and it would surely be clumsy and regrettable to ignore this.
- But at the same time, we should be aware of the two-fold value which can result from the introduction of continuum in a field hitherto possessed by discrete intuition : a value of objectivation, or on the contrary a value of subjectivation. At the outset, it may appear that continuous modelling may have no other aim than to project meaning phenomena into an exteriority comparable to that of external matter, in order to master and predict it in similar fashion. But if we take a closer look at what is being done, and the real advantages which modellers draw from their work, we note that continuum intervenes much like a representative adjuvant towards complexity.
The principal value of continuous semantic models is to allow us to "see" the extreme complexity that is at stake in the structure of the sense effect, which cannot be apprehended and stated other than through the paraphrase (this is a particular form of the hermeneutical circle). As in Bernard Victorri's model, then, the multi-parameter dependence is "simplified" in the potential functions visualizing meaning polarities and in the double bi-dimensional projection which allows facts to be displayed on a sheet of paper. Perhaps continuum in semantics appears under the figure of its extreme "subjectivity", which is not the slightest of its powers, and gives it a motive to intervene against all odds, so to speak, and in spite of the presentative mode of what is focused. What comes to light for philosophical consciousness, in this hypothesis, would be homologous to what we have seen elsewhere in the domain of non-standard analysis, where the mathematics of continuum learns to be understood as an approximate means of approach to the hyperfinitarian excess of the discrete¹.
1) We refer to the "finitarian" conception of continuum developed by the Reeb school, particularly Jacques Harthong in his arithmetic model of continuum (cf. Harthong [1983, 1987, 1989]).
REFERENCES
Amit, D.J. 1989. Modeling Brain Function, Cambridge : University Press.
Carnap, R. 1924. Dreidimensionalität des Raumes und Kausalität. In Annalen der Philosophie.
Chalmers, D.J. ; R.M. French ; D.R. Hofstadter. 1991. High-Level Perception, Representation, and Analogy : A Critique of Artificial Intelligence Methodology. Center for Research on Concepts and Cognition, Indiana University, Technical Report 49.
Chevalley, C. 1991. Niels Bohr, Physique et connaissance humaine, édition commentée, Paris : Folio.
Clancey, W.J. 1991. Israel Rosenfield, The Invention of Memory : A New View of the Brain, Artificial Intelligence 50, pp. 241-284.
Cohen, H. 1917. Kommentar zu Immanuel Kants Kritik der reinen Vernunft, Leipzig : Felix Meiner.
Desanti, J.T. 1976. Introduction à la phénoménologie, Paris : Idées/Gallimard.
Harthong, J. 1983. Eléments pour une théorie du continu. In Astérisque 109-110, pp. 235-244.
Harthong, J. 1987. Le continu et l'ordinateur. In L'ouvert 46, pp. 13-27.
Harthong, J. 1989. Une théorie du continu. In La mathématique non standard, Barreau-Harthong éditeurs, Paris : Editions du CNRS, pp. 307-329.
Heidegger, M. 1928. Interprétation phénoménologique de la "Critique de la raison pure" de Kant, Paris : Gallimard 1982.
Heidegger, M. 1929. Kant et le problème de la métaphysique, Paris : Gallimard 1953.
Herskovits, A. 1987. Spatial Expressions and the Plasticity of Meaning, preprint, Wellesley College.
Husserl, E. 1913. Idées directrices pour une phénoménologie, Paris : Gallimard 1950.
Husserl, E. 1936. L'Origine de la Géométrie, Paris : PUF 1962.
Kahn, L. 1991. La petite maison de l'âme, Entretiens de Vaucresson, preprint.
Kant, E. 1787. Critique de la Raison pure, Paris : PUF 1971.
Langacker, R. 1987. Foundations of Cognitive Grammar : Theoretical Prerequisites, Stanford : Stanford University Press.
Langacker, R. 1991. Foundations of Cognitive Grammar : Descriptive Application, Stanford : Stanford University Press.
Merleau-Ponty, M. 1945. Phénoménologie de la perception, Paris : Editions Gallimard.
Natorp, P. 1921. Die Logischen Grundlagen der Exakten Wissenschaften, Sändig Reprint Schaan 1981.
Petitot, J. 1985. Morphogenèse du sens, Paris : PUF 1985.
Petitot, J. 1991. Syntaxe topologique et grammaire cognitive, Langages, 97-127.
Poincaré, H. 1912. Pourquoi l'espace a trois dimensions. In Revue de Métaphysique et de Morale, 20e année, n° 4, pp. 483-504.
Pylyshyn, Z. 1984. Computation and Cognition, Cambridge, Massachusetts, London, England : MIT Press.
Rastier, F. 1987. Sémantique interprétative, Paris : PUF.
Salanskis, J.-M. 1991. L'herméneutique formelle, Paris : Editions du CNRS.
Salanskis, J.-M. 1992. Modes du continu dans les sciences. In Intellectica n° 13-14, pp. 45-78.
Salanskis, J.-M. 1994a. L'autonomie des mathématiques. In Epistémologie et philosophie, A. Sinaceur éd., ouvrage collectif à paraître.
Salanskis, J.-M. 1994b. La mathématique de la nature et le problème transcendantal de la présentation, à paraître dans la revue Archives.
Salanskis, J.-M. 1994c. L'intuition dans la lecture heideggerienne de Kant. In Le destin de la philosophie transcendantale, F. Gil, J. Petitot et H. Wisman (eds.), à paraître aux éditions Patiño.
Smolensky, P. 1988. On the proper treatment of connectionism, preprint. In The Behavioral and Brain Sciences 11, pp. 1-23.
Smolensky, P. ; G. Legendre ; Y. Miyata. 1990. Harmonic Grammar - A formal multi-level connectionist theory of linguistic well-formedness : Theoretical foundations, ICS Technical Report, n° 90-5.
Talmy, L. 1983. How Language Structures Space. In H. Pick and L. Acredolo (eds), Spatial Orientation : Theory, Research and Application, Plenum Press.
Talmy, L. 1985. Force Dynamics in Language and Thought. In Parasession on Causatives and Agentivity, Chicago Linguistic Society (21st Regional Meeting), University of Chicago.
Victorri, B. 1988. Modéliser la polysémie. T.A. Informations 29, Paris, pp. 21-42.
REFLECTIONS ON HANSJAKOB SEILER'S CONTINUUM
RENÉ THOM
IHES, Bures-sur-Yvette, France
1. The opposition Continuous-Discrete : Generalities
The opposition Continuous - Discrete is a great aporia that plays a structural role in our perception of reality. We will recall its essential aspects.

Ontologically the continuum is anterior to the discrete. Let us call "substance" any entity capable of carrying an accident or a predicate. By definition a predicate is posterior to the predicated substance. Now a continuous entity can admit discrete accidents (a broken line for example) whereas a discrete entity can allow of no continuous accident without itself becoming (at least locally) continuous. This statement may appear paradoxical if we start from the "scientific" definition of the continuous, that of the real axis obtained by completing the gaps in the set Q of rational numbers (according to the famous definition given by Dedekind in his opuscule, Was sind und was sollen die Zahlen). We should bear in mind that continuity in its pure state is the perception of the passage of time in our consciousness. We are conscious of time even if, and especially if, nothing happens ; as Kant said, time is the a priori condition of all experience. We would not be able to define the whole number n, nor the addition of 1 to n that gives n+1, if we did not have, in our consciousness, the temporal memory of the entity (n). In the definition of any "operation", there is an underlying temporality which a logicist attitude tends to forget : the composition g ∘ f of two operations takes on meaning only through consecution in time : operation f followed by g.

But the continuum is the poorest form of existence. There is nothing to be said about the continuum in its pure state, the state where all points (or should we say, places ?) are equivalent. Only the discrete accidents it carries can be matter for discourse. We note, however, that two pure continua, with no visible accident anywhere, can have different "shapes". Thus, in dimension one, the circle differs from the straight line. The straight line is simply-connected : any loop¹ in such a line can be continuously retracted into its base point whereas the basic loop of the circle (all the way round) cannot be. The object of algebraic topology is to recognize whether two continua are homeomorphic, in other words, if they have the same shape. This kind of problem, luckily, will not concern us here.

2. The Discrete in the continuous
The minimal example of a discrete accident in a continuum is the point on a straight line. In his Physica (and he frequently returns to it in his Metaphysica), Aristotle performs a surprising thought experiment : The point O on the axis x'Ox divides the axis into two half-axes. If the point O has only a potential existence, the axis is not divided. But if O exists in actuality, it gives rise by division to two points : O1 to the left and O2 to the right. These points, terminating the half-axes x'O1, O2x, are end-points. Each constitutes a boundary of its respective half-axis ; although they are distinct, they are together. The axis x'x is then broken into these two new axes. The figure so obtained is no longer connected : we cannot join continuously a point y of x'O1 to a point z of O2x. The act separates [Met. Z 13, 1039a, 6-7].

Now the concept of separation is a fundamental criterion of individuation. Aristotle defines the most general entity, "substance" (οὐσία), as "this something which has separate being". One is reminded of the definition I proposed in [ES] of a salient form : a salient form is a closed set in three-dimensional Euclidean space with a clear boundary and a non-empty interior. Saliency is a sufficient criterion of form individuation. In particular, the living beings perceptible to us are three-dimensional balls limited by a membrane, their skin. The boundary of an animal is thus a surface separating the interior from the exterior, as the point O on the x axis separates the negative half-axis (x<0) from the positive half-axis (x>0).

The splitting of an axis into two half-axes seems a priori to be a unique phenomenon, typically irreversible, like breaking a stick in two. But catastrophe theory makes it possible to immerse such a process in a continuous family. It is the Cusp singularity that achieves this transition between one and two. Let us recall what we essentially need to know.
1) A loop in a space E is a continuous path whose origin and extremity coincide at a point called its base point. A space X is said to be simply-connected if any loop therein can be continuously deformed into its base point.
Take a dynamical system whose configuration space is the axis Ox, dependent on two external parameters u and v. We suppose that the representative point of the state of the system moves under the effect of a fourth degree potential in x : $V(x,u,v) = x^4/4 + ux^2/2 + vx$ ; the set of equilibrium points of the system is a surface (F) (the cusp) defined by the equation $\partial V/\partial x = x^3 + ux + v = 0$. The surface (F), when projected onto the plane Ouv, admits "critical" points (points with a vertical tangent plane) which constitute a curve (C) obtained by associating with the equation of (F) the equation $\partial^2 V/\partial x^2 = 3x^2 + u = 0$. From this we draw the parametric equations of the projection curve (P) of (C) in Ouv :

$$u = -3x^2, \qquad v = -(x^3 + ux) = 2x^3.$$

[Figure 1 : La fronce (the pleat)]

This curve (P) is a semi-cubic parabola of equation $4u^3 + 27v^2 = 0$. It separates the bimodality zone (the points (u,v) where the equation of (F) has three real roots) from the exterior zone where this equation has only one real root. The counterimage of the symmetry axis v = 0, when (F) is projected onto Ouv, is the section of the surface (F) by the plane v = 0, a section which has for equations $x^3 + ux = 0$, $v = 0$. In the plane Oux, it decomposes into the parabola of equation $u = -x^2$ and the u axis, x = 0. The complete figure of the intersection of (F) with the symmetry plane v = 0 is then that of the fork (Figure 2). Since $\partial^2 V/\partial x^2 = 3x^2 + u$ reduces to u on the axis x = 0, the half-axis u > 0 consists of minima of V, and so of stable equilibria, whilst the half-axis u < 0 consists of maxima of V, and so of unstable equilibria ; the two branches of the parabola, which lie over u < 0, are stable. When moving on the u axis and trying to lift the corresponding point on the fork, we get the two situations : transition 1 → 2 (dichotomy) for u decreasing, and transition 2 → 1, the confluence of two branches into one, for u increasing (Figure 3).
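The cusp geometry recalled above can be checked numerically. The following is only a minimal sketch under stated assumptions: the function names (`bifurcation_value`, `count_equilibria`, `fork`, `settle`) are hypothetical, and the explicit Euler descent in `settle` is an illustrative choice of gradient dynamics, not something specified in the text. Stability is read from the sign of $\partial^2 V/\partial x^2 = 3x^2 + u$.

```python
import math


def bifurcation_value(u, v):
    """Left-hand side of 4u^3 + 27v^2 = 0, the equation of the curve (P)."""
    return 4 * u**3 + 27 * v**2


def count_equilibria(u, v):
    """Number of distinct real roots of x^3 + u*x + v = 0 above the point (u, v).

    The equality test is exact for integer inputs only; with floats a
    tolerance would be needed (not handled in this sketch).
    """
    d = bifurcation_value(u, v)
    if d < 0:
        return 3  # bimodality zone: three real roots
    if d > 0:
        return 1  # exterior zone: one real root
    # On the curve (P): a triple root at the origin, otherwise a double root plus a simple one.
    return 1 if (u == 0 and v == 0) else 2


def fork(u):
    """Equilibria in the symmetry plane v = 0, with stability from the sign of 3x^2 + u."""
    if u > 0:
        return [(0.0, "stable")]
    if u == 0:
        return [(0.0, "degenerate")]
    r = math.sqrt(-u)
    return [(-r, "stable"), (0.0, "unstable"), (r, "stable")]


def settle(x0, u, v=0.0, rate=0.1, steps=2000):
    """Follow x' = -dV/dx from x0 with an explicit Euler scheme (assumed dynamics)."""
    x = x0
    for _ in range(steps):
        x -= rate * (x**3 + u * x + v)
    return x


if __name__ == "__main__":
    # The parametrization u = -3x^2, v = 2x^3 lands on (P): here x = 2.
    print(bifurcation_value(-12, 16))
    # One equilibrium for u > 0, three for u < 0 on the axis v = 0.
    print(count_equilibria(1, 0), count_equilibria(-3, 0))
    # For u = -1 an arbitrarily small push selects one of the two branches.
    print(settle(1e-6, -1.0), settle(-1e-6, -1.0), settle(0.0, -1.0))
```

In this sketch a state started exactly at x = 0 for u < 0 never leaves the unstable equilibrium, whereas any small perturbation sends it to one of the two stable branches, which is one way of picturing the dichotomy 1 → 2.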
[Figure 3]

These considerations show how the directional symbolism of the arrow is not a cultural particularity but, on the contrary, the expression of an essential difference between the two directions of the axis Ou. In the direction of u increasing, the lifted point x(u) on the fork is always continuously and univocally defined across the singularity. In the direction of u decreasing, on the other hand, the lifting of x(u) cannot be smooth : once at the forking point u = 0, we must, for u = -k, k small and positive, choose between the two branches $x = \pm\sqrt{k}$. And this is difficult in a deterministic universe where God does not play at dice (there has to be a symmetry breaking). The external variable u then plays the role of a bifurcation parameter.

These reflections could be summarized in the following way. The transition 1 → 2 is entropically difficult, whereas the transition 2 → 1 is entropically easy. We shall see that linguistic activity, the aim of which is to share with someone else a piece of information one has, goes in the direction of decreasing an entropy, bringing in a "neg-entropy" (and this with none of the statistical considerations that are the classic foundation of the notion of entropy). The intrinsic complexity of linguistic messages is an expression of the need to guide the listener at each bifurcation of the graph where understanding categorizes reality.

3. The notion of genus and the couple Genus-Species
By genus I mean here what the Greeks called Genos, in opposition to Eidos or species. By some strange historical phenomenon, these notions, which formed the skeleton of Aristotelian and later of Scholastic thought, which remained familiar to all thinkers up to the time of Leibniz and Malebranche, have practically disappeared from modern thinking. I have no doubt but that present-day cognitivists will shortly rediscover them (with a new terminology, of course !)

It is Aristotle who gives us the strictest definition of genus. Two universals (A) and (B) belong to the same genus if, whatever the entity X, the two statements "X is A" and "X is B" cannot be true at the same time. In other words, A and B are opposites. Aristotle's basic intuition is that the relation of opposition can only exist between concepts with a strong semantic affinity. They are opposites because they share the same semantic space, where they occupy separate regions. In this kind of vision, a criterion defining genus is the possibility of deforming any concept within the genus into another of the same genus, without there being at any moment an impression of "quitting" the genus. Thus, in the space of colour impressions, a red patch can be deformed continuously into a blue patch (through violet) whilst the spatial form of the patch is respected. A topological criterion of this sort for defining genus would be very satisfying. Unfortunately it does not appear to be wide enough to cover all the uses of the word. Thus, Aristotle, in defining the term Genos [Met, q], offers the example of a people descending from a common historical ancestor. Whence the idea of substituting a common origin (as etymology would suggest) for continuity of deformation. This common origin will be visualized as a space of intelligible matter, a lump of modelling clay whose own evolution will push it through a sieve. From each hole in the sieve a filament of the same material will come forth, each filament constituting a species (Eidos). This decomposition of genus into species is the central tool of Aristotelian Logic, the motor of syllogism.

It is important then to think about who pushes the undifferentiated material of the genus through the sieve that breaks it down into species. And where does the sieve come from ? Two types of model are considered. The first, of the Logical type, is the metaphor of the blade that cuts the plastic material of the genus. Sometimes this border can be explicitly determined : it is the Diaphora, the specific difference that exists on one side of the blade and not on the other. The second is a dynamical one (in a Heraclitean-catastrophist style) : we may imagine the sieve as resulting from conflict between preexisting species which would have shared out the space of the genus along frontiers subsequently stabilized, the positions of which would have been established by usage, pragmatically. In this way it was possible to suggest that the main prototypical colours appeared in the space of colours because of the biological importance of a vital material (red = blood, white = milk, green = vegetation, blue = the sky, etc.). In the beginning there would have been an undifferentiated genus of the sensory field, but differences in usage could have favoured, through neuronal "Bahnung", certain regions which became "attractors", the prototypical colours. In this kind of competition between regions, the borderlines between species may be hazy. In other cases, like the perception of phonemes in phonology, frontiers are perceived very clearly, even when the sounds on one side of the frontier are physically very close to those on the other : this is the process of categorical perception in phonology [J. Petitot, 1982], prelude to a discretization of the field that will lead to Jakobsonian structuralism.
In the simplified view which is that of Porphyry's tree, all concepts are supposed to issue by ramification from a universal tree (A). In order to determine a concept (c), it suffices to connect the branch (c) to the origin (a) by the (unique) path γ(c) joining (c) to the apex (a) of (A). Following this path from (c) towards (a), the traveller would note at every bifurcation which branch the path came from and thus we would have a coding for every concept. Linguistic communication would simply require that these coded words be transmitted in the free monoid generated by all these bifurcation markers. Assuming that Porphyry's tree is the same for every speaker, with the same coding at each bifurcation, linguistic communication would be consistently faithful. This ideal conception stumbles against the fact that the ramifications of the tree are not determined in a unique manner. Studies of prototypicality in the extension of a concept have shown that, very often, the organization of a certain sensory field (colours for instance) includes local attractors that have no linguistic being (experiments carried out by Eleanor Rosch [1973] with a primitive New Guinea tribe). Concepts, in a certain sense, may have a preverbal existence. Other studies on the extension of an ordinary concept (neither too abstract nor too concrete), like the concept "bird" for instance, bring out the fundamental phenomenon of prototypicality. When one tries to attribute to a species of bird the measure of its normality (large for the sparrow, middling for the duck, practically nil for the ostrich), one realizes that comparison implies the use of "genera", which I propose to call "directive genera", such as the bidimensional genus of the usual habitat (earth, water, air) and unidimensional genera such as those of the opposites (wild-tame, edible-inedible, diurnal-nocturnal). In each genus, certain classes are marked with respect to a neutral class, the prototypical, and the atypical character of a bird is globally evaluated by "adding" all these marks together. (Of course the relative weight attributed to the different directive genera depends on the individual preferences of the speaker : we are not talking about a quantitative theory of prototypicality). In the example given we can see that the genera used for evaluation are strongly anthropocentric in character.

This is where H. Seiler's vision of continuity comes in. In the well-known syntagma of epithets, in German "Diese erwähnten zehn schönen roten hölzernen Kugeln", we are shown a linguistically imposed order of qualitative "genera" : localization in space (by Deixis), localization in context, number, subjective appreciation, colour, matter of the thing alluded to. Even if few languages present "genera" in such a canonical order as German, the continuous process from Deixis to predication nevertheless still shows a universal character, local permutations between genera remaining possible from one language to another.
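The ordering of qualitative "genera" in the German syntagma can be pictured as a sort along a ranked scale. The sketch below is purely illustrative : the genus labels, the `LEXICON` mapping and the function `order_epithets` are hypothetical names introduced here, and the assignment of each epithet to a genus simply follows the enumeration given in the text (Deixis, context, number, subjective appreciation, colour, matter). It is not a model of German grammar.

```python
# Order of the qualitative "genera" from Deixis towards Predication,
# as enumerated in the text for the German example.
GENUS_ORDER = ["deixis", "context", "number", "appreciation", "colour", "matter"]

# Hypothetical mini-lexicon: each epithet of the example is assigned to one genus.
LEXICON = {
    "diese": "deixis",
    "erwähnten": "context",
    "zehn": "number",
    "schönen": "appreciation",
    "roten": "colour",
    "hölzernen": "matter",
}


def order_epithets(epithets, head):
    """Sort epithets by the rank of their genus and append the head noun."""
    rank = {genus: position for position, genus in enumerate(GENUS_ORDER)}
    ordered = sorted(epithets, key=lambda word: rank[LEXICON[word.lower()]])
    return " ".join(ordered + [head])


if __name__ == "__main__":
    shuffled = ["roten", "zehn", "hölzernen", "Diese", "schönen", "erwähnten"]
    print(order_epithets(shuffled, "Kugeln"))
```

A language allowing local permutations between genera could be sketched, under the same assumptions, by swapping neighbouring entries of `GENUS_ORDER` while leaving the overall direction from Deixis to predication unchanged.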
From a universalist viewpoint, one wonders then whether the direction of Seiler's continuum, from Deixis to Predication, is a universal, or whether it might not be reversed in some languages. Later on we will show there is good reason to believe that it is indeed a universal direction, for it is opposed to "thermodynamical" irreversibility which favours the transition Species → Genus over the transition Genus → Species. All communication of information requires an effort on the part of the speaker. In the general schema : Speaker → Message → Receiver, there is an emissive dichotomy on the speaker's part, whereas there is confluence on the listener's side.

4. Archeology of Seiler's continuum
I do not know whether present-day linguists still consider it an unpardonable sin to bring up the problem of the origin of language. If such is the case, I must confess, not being a linguist, that I think it necessary to brave this taboo. The origin of human language forcefully raises the question : was the transition from systems of communication used by Prehominidae smooth or discontinuous ? I hope I will be forgiven for devoting a few pages to this question, just a particular case of the more general one haunting evolutionary theorists : was evolution gradual and continuous, à la (neo-)Darwinian, or did it include "catastrophic" gaps (à la Cuvier-Schindewolf) ? To answer this question we need formal modes of presenting biological organization, the only way to avoid endless rhetorical confrontation. I take the liberty of referring readers to my Aristotle-inspired theorization of biological organization [Thom 1990].

We start with the idea that the organism, as a structure in space, can be decomposed into "homoeomerous" parts of various dimensions (0,1,2,3). A homoeomerous part is a part X of the organism such that any two of its points x, y have phenomenologically equivalent neighbourhoods. Thus two organisms whose homoeomerous parts can be seen to correspond biunivocally, two by two, in a global isotopy connecting the first organism to the second, are said to belong to the same genus. This is the equivalence Aristotle calls kath'elleipsin kai hyperochèn (by excess and by deficiency). It was properly understood only by D'Arcy Thompson, who illustrated it with the famous diagrams of homeomorphic fishes in his classical treatise "On Growth and Form".

A definition like this of biological organization dictates the answer to our question : since, in the course of ages, evolution has modified the topological structure of division into homoeomerous parts (the plans of biological organization have been qualitatively modified), there must necessarily have been qualitative discontinuities in evolution. Quantitatively, however, these discontinuities may have been minimal : a new cellular differentiation appearing (or an old one disappearing) just in one little clone, it is fair enough to suppose that change is quantitatively continuous. There is reason to believe that sound communication systems in Man's ancestors could be presented in a similar way.
Sounds are necessarily created and destroyed, but we are at liberty to think that these changes may have been relatively slow, through the spread of the users of new modes of utterance and the extinction of the users of short-lived ones.

We now have to explain why the structure Genus → Species must have intervened in the animal kingdom. We observe that a genus (as a space) divided by the Diaphora into species has the same "stratified" structure as that of the homoeomerous parts of a biological organism (though it is simpler, needless to say). Where does this structure come from in nervous activity ? From a finalist standpoint the answer is simple. In the case of predation, for example, it is necessary first to recognize one's prey and then to catch it. Now vision of exterior objects is a process that is generally continuous, but from time to time the sight of a new object (prey or predator) can completely modify one's behaviour. These are objects of biological significance, called pregnant [ES]. Prey, predator, sexual partner, are typical pregnant objects.
Objects of prey and sexual partners attract, predators repel. Faced with a newly perceived object, one has first to recognize very quickly whether it emits a pregnance, then to proceed as soon as possible with a strategy of capture or flight. Now external objects are variably located, and constitute continuous stimuli. They must be very rapidly classified as attractive objects, repulsive objects, or belonging to the neutral class of indifferent objects : the typical classification of a genus into three species. But here we have the behaviourist schema of Stimulus - Response. It will be observed that the strategies of response, in their motor elaboration, themselves depend on continuous parameters.

A later "psychical" complexification resulted in the development of a double category (useful and useless objects), without forgetting the sacred object, obtained by identification at infinity of attractive and repulsive. Food is a useful attractive object. The useless attractive object gives us the work of art, at the source of the Beautiful. The useful repulsive set consists of objects, indifferent or weakly repulsive to start with, for which an instrumental use is found. Excrement is a useless repulsive object (note that for some primates, excrement can be used as a projectile to bombard the enemy). In a certain sense, through "repression of immediate reflexes", a whole series of classes of interesting objects is constituted, objects to manipulate (once bipedalism has permitted the use of hands). Then appear the tactile genera (opposition hard-soft), apprehension of the forms of rigid objects, and of their possible deformation (breakable, flexible objects). The phase states of matter are categorized in a genus (Aristotle's elements), and prescientific knowledge develops through exfoliation of the strata of initially indifferent objects. Even Mathematics emerges from insignificance !
This vision leads one to think that the "directive genera" according to which we class exterior objects range from some deep level related to biological urgency (opposition good-bad) towards "objective" properties. Let us not be misled : objectivity is subjectively measured by its resistance to our efforts (Maine de Biran) ; in order to know an object in depth, we have to attack it, even to the point of destruction (the whole doctrine of modern reductionism). This order of genera, reflected in the great Holzkugeln syntagma, recapitulates the prehistoric acquisition of knowledge. Just as the utterance of a (semantically autonomous) sentence can be seen as the genesis of a living being, so its formation recapitulates (in the sense of Haeckel's law) the entire phylogeny of the mind.

5. Seiler's continuum and the theory of C.S. Peirce
It is very rare for a semantically and linguistically complete phrase to present in itself the sequence foreseen in Seiler's vision of continuity. The case of Peirce's theory is different. To check C.S. Peirce's doctrine of ternarity, I started from a fairly commonplace context. When suspicious emanations came out of the kitchen which my wife had left, I told her so with the sentence "It smells like burning (ça sent le brûlé)", a sentence that can be analysed as follows. The indefinite deictic it represents primarity, the undifferentiated impact of a sensory stimulus on the mind ; smells represents secondarity, the type of sensory stimulus, and like burning represents the conceptual aspect attached thereto according to Peircean ternarity. In the same way, Peirce's ternary sequence (Qualisign, Sinsign, Legisign) reflects the sequence : Pregnance, Salience, Concept, as used in the [ES] terminology.

H. Seiler suggested that Iconicity be inserted between Deixis and Predication, something I found rather surprising at the time (there is not much iconicity in our languages, onomatopoeias are very rare exceptions). He explains himself in the akup publication No 73, Iconicity in a functional perspective. Speaking of the Saussurian arbitrariness of the sign, Seiler postulates that in so far as a symbol (in the Peircean sense of the ternarity icon, index, symbol) has a non-arbitrary form, there is an intrinsic resemblance between significans and significatum, and so a certain form of iconicity. We have here a very far-reaching postulate. For a mathematician, it is not because a form B in a space Y comes from another form A in a space X via an (algebraic) morphism F : X → Y that B is the same as A. Of course we have in mind phenomena like the plural marker in certain languages expressed by repetition of the singular. So we have to "topologize the notion of iconicity", by reducing it to a sort of term to term equivalence between "salient" groups. Moreover only this kind of conception allows us to define the addition of parameters in H. Seiler's abstract at this Meeting (Cognitive Continuity and Linguistic Continuity).
(As I understand it, a parameter is obtained by adding a linguistic marker to the expression ; addition of the parameters allows the passage from implicit to explicit, hence grammaticalization.) With the notion of Technique and the study of the participation dimension, the problem of transition from semantic (I will not venture to say "cognitive") to linguistic expression is raised in its full amplitude. Here again there is an encounter between two concepts. One, the participant, plays the role of an implicit EGO, and the other, the participated, (p), is in fact what I call in [ES] a pregnance, investing EGO regarded as a salient object. The generic modes of evolution of the link between EGO and (p) have now to be described. These modes of investment of EGO by (p), defining at each moment t the instantaneous link between EGO and (p), are ordered according to the more or less endogenous / exogenous ratio of (p). Genetically (and here we come back to Peirce) a stimulus that is initially exogenous and not well characterized internally gives rise to an endogenous reaction (r) whose conflict with (p) leads to a second reaction p'(r) which evolves towards a linguistically explicit situation. Seiler takes this progression from implicit to explicit as the foundation of his continuum of Techniques. It is perhaps not impossible, by means of a finer analysis of interaction dynamics, to obtain a more precise description of these modes of Subject-Object interaction, so effacing the somewhat rhapsodical impression given by his enumeration of Techniques.

Finally we recognize that H. Seiler's continuum is the only known way forward in the elucidation of psychical mechanisms permitting us to classify real objects linguistically. I believe this brings in the problem of schematism, famous since Kant : how to form a representative image of a concept (the prototypical referent, as we say today). In his Philosophy of Grammar, O. Jespersen uses as an epigraph Jean-Jacques Rousseau's maxim : "It takes a great deal of philosophy to perceive once what we see around us all the time." From this point of view, H. Seiler certainly deserves, beside the title of great linguist that his work on the Cologne UNITYP project obviously merits, the title of great philosopher.
REFERENCES
Petitot, J. 1982. Paradigme catastrophiste et perception catégorielle, Centre d'Analyse et de Mathématique Sociale, Paris : Ecole des Hautes Etudes en Sciences Sociales.
Rosch, E. 1973. Natural Categories, Cognitive Psychology 4, 328-350, Academic Press.
Thom, R. 1988 [ES]. Esquisse d'une Sémiophysique, Paris : Interéditions.
Thom, R. 1990. Homéomères et Anhoméomères en théorie biologique d'Aristote à aujourd'hui. In Biologie, Logique et Métaphysique chez Aristote, Séminaire CNRS - NSF 1987, Paris : Editions du CNRS.
ATTRACTOR SYNTAX : MORPHODYNAMICS AND COGNITIVE GRAMMAR
JEAN PETITOT
EHESS, CREA, France

Introduction

The 1987 debate between Jerry Fodor and Zenon Pylyshyn, on one side, and Paul Smolensky, on the other¹, has raised a critical challenge : do connectionist networks have the capacity to model adequately linguistic structures which, classically, are modeled as symbolic ones ? The main difficulty is to model grammatical relations, semantic roles (in the sense of case grammars), constituency and compositionality in a dynamical way. In a nutshell, it can be formulated in the following manner : if terms of sentences are modeled by attractors of some underlying dynamics, what is the dynamical status of a "syntax" relating these attractors ? What can an attractor syntax be ?

The problem is difficult for at least two reasons.

a. Weak CN vs Strong CN (CN = Connectionist)

To construct a syntactic system, we need at least to distinguish :
i) between two syntactic (categorial) types : terms and relations, and
ii) between two types of relations : static and dynamic.
But, if we represent terms by activity patterns (e.g. attractors), how can we represent these two differences ? It is clear that syntactic relations between attractors cannot be reduced to mere linear superpositions. We call weak CN a CN which models semantic entities of different syntactic types by attractors of the same type (category mistake). To work out a response to Fodor and Pylyshyn's challenge we need to strengthen weak CN with a strong CN which has the capacity to model different grammatical categories by mathematical entities of different types.

b. Elementary vs non elementary CN syntax

One could think that it is trivial to elaborate a strong CN. One would have only :
i) to represent activity patterns (attractors) coding the terms by units of some higher level layer (what is called in the CN literature a "localisation"), and
ii) to represent the relations by connections between these units.
We call such a solution an elementary one. It does not work for the very reasons stressed by Fodor and Pylyshyn : it projects neuronal implementation into the functional architecture and admits "the implicit, and unwarranted, assumption that there ought to be similarity of structure among the different levels of organization of a computational system". Static and dynamic relations between terms must be modeled by dynamical relations between attractors, and these relations are of a completely different nature to the underlying connections they are implemented in. We call non-elementary a model which does justice to this principle.

Main problem. Can an "attractor syntax" be worked out in the framework of a strong non-elementary CN ?

This problem is essentially a mathematical one. We have shown elsewhere that it can be solved using morphodynamical models. We focus here only on some particular points.

1) See Smolensky (1988) and Fodor, Pylyshyn (1988).

1. The problem of syntactic constituency as a challenge for connectionist modeling
1.1. The main arguments of Fodor and Pylyshyn1 The conception of syntax which is the least symbolic and the closest to the CN sensibility is that of case grammars. But one must nevertheless give a good CN account of the semantic roles which select cases. This is the main problem : we must model in a CN framework what European linguistics and semiotics call actantial relations. Now, as was stressed by Fodor and Pylyshyn, "the role rela tions (...) traditionally get coded by constituent-structure". More precisely, "when representations express concepts that belong to the same proposition, they are not merely simultaneously active, but also in construction with each other". And "representations that are "in construction" form parts of a geometrical whole, where the geometrical relations are themselves semantically significant". The main problem is therefore to build up what we shall call a configurational definition of case roles. Of course, for the CL paradigm, the problem of a configurational definition of actantial relations is easily solved by means of the use of formal and combina torial symbolic structures. But this does not entail that every such configurational
1) For more details, see Petitot (1991b), (1993).
ATTRACTOR SYNTAX
169
definition must be of a symbolic nature. More precisely, the core of the argument of Fodor and Pylyshyn is the following. Let Ai (i=l,...,n) be the actants of a pro cess linguistically expressed by a verb V. Suppose that the Ai are modeled in a CN way as activity patterns (attractors for instance) ai of some underlying CN dy namics X. How must we model V ? In elementary CN models, V is also modeled as an activity pattern (an attractor) of X, and the structural relations between V and the A i are modeled by the linear superposition V + Sai. But it is impossible to reach a configurational definition of the actantial roles using only an additive operation. Structures such as syntagmatic-trees are non-commutative and nonassociative. Therefore, they cannot be modeled by algebraic structures which are of commutative group type. But this does not entail that it is impossible to build up a CN configurational model of constituent-structures. It only entails that it is necessary to elaborate a CN theory of actantial interactions, that is of these "geometrical wholes, where the geometrical relations are themselves semantically significant". This discussion leads therefore to the following main question. Main question. If the actants Ai of a process are modeled by attractors ai of a dynamical system, is it possible, within the framework of the mathematical theory of dynamical systems, to elaborate a theory of actantial interactions — that is a theory of the verb ? The mathematical challenge is therefore to develop a theory of interactions of attractors. What we call an attractor syntax. We have shown elsewhere that bifurcation theory provides adequate tools for solving this problem. 1.2. Smolensky's tensorial product Smolensky's main idea is to take for granted the CL finitist and combina torial view of symbolic structures and to represent them in a CN way — in much the same way as one represents abstract groups in linear groups in the well known group representation theory1. 
To do this, he first adopts a case conception of syntax and thinks of syntactic structures as composed of three sorts of entities: i) semantic case-roles $r_i$; ii) fillers $f_j$; iii) binding relations between roles and fillers. He then supposes that the roles and the fillers are already represented in a CN (local or distributed) way and solves the problem of representing the binding relations using the linear device of the tensorial product.
1) See Smolensky (1990) and Smolensky et al (1992).
JEAN PETITOT
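Suppose that the roles $r_i$ (resp. the fillers $f_j$) are vectors belonging to the vector space $V_R$ (resp. $V_F$) of the global states of a network R (resp. F). Let $u_p$ (resp. $v_\varphi$) be the units of R (resp. F). One connects R and F using connections $u_p \leftrightarrow v_\varphi$ with Hebbian weights $W_{p,\varphi} = \sum_i r_{i,p} f_{i,\varphi}$, where $r_{i,p}$ (resp. $f_{i,\varphi}$) is the activity of the unit $u_p$ (resp. $v_\varphi$) in the global activity pattern of R (resp. F) representing $r_i$ (resp. $f_i$). The tensorial product device consists in introducing new units $b_{p,\varphi}$ between R and F, $b_{p,\varphi}$ being connected by two weights equal to 1 to $u_p$ and $v_\varphi$ and having $W_{p,\varphi}$ as activity. It is easy to see that we get in this way a CN implementation $R \otimes F$ of the tensorial product $V_R \otimes V_F$ with basis $b_{p,\varphi} = u_p \otimes v_\varphi$. With $W_{p,\varphi} = \sum_i r_{i,p} f_{i,\varphi}$, the state of $R \otimes F$ becomes:

$$\sum_{p,\varphi} W_{p,\varphi}\, b_{p,\varphi} = \sum_i \Big(\sum_p r_{i,p}\, u_p\Big) \otimes \Big(\sum_\varphi f_{i,\varphi}\, v_\varphi\Big) = \sum_i r_i \otimes f_i .$$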
We therefore get a representation $\Psi: S \to V$ of a set S of structures in the state space of a network. In a tensorial product, $r_i$ (resp. $f_j$) is identified with an activity pattern $\sum_p r_{i,p} u_p$ (resp. $\sum_\varphi f_{j,\varphi} v_\varphi$), and the predicate $f_j/r_i$ on S, "$f_j$ fills the role $r_i$ in the structure $s \in S$", is identified with $r_i \otimes f_j$. Insofar as a structure s is a conjunction of $f_i/r_i$ and a conjunction is represented by addition, we finally have:

$$\Psi(s) = \sum_i r_i \otimes f_i .$$
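The contrast between additive superposition and tensor-product binding can be made concrete with a small numerical sketch. The Python fragment below is only an illustration under stated assumptions, not Smolensky's implementation: the local (one unit per item) codes and the names `agent`, `patient`, `john`, `mary`, `bind`, `add` and `unbind` are introduced here for the example. It shows that the plain sum of the fillers is the same for two different role assignments, that the sum of the role-filler products is not, and that a filler can be read back when the role vectors are orthonormal.

```python
# Minimal sketch of tensor-product binding with plain Python lists.
# All names (agent, patient, john, mary, bind, add, unbind) are
# illustrative assumptions, not taken from the text.

def bind(role, filler):
    """Outer product role (x) filler: entry [p][q] = role[p] * filler[q]."""
    return [[rp * fq for fq in filler] for rp in role]

def add(a, b):
    """Entrywise sum of two matrices (conjunction represented by addition)."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def unbind(state, role):
    """Contract the state with a role vector; exact when roles are orthonormal."""
    n_filler_units = len(state[0])
    return [sum(role[p] * state[p][q] for p in range(len(role)))
            for q in range(n_filler_units)]

# Local codes, chosen only to keep the arithmetic exact.
agent, patient = [1, 0], [0, 1]
john, mary = [1, 0, 0], [0, 1, 0]

# Additive superposition of the fillers: the assignment of roles is lost.
superposed_1 = [x + y for x, y in zip(john, mary)]
superposed_2 = [x + y for x, y in zip(mary, john)]

# Tensor-product states: sum over i of r_i (x) f_i.
s1 = add(bind(agent, john), bind(patient, mary))  # John fills the agent role
s2 = add(bind(agent, mary), bind(patient, john))  # Mary fills the agent role

print(superposed_1 == superposed_2)  # True: superposition cannot tell them apart
print(s1 == s2)                      # False: the bound states differ
print(unbind(s1, agent) == john)     # True: the agent filler is recovered
```

In this sketch the role vectors are stipulated in advance, so it illustrates the binding device only, not any definition of the roles themselves.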
Paul Smolensky has shown that this type of procedure allows us to represent in various ways operations and transformations on symbolic structures. According to him, this shows that it is possible to integrate "in an intimate collaboration, the discrete mathematics of symbolic computation and the continuous mathematics of connectionist computation". Smolensky does not want to reduce all symbolic structures and processes to CN ones. In order to explain "higher thought processes", he wants to represent these symbolic descriptions in a CN way.

1.3. The core of the debate: the need for a configurational definition of the roles

In their response to Smolensky's response, Jerry Fodor and Brian McLaughlin¹ reconsider the systematicity problem and the fact that "cognitive processes are causally sensitive to the constituent structure of mental representations" (p. 185). They summarize their main point in claiming that "all we really need is that propositions have internal structure, and that characteristically, the internal structure of complex mental representations corresponds, in the appropriate
1) Fodor, McLaughlin (1990).
ATTRACTOR SYNTAX
way, to the internal structure of the propositions that they express" (p. 187). More precisely, they introduce a condition (C) which "expresses a psychological law that subsumes all systematic minds".

(C) "If a proposition P can be expressed in a system of mental representations M, then M contains some complex mental representation (a "mental sentence") S, such that S expresses P and the (classical) constituents of S express (or refer to) the elements of P."

Condition (C) plus the fact "that mental processes have access to constituent-structure of mental representations" allows one to explain the cognitive fact of the systematicity of the mind. Against this theoretical background, Fodor and McLaughlin can offer an evaluation of Smolensky's tensorial product device. Their main criticism is that it is impossible to retrieve from tensorial product representations and from additive superposition operations a constituent-structure whose constituents can have a causal status. Indeed, in a vector space the choice of a basis, and hence of a vector decomposition, is not canonical. Every vector decomposition is therefore counterfactual and the constituents (components) it generates cannot have causal efficiency.

We think that this negative argument is partly right even if it is over-drastic. For instance, it is true that there is no canonical basis in a vector space V (that is, V possesses a non-trivial symmetry group, the linear group GL(V)). Nevertheless, the vector space $V_R$ of the states of a network R does possess a distinguished basis, namely the basis defined by its units. In that case, vector decompositions are not counterfactual operations. Even so, the criticism points out a major difficulty, which can be expressed in the following manner. For Smolensky, the basic problem of a CN theory of symbolic structures is that of the binding relations between roles and fillers.
He succeeded in solving this problem, but in a way which answers only half of Fodor and Pylyshyn's challenge. Indeed, it says nothing about the possibility of reaching, in the CN framework, a configurational definition of actantial roles. Moreover, it takes for granted a symbolic pre-definition of the roles. As was stressed by Yves-Marie Visetti¹, in the tensorial product approach "the associative conception of memory as a return to a preferential state" together with "the concept of attractor as an intrinsic meaningful state" disappear. For the problem is not only to represent roles as local or distributed activity patterns of some appropriate network; it is also to give a correct CN account of the actantial relations and interactions which are involved in syntactic structures. These relations are not binding relations. They
1) Visetti (1990).
concern the semantic (actantial) roles independently of their fillers. The PTC agenda which, according to Smolensky, "consists in taking [the] cognitive principles and finding new ways to instantiate them in formal principles based on the mathematics of dynamical systems" must also be applied to the configurational definition of the roles.

2. From syntactic constituency to cognitive archetypes
To solve this problem we reduce it step by step to other problems, the most basic one being, as we shall see, that of perceptual constituency. Let us take the most elementary example, that of a verb like [ENTER], which expresses a temporal transformation of spatial relations. The first step of the reduction uses Jean-Pierre Desclés' theory of cognitive archetypes, which are data structures analogous to those proposed by linguists and AI theorists such as Fillmore, Schank, Minsky or Winograd¹. Their function is to represent conceptual (spatio-temporal and dynamical) information symbolically. They are intermediary structures between image-schemas in the sense of cognitive grammars and symbolic predicative structures. For the case of [ENTER], the archetype is the following.
Figure 1. The cognitive archetype [ENTER] (Loc=x).
(a) Its general structure. (From Desclés [1990])
(b) Its topological content and its relationship with the actantial graph of capture (lines).
1) See Desclés (1990).
i) SIT1 and SIT2 are stative situations (initial and final states).
ii) SIT1[y] is described by the following symbolic descriptor of positional relations: y ε₀ ex(Loc), where ε₀ is a localization operator of an object y relative to a locus Loc.
iii) SIT2[y] is described by: y ε₀ in(Loc).
iv) MOUVT is an operator of movement which modifies the stative states.

Now, using combinatory logic and applicative grammar, one can show how a cognitive archetype can be automatically converted into a predicative structure. Desclés uses the following symbolic expression for the cognitive archetype:

ENTER = MOUVT(ε₀(ex Loc)y)(ε₀(in Loc)y)

We want to associate to it a predicative structure of the form E(Loc, y) ("y enters in Loc"), where E(.,.) is a binary predicate. For this we posit:

E = Ψ(BΦΦ MOUVT)(Bε₀) ex in,

where Ψ, B and Φ are the combinators:
• ΨXYZU → X(YZ)(YU);
• BXYZ → X(YZ) (composition);
• ΦXYZU → X(YU)(ZU) (intrication).

It is easy to verify that, starting from E(Loc, y) and applying these rules sequentially, we arrive at the cognitive archetype ENTER.

Figure 2. The derivation between a cognitive archetype and a predicative structure. (From Desclés [1990]). (Loc=T2, y1=T1, E=P2)
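The reduction from the predicative structure to the archetype can be checked mechanically. The following Python sketch is an illustration only: it encodes B, Φ and Ψ as curried functions following the standard reduction rules quoted above, and treats MOUVT, ε₀, ex and in as symbolic constructors that merely build nested tuples (an assumption of this sketch, as are the names `PHI`, `PSI`, `EPS0`, `EX`, `IN`). The predicate is applied in curried form, first to Loc and then to y.

```python
# Curried combinators, following the reduction rules quoted in the text.
B = lambda x: lambda y: lambda z: x(y(z))                    # B X Y Z     -> X (Y Z)
PHI = lambda x: lambda y: lambda z: lambda u: x(y(u))(z(u))  # Phi X Y Z U -> X (Y U)(Z U)
PSI = lambda x: lambda y: lambda z: lambda u: x(y(z))(y(u))  # Psi X Y Z U -> X (Y Z)(Y U)

# Symbolic primitives (an assumption of this sketch): they only build tuples.
MOUVT = lambda sit1: lambda sit2: ("MOUVT", sit1, sit2)
EPS0 = lambda place: lambda obj: ("eps0", place, obj)
EX = lambda loc: ("ex", loc)
IN = lambda loc: ("in", loc)

# E = Psi (B Phi Phi MOUVT) (B eps0) ex in
E = PSI(B(PHI)(PHI)(MOUVT))(B(EPS0))(EX)(IN)

# Reduction steps carried out implicitly by the function calls:
#   E             -> Phi (Phi MOUVT) (B eps0 ex) (B eps0 in)         (Psi, B)
#   E Loc         -> Phi MOUVT (eps0 (ex Loc)) (eps0 (in Loc))       (Phi, B)
#   E Loc y       -> MOUVT (eps0 (ex Loc) y) (eps0 (in Loc) y)       (Phi)
archetype = MOUVT(EPS0(EX("Loc"))("y"))(EPS0(IN("Loc"))("y"))

print(E("Loc")("y") == archetype)  # True
```

The check confirms that, with these three rules, the compact combinatory expression for E unfolds into the two stative situations joined by MOUVT.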
This derivation is interesting because it shows that the semantic meaning of an item such as [ENTER] is twofold. First, it contains the local (positional and dynamical) content expressed by the primitives ε₀, ex-in, MOUVT. But it also contains the formal content expressed by the combinatorial operations of predicativization. As is stressed by J.P. Desclés, the lexical law E(Loc, y) = MOUVT(ε₀(ex Loc)y)(ε₀(in Loc)y) is "a "compilation" of the linguistic expression, encoded with the grammatical constraints of language, in a system of semantic representations organized by means of cognitive archetypes".

3. From cognitive image-schemas to perceptive constituency
Cognitive archetypes are symbolic representations of schematic and iconic structures elaborated by cognitive grammars (CG) in the sense of Len Talmy, Ron Langacker, George Lakoff or Ray Jackendoff¹. At the most basic level, concepts are thought of as positions ("locations") or configurations in some geometrical (topological, differentiable, metric, linear, etc.) manifold. CG leads to the following identifications:
i) terms (fillers) = localized domains in some concrete or abstract space;
ii) relations = positional relations between locations;
iii) processes = temporal deformations of positional relations;
iv) events = interactions between locations;
v) semantic roles = types of transformations and interactions (configurational definition)².

In spite of the debate concerning it, we take here for granted that CG is a plausible linguistic theory. The problem therefore becomes the following:
i) how to mathematize the image-schemas?
ii) what sort of CN computational device is able to scan not only regions in domains (terms, objects, things) but also relations and processes?

This problem is not a trivial one, because the positional relations of locations and their temporal transformations are global, continuous, gestaltic (holistic) configurations. But if we want to scan them in a CN way, for instance using some sort of "retinal" array of formal neurons or of receptive fields, we must use only local algorithms. We call this the "global Gestalt VS local computation" dilemma.
1) See e.g. Talmy (1983), (1985), (1990), Langacker (1987), (1991), Lakoff (1988), Jackendoff (1983), (1987).
2) See Langacker (1987).
Figure 3
The temporal profile of the process [ENTER]. (From Langacker [1987]).
To solve it we reduce it to the perceptual basis of CG. We identify positional actants with certain topological domains in 2D space and we consider configurations $A = (A_1, \dots, A_n)$ of such domains. These configurations can evolve in time. The problem is therefore to scan the relational profiles and the temporal ones. To do this we make a basic assumption: we treat configurations as forms, that is, as patterns. The problem is now to analyze whether local and finitist pattern recognition algorithms are able to perform the scanning of relational and temporal profiles. If we can find plausible and relevant algorithms which can be implemented in a CN way, some sort of CN theory of syntactic constituency will become available.

4. Contour propagation and the cut locus theory
The fundamental problem is of course to scan the spatial relations. It is an old one, and also an extremely difficult one. A CN scanning can use only two types of devices: detection of local heterogeneities (singularities) and local-to-global propagation. The simplest and most effective way to solve it is to use a propagation device triggered by the detection of the boundaries $B_i = \partial A_i$ of the objects $A_i$
and to extract the singularities of the propagative process. Deep mathematical theorems show that these singularities (which are local entities) characterize the initial configurations. We give here only a very elementary example inspired by the so-called "grassfire" model. Let $I(x,y)$ be the intensity pattern characterizing a configuration $A = (A_1, \dots, A_n)$. We embed $I(x,y)$ in a family $I_s(x,y)$ which is a solution of the wave equation $\partial^2 I_s / \partial s^2 = \Delta I_s$. The characteristics of this hyperbolic PDE are rays which propagate orthogonally to the initial contour $B^0 = B_1 + \dots + B_n$. Wave fronts propagate orthogonally to the rays, that is, parallel to the initial front $B^0$. We focus on the singularities of the propagation. They constitute what is called the cut locus (CL) of the propagation, that is, the locus of the points which are reached at the same time by two rays coming from two different points.

Historically, the contour propagation routine was introduced in vision theory by Harry Blum¹. Other sorts of propagation, related to a diffusion equation (heat equation), have been considered by other specialists: David Marr, Stephen Grossberg, Jan Koenderink, etc.²

The CL is a very interesting structure, well known in differential geometry.
i) It is a singular locus and it allows us to reconstruct the global shape S from the radius function (i.e. the radius r(x) of the maximal disc centered at x ∈ CL).
ii) It is a dynamical object. It is built following the propagation of the wave fronts, that is, following the direction of increasing radius r(x).
iii) Its topological properties (and in particular its singularities: triple points, end points) are fundamental indicators of the geometrical properties of the shape S, e.g. its convexity.

Figure 4
(a) The contour diffusion process according to Blum. The dotted lines represent the cut-locus of the form.
1) See Blum (1973).
2) See for instance Marr (1982), Koenderink (1984), Koenderink, van Doorn (1986), Grossberg (1988).
(b) The cut-locus as a dynamical object. The arrows represent the direction of evolution of the cut-locus. (From Blum [1973]).
Hugh Bellemare has implemented the contour propagation routine in a CN network with five layers¹.
1. The first layer enters the input.
2.-3. The second and third layers compute the X and Y components of the rays.
4. The fourth layer computes all the singular points of the propagation.
5. The fifth layer computes the cut locus, using as a geometrical criterion the discontinuities of the divergence of the field.

The figures show some examples.
i) In the first example (rectangle), we see how a CL evolves.
ii) In the second example, we see that every shape, however irregular it might be, has a well-defined characteristic CL.
iii) In the other examples, we see how the external CL of a configuration A of domains $A_i$ evolves and progressively partitions (categorizes, stratifies) the ambient surrounding space into regions $R_i$ associated with the $A_i$. In this case, the CL is a 1-dimensional singular structure which is locally computable and whose geometry characterizes the global configuration A.
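A discrete analogue of the grassfire idea can be sketched in a few lines. The Python fragment below is a simplified illustration under explicit assumptions and is neither the wave-equation model nor Bellemare's five-layer network: the shape is a set of grid cells, the front advances one cell per step along the four grid directions (city-block metric), and the cut locus is approximated by the cells where the front is quenched, that is, where no neighbouring cell is reached later. The function names `grassfire`, `cut_locus` and `reconstruct` are introduced here. Each step inspects only a cell and its four neighbours, so the computation is local, yet the resulting locus together with its arrival times (the radius function) is enough to rebuild the whole rectangle.

```python
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def grassfire(shape):
    """Arrival time of a front started on the boundary cells of `shape`."""
    time = {}
    frontier = deque()
    for (x, y) in shape:
        # A boundary cell has at least one 4-neighbour outside the shape.
        if any((x + dx, y + dy) not in shape for dx, dy in NEIGHBOURS):
            time[(x, y)] = 1
            frontier.append((x, y))
    while frontier:  # breadth-first propagation, one cell per time step
        x, y = frontier.popleft()
        for dx, dy in NEIGHBOURS:
            cell = (x + dx, y + dy)
            if cell in shape and cell not in time:
                time[cell] = time[(x, y)] + 1
                frontier.append(cell)
    return time

def cut_locus(shape, time):
    """Cells where the front is quenched: no neighbour is reached later."""
    return {
        (x, y) for (x, y) in shape
        if all(time.get((x + dx, y + dy), 0) <= time[(x, y)]
               for dx, dy in NEIGHBOURS)
    }

def reconstruct(locus, time):
    """Union of the city-block discs of radius time - 1 centred on the locus."""
    cells = set()
    for (x, y) in locus:
        radius = time[(x, y)] - 1
        for dx in range(-radius, radius + 1):
            span = radius - abs(dx)
            for dy in range(-span, span + 1):
                cells.add((x + dx, y + dy))
    return cells

rectangle = {(x, y) for x in range(7) for y in range(5)}
time = grassfire(rectangle)
locus = cut_locus(rectangle, time)

for y in range(5):
    print("".join("#" if (x, y) in locus else "." for x in range(7)))
print(reconstruct(locus, time) == rectangle)
```

For this 7 by 5 rectangle the printed locus consists of the four diagonals issuing from the corners and a central horizontal segment (rows `#.....#`, `.#...#.`, `..###..`, `.#...#.`, `#.....#`), and the final line prints `True`. The quenching criterion used here is a crude stand-in for the divergence-discontinuity criterion mentioned above and can over-select cells on less regular shapes.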
1) See Bellemare (1991).
Figure 5
(a) The propagation process triggered by a simple rectangle. The cut-locus (displayed by the fifth layer of the network) is constructed progressively.
(b) The components X and Y of the rays at a particular moment of the propagation.
(c) The activity of the fifth layer during the construction of the cut-locus.
Figure 6
The cut-locus of an arbitrary form.
Figure 7
The cut-locus of a 2-domain configuration. It partitions the ambient space into two regions.
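The partition of the ambient space by the external cut locus can also be imitated on a grid. The sketch below is a toy illustration under assumptions made here (two single-cell domains, a 7 by 5 grid, city-block propagation, hypothetical names `distances`, `domain_a`, `domain_b`); it does not reproduce the network used for the figures. A front is propagated from each domain, the external cut locus is taken to be the set of cells reached at the same time by both fronts, and every other cell is assigned to the domain whose front arrives first.

```python
from collections import deque

def distances(source, width, height):
    """Arrival times of a city-block front started on the cells of `source`."""
    dist = {cell: 0 for cell in source}
    frontier = deque(source)
    while frontier:
        x, y = frontier.popleft()
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            cell = (x + dx, y + dy)
            inside = 0 <= cell[0] < width and 0 <= cell[1] < height
            if inside and cell not in dist:
                dist[cell] = dist[(x, y)] + 1
                frontier.append(cell)
    return dist

WIDTH, HEIGHT = 7, 5
domain_a = {(0, 2)}   # hypothetical first domain
domain_b = {(6, 2)}   # hypothetical second domain

d_a = distances(domain_a, WIDTH, HEIGHT)
d_b = distances(domain_b, WIDTH, HEIGHT)

cells = [(x, y) for x in range(WIDTH) for y in range(HEIGHT)]
external_locus = {c for c in cells if d_a[c] == d_b[c]}  # fronts meet here
region_a = {c for c in cells if d_a[c] < d_b[c]}         # reached first from A
region_b = {c for c in cells if d_b[c] < d_a[c]}         # reached first from B

for y in range(HEIGHT):
    print("".join(
        "|" if (x, y) in external_locus else ("a" if (x, y) in region_a else "b")
        for x in range(WIDTH)))
```

With these two symmetric domains the locus is the middle column of the grid, and the two regions are the cells on either side of it.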
Figure 8
The cut-locus of a 3-domain configuration. Observe the emergence of a triple point which is a singularity characterizing the configuration.
Figure 9
The cut-locus of a general configuration.
Conclusion

Starting from the Smolensky vs. Fodor-Pylyshyn debate concerning CN modeling of constituency and compositionality, we have first stressed that the main problem was to reach a CN configurational definition of semantic (actantial) roles, that is, of that "geometrical whole, where the geometrical relations are themselves semantically significant", which constitutes the geometrical basis of constituent-structures¹. We could of course only sketch our solution to this problem. In more elaborate works we have developed the following strategy.

1. We need first an "appropriate" linguistic theory. We select cognitive grammars. Using this general perceptive, iconic and schematic grounding of basic elementary syntactic structures, we can reduce, via cognitive archetypes, the main problem to that of "perceptive" (in fact schematic and iconic) constituency.

2. We then introduce contour diffusion routines (spreading activation triggered by boundaries) which generalize some well-known routines of computational vision to higher-order representational levels. We show, according to deep mathematical theorems (e.g. Morse's theorem), that the singularities of the diffusion-propagation processes are singular structures, locally and finitely accessible, which encode in a local and finite manner the global holistic structure of the considered configurations. This is the first key idea: constituency is retrievable from the detection of singularities.

3. The contour diffusion-propagation routines solve the "global Gestalt VS local computation" dilemma. They allow the scanning of profiled positional relations. With such a result at hand we can easily explain how to scan actantial processes and interaction schemes. It is therefore possible to elaborate a CN theory of a configurational conception of semantic (actantial) roles.

4. It can be shown that these models are deeply correlated with models where the actants of a process are modeled by the attractors of some dynamics. In these models, the interactions between actants (which are the basis of syntactic constituency) are modeled by what are called in mathematics bifurcations of attractors²,³.
1) For technical precisions and mathematical details, see Petitot (1985), (1992), (1993).
2) The first dynamical models for syntax were introduced by René Thom at the end of the sixties. They have been developed by the European School of "Catastrophe Theory". See in particular Thom (1980), (1988), Wildgen (1982), Brandt (1986), Petitot (1985), (1992).
3) More details can be found in some of our other works (see the bibliography).
REFERENCES
Bellemare, H. 1991. Processus de diffusion en Perception visuelle. Technical Report, Paris: Ecole des Hautes Etudes en Sciences Sociales.
Blum, H. 1973. Biological Shape and Visual Science. Journal of Theoretical Biology, 38, pp. 205-287.
Brandt, P.A. 1986. La Charpente modale du Sens. Doctoral Thesis, University of Paris III.
Desclés, J.P. 1990. Langages applicatifs, langues naturelles et cognition, Paris: Hermès.
Fodor, J., Z. Pylyshyn. 1988. Connectionism and Cognitive architecture: A critical analysis. Cognition, 28, 1/2, pp. 3-71.
Fodor, J., B.P. McLaughlin. 1990. Connectionism and the problem of systematicity: Why Smolensky's solution doesn't work. Cognition, 35, pp. 183-204.
Grossberg, St. (ed.). 1988. Neural Networks and Natural Intelligence, Cambridge: MIT Press.
Jackendoff, R. 1983. Semantics and Cognition, Cambridge: MIT Press.
Jackendoff, R. 1987. Consciousness and the Computational Mind, Cambridge: MIT Press.
Koenderink, J.J. 1984. The Structure of Images. Biological Cybernetics, 50, pp. 363-370.
Koenderink, J.J., Van Doorn, A.J. 1986. Dynamic Shape. Biological Cybernetics, 53, pp. 383-396.
Lakoff, G. 1988. A Suggestion for a Linguistics with Connectionist Foundations. Proceedings of the 1988 Connectionist Models Summer School, M. Kaufman.
Langacker, R. 1987. Foundations of Cognitive Grammar, vol. I, Stanford University Press.
Langacker, R. 1991. Foundations of Cognitive Grammar, vol. II, Stanford University Press.
Marr, D. 1982. Vision, San Francisco: Freeman.
Petitot, J. 1979. Hypothèse localiste et Théorie des Catastrophes. In M. Piatelli (ed.) Théories du Langage, Théories de l'Apprentissage, Paris: Le Seuil.
Petitot, J. 1985. Morphogenèse du Sens, Paris: Presses Universitaires de France.
Petitot, J. 1986. Structure. In Th. Sebeok (ed.) Encyclopedic Dictionary of Semiotics, t. 2, New York: de Gruyter, pp. 991-1022.
Petitot, J. 1989a. Hypothèse localiste, Modèles morphodynamiques et Théories cognitives: Remarques sur une note de 1975. Semiotica, 11, 1/3, pp. 65-119.
Petitot, J. 1989b. Modèles morphodynamiques pour la Grammaire cognitive et la Sémiotique modale. RSSI (Canadian Semiotic Association), 9, 1-2-3, pp. 17-51.
Petitot, J. 1990. Le Physique, le Morphologique, le Symbolique. Remarques sur la Vision. Revue de Synthèse, 1-2, pp. 139-183.
Petitot, J. 1991a. Syntaxe topologique et Grammaire cognitive. Langages, 103, pp. 97-128.
Petitot, J. 1991b. Why Connectionism is such a Good Thing. A Criticism of Fodor's and Pylyshyn's Criticism of Smolensky. Philosophica, 47, 1, pp. 49-79.
Petitot, J. 1992. Physique du Sens, Paris: Editions du CNRS.
Petitot, J. 1994. Morphodynamics and Attractor Syntax. In T. van Gelder, R. Port (eds.) The Mind as Motion, MIT Press.
Smolensky, P. 1988. On the Proper Treatment of Connectionism. The Behavioral and Brain Sciences, 11, pp. 1-23.
Smolensky, P. 1990. Tensor Product Variable Binding and the Representation of Symbolic Structures in Connectionist Networks. Artificial Intelligence, 46, pp. 159-216.
Smolensky, P., G. Legendre, Y. Miyata. 1992. Principles for an Integrated Connectionist/Symbolic Theory of Higher Cognition. Technical Report, Department of Computer Science, University of Colorado at Boulder.
Talmy, L. 1983. How Language Structures Space. In H. Pick, L. Acredolo (eds.) Spatial Orientation: Theory, Research and Application, Plenum Press.
Talmy, L. 1985. Force Dynamics in Language and Thought. Parasession on Causatives and Agentivity, Chicago Linguistic Society (21st Regional Meeting).
Talmy, L. 1990. Fictive Motion in Language and Perception. Workshop Motivation in Language, International Center for Semiotic and Cognitive Studies, University of San Marino.
Thom, R. 1980. Modèles mathématiques de la Morphogenèse (2e éd.), Paris: Christian Bourgois.
Thom, R. 1988. Esquisse d'une Sémiophysique, Paris: InterEditions.
Visetti, Y.M. 1990. Modèles connexionnistes et représentations structurées. In D. Memmi, Y.M. Visetti (eds.) Modèles Connexionnistes, Intellectica, 9-10, pp. 167-212.
Wildgen, W. 1982. Catastrophe Theoretic Semantics, Amsterdam: Benjamins.
A DISCRETE APPROACH BASED ON LOGIC SIMULATING CONTINUITY IN LEXICAL SEMANTICS
VIOLAINE PRINCE
ENS de Cachan, LIMSI, CNRS, France
Lexical semantics processing by means of computers can be performed within two major modelling frames.

The first is a framework which takes into account the natural continuity in the semantics of words, and which is based on an associative point of view on sense in context. This framework is inspired by the connectionist or neuromimetic approach (Cottrell 1985; Selman & Hirst 1985; Victorri 1988; Victorri et al. 1989). It provides interesting results because it is intrinsically adapted to the notion of continuity through the fine grain of its numerical calculus functions. Nevertheless, it is still a "local" approach in the sense that it has not yet proved to be efficient when running a thorough processing of built texts. It seems that some architectural enhancements in computer functioning still have to be made before this approach reaches its best scores.

The second is a framework which relies on the very nature of the computational setting, that is, a discrete approach. It emphasizes logic as the basic ground for the representation of meaning, and composition as the major law which derives compound or transformed meanings from primitive or original meanings. Calculus is performed through symbolic functions. Continuity is not represented as such, but is understood as being the result of a complex function applied to a large number of the most elementary "particles" of sense (mainly covered by the Compositionality Principle). This framework, though highly efficient in the field of conceptual semantics, finds some limitation in its results when trying to express the subtleties of lexical semantics. In particular, it happens to be sensitive to distortions in meaning (Hirst 1987) and does not show in that case a behaviour as robust as the former approach.

The most typical phenomenon in which the two frameworks may compete, within lexical semantics processing, is the case of polysemy.
In our opinion, polysemy presents some particularities which invalidate such a simple idea as that of a composition law applied to meaning. At the same time, polysemy does not act as a Brownian movement in the universe of meanings, solely creating sense out of the conditions of its context: polysemous words do not vary ad infinitum, and
seem to obey some constraints on the emergence of their contextual meaning (Lakoff & Johnson 1980; Hobbs & P. Martin 1987). Therefore, whatever the chosen framework for computational processing is, polysemy has to be dealt with by means of some enhancements of the original setting.

1. Polysemous lexical elements: words that tamper with the stability of concepts
We have chosen to study the modelling of polysemous words with the idea of providing a structure able to adapt to conceptual semantics requirements. In other words, our framework belongs to the domain of the discrete approach. One of the reasons for this is the existence of tools already built for morpho-syntactic parsing and for semantic representation of propositions. These tools have passed through many refinements throughout the various experiments in computational linguistics and have reached a state of maturity, although modifications are still welcome.

While being interested in semantic representation at the lexical level (a position also argued by Mel'chuk 1983; Stallard 1987 and J. Martin 1990), we have noticed that the very notion of "concept", in spite of its usefulness in automated processing, seems to cause more problems than it helps to resolve when associated with some lexical elements (lexemes), mainly the polysemous ones. (This was discussed from the beginning by Brachman [1979], who stuck to an epistemological level rather than to a lexical level. However, many later usages of semantic networks in NLP have discarded Brachman's reserved position about the word-concept relationship.)

It is an old idea that the relationship between concepts and words is not so simple, in abstracto. One could have thought that computational needs, which are less demanding than philosophical and linguistic assumptions, would have been satisfied with a rough approximation of this relationship. This approximate substitute could be expressed either as the correspondence relationship (one concept, one word), or the hierarchical relationship (one concept, many words which act as synonyms), or its symmetrical counterpart, the decomposition relationship (many concepts which act as homonyms, one word) (notions existing in [Sowa 1984], also present, but with more restrictions, in [Le Ny 1979]). In the case of polysemy, such a proposition is not satisfactory.

1.1. An undecidability in the type of relationship between concepts and polysemous words

In polysemous words, more than one of these relations is valid between a concept and a word. One can always find a word W to express a concept C: W is at least the lexical realisation of C if it belongs to a lexicon.
Example: let us denote by @eat the concept of eating¹. The word for @eat is *eat.

If the concept C is sufficiently "current", then many words or lexical expressions W1, ..., Wp could represent the concept C. Are we able to choose between the correspondence and the hierarchical relationship?

Example: the synonyms or other lexical realisations for @eat are *absorb and *consume. Both could represent @eat as long as they can theoretically be substituted for it in a text. The theory is limited by current usage.
(1) John eats an apple
(2) John consumes an apple
(3) John absorbs an apple
Sentence (2) would be a precious way to express sentence (1), whereas sentence (3) is unlikely to be used. If one replaces *apple with *soup, then it is the other way round. Therefore, it seems that the environment has an effect on the relationship between a concept and a word, and this effect is at least partly defined by current usage.

Furthermore, we can show that the third type of relationship is also simultaneously represented with the first two. Let us consider two cases. First, the lexical realisation of C could be associated with other concepts C1, ..., Cq.

Example: the word *eat is associated with the concept @threaten in the sentence: He won't eat you. It is also associated with the concept @trouble in the sentence: What is eating you?

Second, any of the words Wi associated with C is at the same time associated at least with its conceptual extraction.

Example: the word *absorb is associable with the concept @absorb, which could be explained by the following definition: A absorbs B if the substance of B is incorporated in the substance of A at the end of the process.

To summarize our first remark, it seems difficult with a polysemous word to determine the nature of the relationship between the concept (discrete unit of thought) and the word (discrete unit of language) as a computational symbolic function.
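The undecidability described above can be illustrated with a toy lexicon. The Python sketch below is only a reading aid under assumptions made here: the table contains nothing but the word-concept links given in the examples, the function names are hypothetical, and treating any attested word-concept link as a candidate "correspondence" is a simplification introduced for the illustration. It shows that, for a polysemous word such as *eat, the pair (*eat, @eat) satisfies the conditions of all three relationship types at once.

```python
# Toy lexicon built only from the examples in the text
# ("*" marks a word, "@" marks a concept).
WORD_TO_CONCEPTS = {
    "*eat": {"@eat", "@threaten", "@trouble"},
    "*consume": {"@eat"},
    "*absorb": {"@eat", "@absorb"},
}

def concept_to_words(lexicon):
    """Invert the word -> concepts table into a concept -> words table."""
    inverse = {}
    for word, concepts in lexicon.items():
        for concept in concepts:
            inverse.setdefault(concept, set()).add(word)
    return inverse

def relations(word, concept, lexicon):
    """Return every relationship type compatible with the pair (word, concept)."""
    if concept not in lexicon.get(word, set()):
        return set()
    inverse = concept_to_words(lexicon)
    found = {"correspondence"}       # the word is a lexical realisation of the concept
    if len(inverse[concept]) > 1:
        found.add("hierarchical")    # one concept, several words
    if len(lexicon[word]) > 1:
        found.add("decomposition")   # one word, several concepts
    return found

print(sorted(relations("*eat", "@eat", WORD_TO_CONCEPTS)))
print(sorted(relations("*consume", "@eat", WORD_TO_CONCEPTS)))
```

The first call returns all three labels and the second returns only two, which is the sense in which no single symbolic function fixes the type of the relationship for the polysemous word. The sketch deliberately ignores the role of the environment and of current usage.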
Nevertheless, the main factors in semantic determination seem to be the environment (e.g. the particular words surrounding the given one), the current
1) This notation is inspired from John Sowa.
usage (what is accepted, authorized, or not), and some potential meanings, associated with the word ab initio.

1.2. Are polysemous words modifiers of the essence of concepts?

The second remark that one can make about polysemous words takes the form of a question: to what extent are the possible concepts C1, ..., Cq associated with a polysemous word W (a) different concepts linked through the word by a co-existence relationship, (b) different concepts related by common attributes, (c) partial views on concepts expressing only the relevant aspects determined by current usage, or (d) partial views on concepts, determined by their participation in the meaning of the word W, and relevant to the meanings associated with W by current usage and environment?

Propositions (a) and (b) rely on the paradigm of concept integrity. Concept integrity could be defined as the conservation of the intensional definition (in terms of generic properties) or the extensional definition (in terms of elements constituting the defined set) of a concept. In most cases, specific concepts are defined intensionally (in semantic networks, type lattices, object-oriented classes and so forth), while generic concepts are described extensionally in terms of their properties. In the "dictionary-oriented" point of view on lexica, one happens to find extensive definitions of concepts. Whatever the type of definition is, any slight change in the definition seems to bring forth a change at the conceptual level. We claim that polysemy falsifies concept integrity and therefore that propositions (a) and (b), which could be valid for homonymy, are not true for polysemy (see also J. Picoche's argumentation in that sense).

A simple argument in favour of our claim could be the following: the definitions, in terms of properties, of the concepts associated with a polysemous word W are not evaluable as such when these concepts are involved by the current usage of W. This leads to two assumptions.
First, the setting functions as if a concept were "modified" by the co-occurrence of other concepts participating in the expression of a given word sense. The notion of participation is described at the sentence level by H.J. Seiler: we believe that it could apply at this 'sub-lexical' level. This means that if a word W1 and a word W2 invoke the same concept C for the description of their meaning, the description of C relative to W1 is not exactly the same as the description of C relative to W2, although it is intuitively the same concept (see the notion of 'ambiversion' in [Pagès 1987]). Second, it seems that definition and current usage are either two different processes on a concept (and there is a traditional belief in accordance with this assumption), or irrelevant notions concerning a concept. If we assume the first proposition, we find that definition is a conservative process (definition is stable), and current usage is an evolutive
process. (For further details, refer to the discussion about stability and distortion in [Culioli 1986] and the structural stability analysed in [Thom 1972].) Considering these processes as two separate operators on language is close to the traditional approach of a primary meaning and derived meanings, with few or no inferences about derivation modes. We find this approach costly in terms of computational resources, especially since the computational frame provides interesting tools for inference production. If we assume that both notions are meaningless for concepts, then concept integrity itself falls away as a property of concepts. And this brings us to consider propositions (c) and (d). Both rely on the idea that only a local definition (determined by current usage and relevant to the polysemous word) apprehends a concept satisfactorily. As it is local, this definition claims to define a concept neither extensively nor in terms of truth conditions. Therefore, it is called a concept view or conceptual view. Whereas proposition (c) still gives credit to the paradigm of a dominant concept, even if the latter is deprived of its integrity, proposition (d) gives more importance to the linguistic expression, the word. In particular, it reintroduces the action of environment variation as a modifier of the participation of the concept in the word meaning (Ortony 1979).

2. How a discrete approach could still manage to give the impression of continuity
The characteristics of polysemy as a widespread linguistic phenomenon (here seen at the lexical level, but nevertheless extendable to a higher structural level) have been very rapidly surveyed in this first section. We have tried to show how they have weakened the paradigms of a traditional conceptual semantics architecture. In that case, one may want to throw out the conceptual point of view on semantics entirely, and to adopt another approach. Our work has been devoted to experimenting with modifications inspired by some robust properties of the associative (and hence neuromimetic) approach, but nonetheless discrete in essence. The most crucial goal was to look for a new presentation of continuity. Instead of sticking to the idea that promotes "decomposition into fine-grained particles" as the paradigm of continuity representation, we prefer to put forward the notion of a "dynamic recomposition of middle-grained particles". Let us give an illustration of this argument.

2.1. Continuity as a decomposition into fine-grained particles

Two possible trends can be observed within this paradigm. First, a "componential trend" based on the idea of differentiating meanings through the modification, addition or deletion of a given component in the description. The componential analysis school is represented by (Pottier 1963;
Greimas 1966 and Rastier 1985). Second, a "variable depth trend" based on the sound idea that meaning is a function of the generality level and the functional category. Generality level could be defined by the current usage of the word as a generic lexical item (as *like in novelists like cats) or as a more specific designation (as *like in I like my neighbour John). Functional category could be defined by the current usage of the word as a notion, as a process, or as the result of the process. Coherent with the componential trend, this categorisation has been developed by (R. Martin 1983) in linguistics and (Kayser 1990) in knowledge representation.

Example: we are waiting for the vote at the Parliament. This means equally: the vote (as a process) to occur, or the vote (as the result of the process) to be given. Thus variation in meaning is likely to be caused by variation in generality level and functional category.

2.1.1. Example of a componential representation of continuity

Example: to give an idea of the continuity of the concept view over @eat, we can note that @eat is extensionally describable, when associated with *eat, in terms of components (all seen as properties relevant to the concept view over @eat) by means of the following portion of a graph:
Remark 1: arcs between the different nodes could be oriented in both directions.
Remark 2: all "@concepts" in this graph are implicitly seen as relative descriptions (that is, the @soup described here is a link to *soup as a lexeme,
which is a possible neighbour of *eat, and is not a definition of @soup) (see the discussion of the isotopy phenomenon in [Rastier 1987]).

Such a description could give the idea of continuity in meaning as long as one considers the set of paths in this graph. A path can be described as a list of nodes linked by oriented arcs (computationally speaking). Note that elements shown with a different background in this illustration are considered as belonging to the interpretation.

Example: in the sentence John eats an apple, an interpretation of @eat (merged with *eat) is described by:

(C1) @eat_@absorb_@animate_@dead_@food_@solid_APPLE

Let us notice that APPLE could have been considered as a living food. Therefore the other path is also correct:

(C2) @eat_@absorb_@animate_@living_@food_@solid_APPLE

But APPLE could also be reached by the path:

(C3) @eat_@feed-on_@food_@solid_APPLE

Beyond the problem of ambiguity, which is clearly demonstrated here (as long as inferences about the relevance of @feed-on as a valid concept in this interpretation are not performed), we can consider that, first, the existence of more than one path and, second, the closeness between the components of these paths show that the semantics of this interpretation seems to require a certain "thickness". This pleads in favour of a set solution instead of an atomic solution, with the constraint that the elements of the set are semantically rather close. With such a small semantic distance, we can say that continuity is granted thanks to the multiplication of nodes along the paths, and to paths that constitute a "semantic bundle".

2.1.2. Continuity in the depth of interpretation

The variable depth position proposes a dynamic alternative to a componential representation. Robert Martin's position was to propose operators on the semes associated with a lexical element (addition, subtraction) which create an i-level interpretation.
The computational theory developed by Kayser provides a default-based formal system in which the default corresponds to the root level of interpretation. It relies on the idea that meaning varies almost "continuously" within the depth of the hierarchical representation of the relevant conceptual components. Therefore, a meaning of a word W indexed by i is understood through both the path from the root to the relevant component i (depth) and the adjacent elements at level i in the tree (width). This accounts for the notion of interpretation level.

Example: let us take the sentence:

(4) The book is on the shelf
"Book" can be understood as a polysemous word, at least at the functional level. "Book" denotes equally the physical object (a brown heavy book), the reference (Selfs last book), the contents (this is a book about metamorphisms) and an evaluable work (an extraordinary book). A physical-level interpretation of the sentence would give the meaning: the physical object book is located on the shelf. A reference-level interpretation would give: the reference (that you are looking for) is embodied in the physical object located on the shelf. And so forth for contents and evaluation. Variable depth admits that one level could be more relevant at a given moment, especially when this is indicated by context.

Example: by default, the physical-level interpretation seems to be a correct interpretation of the sentence. But this does not preclude the fact that other meanings could still be valid. This theory assumes that many meanings are valid but that, pragmatically, one interpretation emerges.

Since one can go to any depth one wants, a variable depth knowledge representation cannot be extensive, or it would create a very large structure in which to compute its paths. Work in that field relies more on knowledge derived from interpretative inference rules and on nonmonotonic reasoning.

2.1.3. Continuity as a dynamic recomposition of a small number of properties

We find that componential analysis matches the requirements of proposition (c), as current usage is encapsulated in the representation. We also think that variable depth partially matches the requirements of proposition (d) as mentioned in paragraph 1.2. It gives the same encapsulation of current usage by means of inference rules and derivation principles, and provides some hints about the importance of environment by means of a circumscription principle.
In the same vein, the principle of dynamic recomposition takes inference rules to be producers of current usage (Hobbs 1983), but environment and current usage are not so fundamentally separated. Assuming the existence of a feedback between allowed interpretations and the existence of particular semantic or pragmatic markers, the principle of dynamic recomposition draws its power from the following framework.

(e) A minimal representational part is necessary to account for the relevant properties of concept views (close to qualia structure in [Pustejovsky 1989; Pustejovsky 1991]).

(f) These properties have proven not to be too numerous, as long as a concept view is a sketchy survey of that concept.
(g) As the representational part is a rough schema, the subtleties of current usage and environment have to be embedded into inference rules.

(h) In order not to have too many rules, one may rely on some distinctions between general rules (always applicable) and specific rules (triggering a particular meaning).

(i) An interpretation is the application, to the representational part, of inference rules whose premises are instantiated by context (environment).

As a consequence, a meaning (of a word W) is computed and not derived, because meaning is context-dependent whereas description is partly context-free (representational) and partly instantiable by context (inference rules).

Example: let us agree that *eat could be described by means of views over the concepts @absorb, @feed-on, @trouble, @threaten. Let us also agree that what is relevant to absorption when *eat is invoked are properties belonging to categories such as:

state-of-the-absorbed-thing:

(5) John eats a hot soup
(6) The acid is eating my jacket fabric
For which we can see a salience of destructive absorption in sentence (6). Therefore we will mark "destructive" as a relevant property because, even if it is not emphasized in sentence (5), it is nevertheless still true there. This shows the "difference of exposure" between @absorb as seen from *eat and @absorb as seen from *engulf (where absorption is not destructive). The following notation is adopted:

(G1) EAT - @absorb [destructive ...

It expresses a link between the word eat (in capital letters) and a concept view over the concept @absorb, on which destruction sheds a proper light.

way-the-absorption-is-done: It seems that current usage of *eat gives a certain importance to the way absorption is performed, as a discriminating property.

(7) John has eaten his way through three steaks
(8) Mary has eaten her bread crumb by crumb

As one cannot describe this property in terms of a value, it is kept as a generic feature. This gives us:

(G1, completed) EAT - @absorb [destructive, method]
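
The notation above can be read as a small data structure: a word owns a set of concept views, and each view carries only the properties that are relevant from that word. The following Python fragment is a minimal sketch under that reading, not the author's implementation; the class names `ConceptView` and `WordDescription` and the method `notation` are hypothetical, and the empty property list for *engulf is an assumption made only to show the "difference of exposure".

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ConceptView:
    """A partial view on a concept, as seen from one particular word."""
    concept: str                                  # e.g. "@absorb"
    properties: List[str] = field(default_factory=list)


@dataclass
class WordDescription:
    """A word together with its concept views (hypothetical structure)."""
    word: str
    views: Dict[str, ConceptView] = field(default_factory=dict)

    def add_property(self, concept: str, prop: str) -> None:
        # Create the view on first use, then record the property once.
        view = self.views.setdefault(concept, ConceptView(concept))
        if prop not in view.properties:
            view.properties.append(prop)

    def notation(self, concept: str) -> str:
        # Render the view in the style of (G1): WORD - @concept [props]
        view = self.views[concept]
        return f"{self.word.upper()} - {concept} [{', '.join(view.properties)}]"


eat = WordDescription("eat")
eat.add_property("@absorb", "destructive")   # motivated by sentences (5)-(6)
eat.add_property("@absorb", "method")        # motivated by sentences (7)-(8)

# Assumption: *engulf views @absorb without the "destructive" property.
engulf = WordDescription("engulf")
engulf.views["@absorb"] = ConceptView("@absorb")

print(eat.notation("@absorb"))      # EAT - @absorb [destructive, method]
print(engulf.notation("@absorb"))   # ENGULF - @absorb []
```

The same concept label thus yields two different descriptions depending on the word from which it is viewed, which is the point made by the comparison between *eat and *engulf.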
A rapid survey of corpora on the use of "eat" could give us other descriptions of concept views, over @threaten, @trouble... But the process is similar to the preceding one.

Remark 3: let us notice that the known metaphors (to eat like a bird, to eat like a horse, to eat out of one's hand...) are not registered as such in this representation, although it is possible to interpret them through this process. The result would be an interpretation with a "degraded" value (in the case of distant metaphors), but "something of the meaning" would nevertheless be picked up (see conventional metaphoric lexicon features in [Martin 1991]).

2.2. The requirements for a discrete computational approach to polysemy

To put the preceding paragraphs in a nutshell: if we still stick to a discrete approach based on logic, we should construct our model for polysemy interpretation in a way which matches the following requirements.

(j) Although discrete, the system has to 'simulate' continuity sufficiently to express the apparent topology of the set of meanings associable with a polysemous word. Simulation has to replace a fine-grained composition, so that a combinatorial explosion can be circumvented (computer processing constraints). Therefore, if a dynamic recomposition of a small number of properties can be validated as an approximate idea of semantic continuity, then it can be considered satisfactory from the processing point of view.

(k) The system has to rely on discriminating information drawn from the surrounding linguistic structure. This information could be of morpho-syntactic, semantic or pragmatic origin. It could be associated with words as much as with larger linguistic units (phrases, sentences).

2.3. The EDGAR¹ model: a possible solution considering constraints and goals

The model we have tried to promote is based on the recomposition principle as an approach to polysemy understanding (Prince 1991).
It attempts to fulfill the conditions of proposition (d) (partial views on concepts, determined by their participation in the meaning of the word W, and relevant to the meanings associated with W by current usage and environment) by providing the following representation:

(R1) A word is described by a semantic potential of recognized concept views, themselves described by means of their properties relevant to the traditional significations associated with that word; some constraints over these properties are also welcome for semantic determination (application of propositions (e), (f), (g) mentioned in § 2.1.3).

(R2) As an interpretation aid, specific interpretation rules (we call them pragmatic rules) help trigger the interpretation process. Their premises are instantiable by context information (the latter defined as in proposition [k]); their actions give an "activity value" to the properties of the semantic potential relevant to these premises (application of part of proposition [g]).

(R3) The propagation of activity, adopted from the neuromimetic 'activation spreading' trend, is defined by means of default inheritance rules. Their actions aim at assigning a value to every element of the semantic potential which is not interpretable. An element is interpretable if it is a word: concept views and property labels are not considered as being interpretable (application of the other part of proposition [g]).

(R4) Interpretation is a process that associates a given description (semantic potential) with an image depending on its context (proposition [h]). Therefore interpretation is a function of two things: a potential plus its inference rules, and a context. The produced image has to be a "set image", because an atomic result does not account for continuity (proposition [j]). Thus the produced image can be seen as a list of pairs (x, y) where x belongs to the potential of a word W, and y is an "activity value" attributed to x either by the action of a specific rule triggered by context, or by the action of an inheritance rule propagating values.

1) Entry Driven Graph for Ambiguity Representation.

2.3.1. Formalization of the lexical model

In EDGAR terminology, a word is called an entry, because it is an input to the processor. Let $e$ be the name of an entry. Let $A_e$ be the set of the descriptive elements of $e$ (the semantic potential). Let $P_e$ be the set of inference rules associated with the potential of $e$. We will call $n(e)$ the EDGAR representation of $e$: $n(e) = (A_e, P_e)$.
Definition of the description: $A_e = A_{e1} \cup A_{e2}$, supplied with the predicate $B_e$, defined as follows:

$A_{e1}$ = set of concept views; $A_{e2}$ = set of relevant properties;

$B_e : A_{e1} \times A_{e2} \to \{\text{True}, \text{False}\}$, $(x, y) \mapsto B_e(x, y) = \text{True}$ if $y$ is a relevant property of $x$ for $e$.

Definition of inference rules: $V = \{\text{salient}, \text{inhibited}, \text{accompanying}, \text{ignored}\}$ is the set of possible "activity values" for a non-interpretable element.
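
The description part defined so far can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than EDGAR's actual code: the class name `Description`, its attribute names and the enum `Activity` are hypothetical, and the sample entry reuses the concept views and properties discussed earlier for *eat.

```python
from enum import Enum
from typing import FrozenSet, Iterable, Tuple


class Activity(Enum):
    """The value set V for non-interpretable elements."""
    SALIENT = "salient"
    INHIBITED = "inhibited"
    ACCOMPANYING = "accompanying"
    IGNORED = "ignored"


class Description:
    """A_e = A_e1 U A_e2 together with the predicate B_e (hypothetical names)."""

    def __init__(self,
                 concept_views: Iterable[str],
                 properties: Iterable[str],
                 relevant_pairs: Iterable[Tuple[str, str]]) -> None:
        self.concept_views: FrozenSet[str] = frozenset(concept_views)  # A_e1
        self.properties: FrozenSet[str] = frozenset(properties)        # A_e2
        self._relevant = frozenset(relevant_pairs)
        # B_e is only defined on A_e1 x A_e2, so reject anything else.
        for view, prop in self._relevant:
            if view not in self.concept_views or prop not in self.properties:
                raise ValueError(f"pair {(view, prop)!r} is outside A_e1 x A_e2")

    @property
    def elements(self) -> FrozenSet[str]:
        """The whole semantic potential A_e."""
        return self.concept_views | self.properties

    def B(self, view: str, prop: str) -> bool:
        """True if prop is a relevant property of view for this entry."""
        return (view, prop) in self._relevant


# Sample entry built from the discussion of *eat in section 2.1.3.
eat = Description(
    concept_views={"@absorb", "@feed-on", "@trouble", "@threaten"},
    properties={"destructive", "method"},
    relevant_pairs={("@absorb", "destructive"), ("@absorb", "method")},
)

print(eat.B("@absorb", "destructive"))   # True
print(eat.B("@feed-on", "destructive"))  # False
print(len(eat.elements))                 # 6
```

Storing $B_e$ as a set of pairs keeps the predicate total on $A_{e1} \times A_{e2}$: any pair not listed is simply False.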
$P_e = P_{e1} \cup P_{e2} \cup P_1$, supplied with the predicate $p$ on $V$:

$p : A_e \times V \to \{\text{True}, \text{False}\}$, $(x, v) \mapsto p(x, v) = \text{True}$ if $x$ takes the value $v$ by means of a rule of $P_e$.

Specific or pragmatic rules: let $\Pi_e$ be the set of premises of the pragmatic rules known about $e$, and let $P_{e1}$ be the set of these pragmatic rules.

$P_{e1} = \{\, [\varepsilon_j \to p(a_i, v_i)],\ \varepsilon_j \in \Pi_e,\ a_i \in A_e,\ v_i \in V \,\}$

Constraints on properties (integrity rules in EDGAR terminology):

$P_{e2} = \{\, [p(a_i, v_i) \to p(b_j, w_j)],\ a_i, b_j \in A_e,\ v_i, w_j \in V \,\}$

Default inheritance rules:
2.3.2. Formalization of the interpretation function

Let $K$ be the set defining contextual information. Let $\Pi_e$ be the set of pragmatic rule premises associated with the entry $e$. Let $N$ be the set of entry representations constituting the lexical knowledge base; the elements of $N$ are the $n(e)$ representations. The interpretation function $f$ is defined as:

$(k, n(e)) \mapsto f(k, n(e)) = C(e, k)$

where $C(e, k)$ is defined as:

$C(e, k) = \{\, (x, v),\ x \in A_e,\ v \in V \,\}$

corresponding to the pairs (concept or property, value) resulting from the instantiation by $k$, and the application on $A_e$, of the rules in $P_e$. It is written as a theory $A = (W, P)$ where:

$X = \{\, p(x, \text{salient}) \wedge p(x, \text{inhibited}) \to p(x, \text{valid}),\ \ \forall v \in V\ [\, p(x, \text{ignored}) \wedge p(x, v) \to p(x, \text{ignored}) \,] \,\}$ is a set of consistency rules;
$W = \left( \bigcup_e (P_{e1} \cup P_{e2}) \right) \cup X$ is the set of first-order formulas of the system (all integrity and specific rules of all items, plus the consistency rules); $P$ is the set of normal defaults written above.

2.4. Possibilities and limits of the EDGAR model

Among the advantages of the EDGAR model, let us notice that it tries, as much as possible, to reduce computational effort in terms of both space and processing time. Representations are restricted to the smallest level below which the representation would no longer provide proper semantics. Inference rules are distributed over three types: the specific ones are not reducible, and therefore have to be registered, but we have attempted to shrink them to their thinnest shape. Integrity constraints are few, and look like an interesting means of reducing the number of specific rules. Inheritance rules are general, and therefore always applicable, but written only once.

On the other hand, the calculus is not performed by a complex algorithm. Specific rules trigger the process, which is continued by the propagation of values through constraints and then by default propagation, until all elements of the potential have been evaluated.

In terms of polysemy resolution, this model is able to offer an interpretation even if context is "hostile", which means that context information is scarce. The quality of the interpretation depends on how high this hostility is. But then, human beings have to cope with the same alterations in their interpretation of polysemy when its ambiguity is enhanced by a vague or obscure discourse.

The C(e,k) configuration obtained as a result of the function f indicates the scope of interpretation: salient elements hint at the most favoured concepts, without dealing with the question of their integrity, and accompanying elements hint at implicit knowledge. Both are useful for polysemy interpretation.
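
The three-stage calculus just described (specific rules, then constraint propagation, then default propagation) can be sketched as follows. This is only an illustrative approximation: the function name `interpret`, the rule encodings, the sample premises and rules for *eat, and above all the default stage are assumptions. The default inheritance rules of the model are not reproduced in the text, so the sketch simply gives every element still unvalued a fixed default value, and it keeps the first value assigned to an element instead of applying the consistency rules in X.

```python
from typing import Dict, Iterable, List, Set, Tuple

SALIENT, INHIBITED = "salient", "inhibited"
ACCOMPANYING, IGNORED = "accompanying", "ignored"

# A pragmatic rule: premise -> p(element, value)
PragmaticRule = Tuple[str, Tuple[str, str]]
# An integrity rule: p(a, v) -> p(b, w)
IntegrityRule = Tuple[Tuple[str, str], Tuple[str, str]]


def interpret(context: Set[str],
              elements: Iterable[str],
              pragmatic_rules: List[PragmaticRule],
              integrity_rules: List[IntegrityRule],
              default_value: str = ACCOMPANYING) -> Dict[str, str]:
    """Return a configuration C(e, k) as a mapping element -> activity value."""
    config: Dict[str, str] = {}

    # Stage 1: pragmatic rules whose premise is found in the context fire.
    for premise, (element, value) in pragmatic_rules:
        if premise in context:
            config.setdefault(element, value)   # assumption: first value wins
    if not config:
        return {}   # choice made here: no matching premise, no image

    # Stage 2: propagate values through integrity rules until nothing changes.
    changed = True
    while changed:
        changed = False
        for (a, v), (b, w) in integrity_rules:
            if config.get(a) == v and b not in config:
                config[b] = w
                changed = True

    # Stage 3 (assumed default): every element still unvalued gets a default.
    for element in elements:
        config.setdefault(element, default_value)
    return config


# Hypothetical entry for *eat, loosely inspired by sentence (6).
elements = ["@absorb", "@feed-on", "destructive", "method"]
pragmatic_rules = [
    ("subject_is_corrosive", ("destructive", SALIENT)),
    ("object_is_food", ("@feed-on", SALIENT)),
]
integrity_rules = [
    (("destructive", SALIENT), ("@feed-on", INHIBITED)),
]

image = interpret({"subject_is_corrosive"}, elements,
                  pragmatic_rules, integrity_rules)
for element in sorted(image):
    print(element, image[element])
```

The result is a set of (element, value) pairs rather than a single sense, which is what allows salient and accompanying elements to be read off the same configuration.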
We feel that the limits of conceptual semantics reside in the fact that it cannot interpret words except through these salient concepts, and only when their integrity is not threatened. What we offer is a more qualified picture, although it is able to cooperate with a conceptual frame: it appears to us as a subtle "constraint relaxation" process on a conceptual representation.

Nevertheless, our system has limits of its own, which we are trying to push back. First among them are the limitations of the interpretation function which, as defined here, is unable to produce an image when context and the premises of pragmatic rules do not match, at least partly. We are mending this situation by providing a "mimicry behaviour interpretation" which supplies substitute premises from resembling words (Prince & Bally-Ispas 1991). Second, this model has not taken into account interferences between several polysemous words within a sentence.
We plan to provide it with a memory which will modify the possible interpretations of the next encountered words by using what has been found earlier (we plan to use Pustejovsky's projective inheritance). Third, this system has difficulty interpreting metaphors which are no longer lexical but which involve big chunks of discourse. This has to be dealt with within a general framework for metaphor resolution by means of Artificial Intelligence: some of James Martin's achievements in that domain will lead us to also record some conventional 'moves' in terms of metaphoric behaviour.

As a conclusion, we feel that the natural semantic wealth embedded in words must not drive researchers in Artificial Intelligence into looking at it as a bunch of "problems", but rather as an economical means of communicating meaning (discussion provided in [Picoche 1989]). Until now, ambiguity has been considered as an impediment to comprehension and hence polysemy has not only been put on the same shelf, but evaluated as a true nuisance whenever it has mingled with interpretation processes. Yet polysemy could be seen either as a shortcut to communication between individuals (even between Man and machine) or as an enhancement of a particular communicative act, by forcing associations within the mind of the reader (or listener). Thus, polysemy is a useful characteristic of natural language, as are ellipses, anaphora, or other economy phenomena.

But this attitude will lead researchers to review their position toward polysemy in the natural language understanding (and generation) field. They will have to look for sophisticated models that take polysemy into account as a source of information on the utterer's intentions. We intuitively feel that models considering paraphrase and polysemy as dual phenomena (Fuchs 1982) (Fuchs 1988), whether associative (neuromimetic) or discrete (logic), are probably the best tracks toward a unified AI model for "really natural" language understanding.
REFERENCES

Brachman, R. 1979. On the Epistemological Status of Semantic Networks. In N. Findler (ed.), Associative Networks: Representation and Use of Knowledge by Computers, New York: Academic Press, pp. 3-50.
Cottrell, G.W. 1985. A connectionist approach to word sense disambiguation. Doctoral dissertation, University of Rochester.
Culioli, A. 1986. Stabilité et déformabilité en linguistique. In Etudes de lettres, Langage et Connaissances, Université de Lausanne, pp. 3-10.
Fuchs, C. 1982. La Paraphrase, Paris: Presses Universitaires de France.
Fuchs, C. 1988. Représentation linguistique de la polysémie grammaticale. In T.A. Informations, vol. 29, n° 1/2, pp. 7-20.
Greimas, A.J. 1966. Sémantique Structurale, Paris: Larousse.
Hirst, G. 1987. Semantic interpretation and the resolution of ambiguity, Massachusetts, Cambridge: University Press.
Hobbs, J.R. 1983. Metaphor Interpretation as Selective Inferencing: Cognitive Processes in Understanding Metaphor. Empirical Studies in the Arts, vol. 1, n° 1, pp. 17-34, & n° 2, pp. 125-142.
Hobbs, J.R.; P. Martin. 1987. Local Pragmatics. Proceedings of the 10th International Joint Conference on Artificial Intelligence (IJCAI-87), Milan, Italy, pp. 520-523.
Kayser, D. 1990. Truth and the interpretation of Natural Language: A non monotonic approach to variable depth. In ECAI-90 proceedings, Stockholm, Sweden, pp. 392-394.
Lakoff, G.; M. Johnson. 1980. Metaphors we live by, Chicago University Press.
Le Ny, J.F. 1979. Sémantique Psychologique, Paris: Presses Universitaires de France.
Martin, J.H. 1990. A Computational Model of Metaphor Interpretation, New York: Academic Press.
Martin, J.H. 1991. MetaBank: A Knowledge Base of Metaphoric Language Conventions. Proceedings of the IJCAI Workshop on Computational Approaches to Non-Literal Language, Sydney, Australia, pp. 74-82.
Martin, R. 1983. Pour une logique du sens, Paris: Presses Universitaires de France.
Mel'chuk, I. 1984. Dictionnaire Explicatif et Combinatoire du Français Contemporain, Montréal: Presses de l'Université.
Ortony, A. 1979. Beyond literal similarity. In The Psychological Review, vol. 86, n° 3, pp. 161-180.
Pagès, R. 1987. Ambivalence et Ambiversion. Actes du colloque Ambiguïté et Paraphrase. C. Fuchs (ed.), Presses de l'Université de Caen, France.
Picoche, J. 1989. Polysémie n'est pas ambiguïté. Cahiers de Praxématique, n° 12. Université Paul Valéry, Montpellier, France, pp. 75-89.
Pottier, B. 1963. Recherches sur l'analyse sémantique en linguistique et traduction automatique. Publications de la Faculté des Lettres et Sciences Humaines de Nancy, série A. Nancy.
Prince, V. 1991. GLACE : un système d'aide à la compréhension des éléments lexicaux inducteurs d'ambiguïtés. Proceedings of RFIA91, vol. 2. AFCET. Lyon, France, pp. 591-601.
Prince, V.; R. Bally-Ispas. 1991. Un algorithme pour le transfert de règles pragmatiques dans le processus complexe GLACE. Document Interne du LIMSI n° 91-17.
Pustejovsky, J. 1989. Current Issues in Computational Lexical Semantics. Proceedings of the 4th Conference of the European Chapter of the ACL, Manchester, England, pp. xvii-xxv.
Pustejovsky, J. 1991. The Generative Lexicon. Computational Linguistics, vol. 17, n° 4, pp. 409-442.
Rastier, F. 1985. L'isotopie sémantique, du mot au texte. Thèse de doctorat ès Lettres, Université de Paris IV.
Rastier, F. 1987. Sémantique Interprétative, Paris: Presses Universitaires de France.
Selman, B.; G. Hirst. 1985. A rule-based connectionist parsing system. Proceedings of the Seventh Annual Cognitive Science Society Conference, Irvine, California.
Sowa, J. 1984. Conceptual Structures: processing in mind and machine, Addison-Wesley, Reading, Massachusetts.
Stallard, D. 1987. The logical Analysis of Lexical Ambiguity. Proceedings of the 25th Annual Meeting of the ACL, Stanford University, California, pp. 179-185.
Thom, R. 1972. Stabilité structurelle et morphogenèse, Paris: Benjamins-Ediscience.
Victorri, B. 1988. Modéliser la polysémie. In T.A. Informations, vol. 29, n° 12, Paris, France, pp. 21-42.
Victorri, B.; J.P. Raysz; A. Konfe. 1989. Un modèle connexionniste de la polysémie. Actes de la conférence Neuro-Nîmes, EC2 ed.
COARSE CODING AND THE LEXICON

CATHERINE L. HARRIS
Boston University, USA
Polysemy (words' multiple senses), while a source of delight for humorists and essayists, poses descriptive and theoretical problems for students of language. Does each distinct sense of a word receive a separate entry in the mental lexicon (Miller & Johnson-Laird 1976; Ruhl 1987)? What factors make a particular sense of a word distinct enough from others that its meaning merits separate listing? What principles constrain the types of relationships among the senses of a word (Jackendoff 1983; Lakoff 1987; Deane 1988)?

An abundance of recent work has provided some provocative answers. Observing that sense differences often have syntactic ramifications, theorists have identified these senses as the ones deserving distinct lexical entries (Pinker 1989; Grimshaw 1990). Related in spirit is Ruhl's (1989) "monosemic bias." Ruhl urges theorists to propose distinct lexical entries only after attempts to find a core sense common to all uses have failed. In contrast, the "radial category" perspective suggests that a much larger range of senses of a word are represented in the mental lexicon. Principles of human categorization and conceptual metaphor are thought to structure the relationship between word meanings (Lakoff 1987). Using spatial prepositions as his example, Deane (1992) attributes regularities among the polysemes of a word to basic cognitive processes such as the human ability to have different perspectives on the same spatial relation.

This variety of opinion highlights the trade-offs between positing maximally abstract representations and enumerating the diverse senses of polysemous words. The former approach captures our intuitions about what is common across a word's uses, but seldom specifies details of the possible range of uses. The latter approach accounts for the range of uses, but obscures their commonalities.
Unfortunately, both these approaches put off for another day the problem of explaining how rules for words' contextual integration act to produce the specific interpretation we obtain on hearing words in context.

In this chapter I argue that adopting a memory-based approach to lexical representation will illuminate the tension between abstractness and specificity of representation, as well as helping with the question of contextual integration. There
are two key ideas in the memory-based approach. The first is that units larger than words (such as phrases, clauses, conventional collocations and idioms) are the primary storage unit. On this view, words are laid down in memory with their frequent left and right neighbors, and the meaning that is stored with them is the meaning of the unit as a whole, rather than the separate meanings of the individual words in the unit. This view has antecedents in Bolinger (1976), and draws heavily on concepts and examples in Fillmore (1988) and Langacker (1987). The second key idea is that this large, heterogeneous set of phrases and word combinations is not a static list, but is stored in "superpositional" or "distributed" fashion (Hinton, McClelland & Rumelhart 1986). The term from my title, "coarse coding"¹, refers to the encoding scheme in which information is represented as a pattern of activation over a pool of simple processing units which participate in the encoding of many different pieces of information.

The virtues of conceiving of the lexicon as a superpositionally stored list of phrases include the advantages noted by connectionist and neural-network researchers: prototypes emerge when similar patterns reinforce each other, irregular patterns are maintained if favored by frequency, and novel patterns can be generated or interpreted on analogy to familiar ones (McClelland, Hinton & Rumelhart 1986). I will try to show how, in addition to providing a natural representation for idioms and conventional expressions, the coarse-coding view incorporates mechanisms for both context-sensitivity and the abstraction of argument structure and subcategorization relations.

I first describe coarse-coding schemes and why they are a useful way to conceive of lexical representation. Drawing on linguistic and psycholinguistic phenomena, I motivate the view that the primary unit of linguistic storage is not the word, but some larger piece (phrase, clause or sentence).
Some aspects of the coarse-coding proposal can be illustrated with an existing simple connectionist model of prepositional polysemy (Harris 1994), although other aspects await a more thorough implementation. At that point in my story, a reader may well ask, if the organization of word and sentence meaning exquisitely reflects the statistics of the language, as I argue it does, what psychological variables constrain the statistics of the language ? My view is that factors related to language processing and communicative function are the ultimate shapers. Following researchers in the grammaticalization framework (Meillet 1958 ; Lehmann 1985 ; Givon 1989), I characterize speakers' communicative needs as a trade-off between the need to
1) In this chapter, I will use the term coarse coding rather than distributed representation to emphasize that coding schemes may vary in their coarseness. A coding scheme which contains some material that is relatively localized and other material that is distributed can still be called a coarse coding scheme.
COARSE CODING
minimize processing costs while maximizing communicative impact. Polysemy figures in this equation because polysemy boosts the usage frequency of a word, which drives down the cost of lexical access. But extending a word into varied semantic contexts semantically bleaches it, which decreases its communicative impact. To achieve maximal impact, speakers reach for fresh words (Lehmann 1985). The historically observed cycle of recruitment of a new item, increasing semantic extension, and subsequent phonological reduction and ultimate use as a grammatical morpheme (Sweetser 1990) suggests that an encoding scheme which is inherently continuous will serve us well in understanding both synchronic and diachronic variation in words' form-meaning mappings.

Coarse Coding

In a coarse-coding scheme, the representational units do not match the information to be represented (e.g., "concepts") in a one-to-one fashion. Instead, each unit is active in representing a number of concepts, and a concept is represented by a number of simultaneously active units. Distributed representations promote generalization (McClelland & Kawamoto 1986 ; St. John & McClelland 1988 ; Harris 1990) and exhibit graceful degradation (if one unit is destroyed, no single pattern is destroyed, although several patterns might be slightly degraded ; Hinton & Shallice 1991). An additional computational advantage is representational efficiency (Hinton, Rumelhart & McClelland 1986). In a localist encoding, N units can represent at most N concepts. With coarse coding, a concept is represented by the joint activity in a number of units. The number of concepts that can be represented increases as the number of units that are simultaneously active increases (as long as each unit is active for several different concepts ; Touretzky & Hinton 1988).
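The efficiency argument can be made concrete with a small counting sketch. It assumes, purely for illustration, that every concept is coded by exactly k simultaneously active units out of N, so that the number of distinct patterns is a binomial coefficient. The function name `pattern_capacity` and the fixed-k assumption are illustrative and are not taken from the cited models; the count is an upper bound that ignores whether the patterns remain discriminable in practice.

```python
from math import comb


def pattern_capacity(n_units: int, n_active: int) -> int:
    """Upper bound on distinct codes when exactly n_active of n_units are on.

    n_active = 1 is the localist case: N units, at most N concepts.
    """
    return comb(n_units, n_active)


if __name__ == "__main__":
    n = 10  # hypothetical pool size
    for k in (1, 2, 3):
        print(f"{k} active unit(s) out of {n}: {pattern_capacity(n, k)} patterns")
```

With one active unit the sketch reproduces the localist limit of N concepts; allowing several units to be active at once raises the bound, which is the sense in which coarse coding is more efficient.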
Coarse Coding and Locating Visual Features

One way to get a feel for how coarse coding leads to greater representational efficiency is to work through the visual processing example presented by Hinton, Rumelhart and McClelland (1986). The following four ideas are important to their example.

1. Receptive Field. The receptive field of a neuron in visual cortex is the area of the visual field to which the neuron is sensitive. The neuron becomes active if there is movement or change within this field.

2. Diameter and Overlap of Receptive Fields. In visual cortex, individual neurons often have large receptive fields which overlap considerably with those of other neurons. The location of a feature in the visual field is accurately pinpointed when it falls within the receptive fields of a number of neurons. The joint activity of several neurons indicates that the feature is located at the intersection of the active units' receptive fields.
CATHERINE L. HARRIS
3. Accuracy Increases With Receptive Field Diameter. If there is no overlap in receptive fields, then we have a localist encoding rather than a distributed one. We would say that the grain size of our coding scheme is fine, rather than coarse. No overlap means that single neurons are solely responsible for identifying the location of discriminable stimuli. If N processing units do not overlap, then N distinct locations in the visual field can be identified. But if we double the radius of a receptive field, then the fields of our N neurons will overlap, and we double the number of different locations that can be discriminated (assuming that each addition of an active neuron leads to a discernibly different network state).

4. Coarse Coding Only Efficient if Features are Sparse. Hinton et al. point out that, if two or more stimuli in close proximity are to be distinguished, then coarse coding will hinder more than help : several processing units will become active in response to more than one stimulus. In this case, a finer-grained coding scheme is needed, perhaps even a localist encoding.

Coarse Coding and the Lexicon

In the visual-field example above, the receptive field of a neuron in visual cortex is the set of simpler neurons in the retinotopic map. For the word-meaning example I will develop here, I will refer to "processing units" instead of "neurons". The receptive field of these processing units is a field of simpler units. Concepts or meanings are patterns of activation across a pool of units. An individual simple unit does not have a distinct or determinable meaning. My main proposal is that a word is akin to a processing unit with a receptive field that may vary in size and in the degree to which it overlaps with the receptive fields of other words. On this metaphor, polysemous words have wide receptive fields, and thus cover large (and perhaps ill-defined) areas of semantic space.
A distinct meaning (i.e., a small region of multi-dimensional semantic space) is identified when several words, or words plus aspects of the non-linguistic context, combine to narrow down the space of possible meanings. On this interpretation, words do not encode one abstract meaning, nor are they pointers to a list of several specific meanings. Instead, the mapping from sound to meaning is mediated by a coding scheme which varies in its coarseness. On different occasions of use, words communicate different pieces of information. Unambiguous pieces of information are usually communicated by the joint presence of several words. Coarse coding is an efficient representational scheme because, holding the number of lexical items constant, a greater number of specific ideas can be communicated. For example, one could have a separate word for all the ways that an agent can act on an object using a sharp instrument, or one can have the single word cut. A specific intended meaning is pinpointed by conventional verb + particle combinations.
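The receptive-field metaphor can be sketched as set intersection: each form covers a broad set of candidate senses, and a combination of forms picks out the region where their fields overlap. The sense labels for cut echo the categories used for the cut utterances in this chapter, but the fields assigned to the particles, the dictionary `FIELDS`, and the function `narrow` are invented for illustration and make no claim about the actual semantic space.

```python
# Hypothetical receptive fields: each form covers a set of coarse sense labels.
# The particle entries are invented for illustration only.
FIELDS = {
    "cut": {"penetrate", "eliminate", "section", "sever connection",
            "reduce", "shape"},
    "up": {"section", "increase", "complete"},
    "down": {"reduce", "lower position"},
    "off": {"eliminate", "sever connection", "detach"},
}


def narrow(words):
    """Intersect the fields of all words: the region they jointly pick out."""
    fields = [FIELDS[w] for w in words]
    return set.intersection(*fields)


if __name__ == "__main__":
    for combo in (["cut"], ["cut", "up"], ["cut", "down"], ["cut", "off"]):
        print(" ".join(combo), "->", sorted(narrow(combo)))
```

In this toy setting cut alone leaves six candidate senses, cut up and cut down each leave one, and cut off still leaves two, so further context would be needed to settle it.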
The traditional view of the advantage of stringing words together into larger units is that the individual items are the primitive building blocks of more complex ideas. The coarse-coding view suggests that multi-word compositions are used not only to construct a meaning that is more complex than any of its parts, but to pinpoint the concepts which are the intended building blocks.

Limits to Linguistic Compositionality

Three motivations for the coarse-coding view are the difficulty of specifying the building blocks of meaning construction, words' contextual stickiness, and our intuitions that highly polysemous words do not impose a burden on comprehension.

Is the word the building block ?

By "word" I refer to our folk-concept of a coherent phonological entity. This folk-concept has been concretized in our orthographic systems and legitimized through dictionaries and cultural scripts on how to talk about meaning and intention (Reddy 1979). What remains unclear is whether words have distinct, individuated meanings that are discretely represented in some kind of mental structure such as the hypothetical mental lexicon.

It is now widely recognized that the meanings¹ of most natural language utterances are not obtained by concatenating the meanings of component words (Miller & Johnson-Laird 1976 ; Lakoff & Johnson 1979 ; Brugman 1988 ; Pinker 1989 ; Pustejovsky 1992). Despite this widespread agreement, many theorists continue to regard words as the building blocks of meaningful communication. It is generally assumed by lexical theorists (e.g., Pinker 1989 ; Miller & Fellbaum 1991) that words are privileged in at least two ways :

1. The form (either sound or orthography) of a word is associated with a data structure that is the primary storage site for linguistic meaning.

2. The form of a word is the entry-point into the representational system.
These two factors do not logically have to co-occur, and indeed we can imagine a representational and access system in which neither is true, or true only to a degree which may vary from word to word. Researchers who acknowledge the ubiquity of polysemy may find congenial the perspective illustrated in
1) Some theorists consider the meaning of an utterance to be all evoked mental conceptualization (Langacker 1987 ; Lakoff 1987 ; Deane 1992), while others identify linguistic meaning with a subset of this (Pinker 1989). Although my own bias is towards the former view, taking a stance on this point is not necessary for the current discussion of limitations on compositionality.
Figure 1A : the word is the entry-point into the system, but words activate representational structures that correspond to phonological units larger than a word.
Figure 1. Left-hand side lists language inputs, right-hand side "meanings". A. Illustrates the popular proposal that the phonological word is the unit around which meanings are mentally represented. Lexical items (cut, up, down, out) are thought to activate meaning structures. For polysemous words, multiple meanings are activated. In this illustration, only four of these are listed for cut. For up, down and out, only the sense that is typically meaningful in conjunction with cut is listed, but according to this proposal, all other meanings of these words would be listed here. B. It is proposed that frequent word combinations are recognized as units by the language comprehension system and that combinations, such as cut + particle, directly activate their conventional meanings. In addition, direct objects of cut that have certain properties, such as being an object with salient parts, or being a scale amount (e.g., cut costs) can combine with cut to directly activate a conventionalized sense such as "reduce in amount".

The need for distinct meaning-representations that correspond to word combinations rather than words may be clearest to some readers for idioms such as
Shut up and out of sight, yet it is necessary for handling many types of valence-match combinations, such as verb + particle (as in write off) or verb + high-valence-matched noun phrase (such as open the door)¹. Once we accept that linguistic concepts are represented in meaning-chunks that correspond to language units larger than the word, it is only a short conceptual leap to the view expressed by Figure 1B, wherein the word is no longer the privileged entry-point, but 2-, 3-, or 4-word combinations may be the size of unit that either initiates or achieves lexical access.

On this view, there is a continuum of context-independence, with some words tightly associated with their typical neighbors, and others relatively independent. But what accounts for the enduring appeal of the notion that words are a privileged unit of mental representation ? I suggest that words are privileged, not because of special ontological status, but because the word is the size of unit which maximizes a trade-off between frequency of usage and constancy of interpretation.

To explain what I mean by a trade-off between usage frequency and constancy of meaning, I will recruit some data from my ongoing study of the polysemies of the word cut. I first investigated how many left-and-right context words were required for native speakers to identify the intended sense of cut. All instances of cut (nouns, verbs and adjectives) from the Brown University corpus (Francis & Kucera 1989) and the Lancaster University corpus were extracted in a manner that preserved five words of left context and five words of right context, to yield 11-word discourse fragments, with cut being the central word. Two 22-year-old native English speakers were given 15 minutes of training on how to categorize cut utterances into a classification system of cut senses similar to that described in Harris & Touretzky (1991). Some examples of the classifications made by raters are listed in Table 1.
The two raters each judged the sense of 231 utterances in four separate sessions that took 40 minutes each to complete. Raters sat in front of a Macintosh which controlled stimulus display and stored reaction times. Raters saw first a three-word utterance in which cut was the central word. After making a judgement of what sense of cut it was, they indicated their degree of confidence in this judgement by hitting a key for either "guess", "some confidence" or "know for sure".
1) The term "valence-match" refers to word combinations having a close semantic fit between a predicate and arguments (Brugman 1988 ; MacWhinney 1989). Examples include subcategorization and selectional restrictions, as well as matches between the semantics of prepositions and their direct objects, such as in the cupboard and over the hill.
At this point the computer presented an additional right and left neighbor, and raters again selected a sense and gave their confidence rating. For each utterance there were a total of 5 increments of context to make up the 11-word discourse fragment, and thus 5 sense judgements for each utterance. Figure 2 shows the percent of utterances that were rated as either "know for sure" or "some confidence" with each increment of context. The absolute number of each type of judgement at the various increments obviously depends on task demands, such as the pressure raters may have felt to say "guess" at the 3-word and 5-word fragments to avoid the embarrassment of later reversals of judgement. Nevertheless, what is important is that there is no fixed amount of context necessary for determining the sense of cut. Instead, we have a continuum of contextual dependency.
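The incremental presentation can be sketched as a function that grows a window symmetrically around the central word. This is only a reconstruction of the procedure as described: the function name `context_windows` and the whitespace tokenization are assumptions, and the example fragment is one of the rows of Table 1.

```python
def context_windows(tokens, center, max_width=5):
    """Return fragments with 1..max_width neighbors on each side of tokens[center]."""
    windows = []
    for k in range(1, max_width + 1):
        lo = max(0, center - k)                 # clip at the start of the fragment
        hi = min(len(tokens), center + k + 1)   # clip at the end of the fragment
        windows.append(tokens[lo:hi])
    return windows


# An 11-word discourse fragment with cut as the central word (index 5).
fragment = "but you don't look exactly cut out for this life. Still".split()

if __name__ == "__main__":
    for window in context_windows(fragment, fragment.index("cut")):
        print(len(window), " ".join(window))
```

Each successive window adds one left and one right neighbor, giving the five fragment sizes (3, 5, 7, 9 and 11 words) at which raters made a sense judgement.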
[Figure 2 plot ; horizontal axis : Number of Words in Sentence Fragment]
Figure 2. Raters successively judged the sense of 231 cut utterances with just one left-and-right neighbor (meaning that 3 words were in the sentence fragment), or with 2, 3, 4 or 5 left-and-right context words. At each increment they rated the confidence of their sense selection as either "guess", "know with some confidence" or "know it for sure". In 97 % of fragments with the maximal amount of context (5 left-and-right context words), raters estimated they either knew the sense for sure, or knew it with "some confidence".
Although the linguistic unit we call the "word" cannot be accorded building-block status on the basis of constancy of meaning, factors such as frequency may make it the "right-sized" unit for mental manipulation and mental representation. Intuitively, the larger the unit, the less frequently the entire unit will appear in spoken and written texts. While whole sentences do repeat themselves in the ambient language (especially colloquialisms or other fixed expressions such as Easy does it !), they repeat themselves far less frequently than word combinations or single words.

Table 1
Discourse Fragment | Sense Selected
jar lids, omitting design disk. Cut a notch in lid for | penetrate
He waved at Fox to cut off the finale introduction. The | eliminate
were older two-story mansions, now cut up into furnished rooms and | section
Wars an Austrian threat to cut off supplies of coal to | sever connection
by the rotors. This was cut down to a minimum by | reduce
but you don't look exactly cut out for this life. Still | shape/formed
A comparison of the frequency of cut to the frequency of cut in combinations is illustrated in Figure 3. Cut appears 208 times in the Brown corpus. The most frequent cut combinations of size 2 include cut the (17 occurrences), cut off (16 occurrences), cut down (12 occurrences), cut in (9), cut his (9), cut across (8), cut to (7), cut through (6), cut it (6), cut up (5), cut into (4), cut from (4) and cut over (2). Mean frequencies were calculated for all occurrences of cut combinations of sizes 2, 3, and 4. The "Log Frequency" curve represents the frequency, on a logarithmic scale, of the single word cut (log of 208 = 5.33) and the mean frequencies of cut combinations of size 2, 3, and 4. Superimposing the Log Frequency curve over the curve from Figure 2 illustrates the idea that a unit that is about the size of either the word or a valence-match combination may gain special representation or access status due to an optimal interaction between frequency and constancy of meaning.

Words' contextual stickiness

Language acquisition researchers have noted that children usually first acquire words in one context of use, such as only saying bye bye ! when guests drive away in a car, or first restricting the word deep to describing puddles and only later understanding it to be applicable to swimming pools (Clark, 1983). Children also often learn a whole phrase as one unit, only later having the ability to use the parts out of their original linguistic context, as in the
demand many children can make at 15 months of age, Iwandat ! (Bates, Bretherton & Snyder 1989). With time and linguistic practice, words do of course unstick from their original linguistic and extralinguistic environments, but it is likely that many words never entirely "unstick". This is most clearly seen with low frequency words such as paragon, a word whose meaning may be retrievable to some speakers only in its typical linguistic context, paragon of virtue.
[Figure 3 plot ; horizontal axis : Size of Fragment (right context only)]
Figure 3. The single word cut appears 208 times in the Brown corpus. The most frequent cut collocations of size 2 include cut the (17 occurrences), cut off (16 occurrences), cut down (12 occurrences), cut in (9), cut his (9), cut across (8), cut to (7), cut through (6), cut it (6), cut up (5), cut into (4), cut from (4), cut over (2). Mean frequencies were calculated for all occurrences of cut collocations of sizes 2, 3 and 4. The "Log Frequency" curve represents the frequency, on a logarithmic scale, of the single word cut (208 times in the Brown corpus) and the mean frequencies of cut collocations of size 2, 3 and 4. This curve was superimposed over the curve representing the amount of context necessary to be sure of the sense of cut, in order to illustrate that while the word is not the optimal size of chunk for determining sense, it is more nearly optimal in terms of frequency of access.
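The frequency side of the trade-off can be sketched with a simple n-gram count over right context. The snippet below is a minimal illustration under stated assumptions: the token list `toy` is an invented miniature text rather than the Brown corpus, and the helper names are hypothetical. The reported value "log of 208 = 5.33" is consistent with a natural logarithm (ln 208 is approximately 5.3375), so `math.log` is used here.

```python
import math
from collections import Counter


def right_context_ngrams(tokens, target, n):
    """Count n-grams that start with `target` (n=1 counts the bare word)."""
    return Counter(
        tuple(tokens[i:i + n])
        for i in range(len(tokens) - n + 1)
        if tokens[i] == target
    )


def mean_frequency(counts):
    """Mean number of occurrences per distinct n-gram type."""
    return sum(counts.values()) / len(counts)


# Invented miniature text, standing in for a real corpus.
toy = ("cut the rope cut off the power cut the cake "
       "cut down costs cut off supplies").split()

if __name__ == "__main__":
    for n in (1, 2, 3):
        m = mean_frequency(right_context_ngrams(toy, "cut", n))
        print(f"size {n}: mean frequency {m:.2f}, log {math.log(m):.2f}")
```

Even in this tiny text the mean frequency per combination falls as the fragment grows, which is the pattern the Log Frequency curve summarizes for the real corpus.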
A second sign of words' contextual stickiness comes from psycholinguistic evidence that words are more easily accessed and more quickly understood in conventional contexts (see Van Petten & Kutas 1991, for a review as well as relevant electrophysiological data). Models of the mental lexicon typically incorporate information about words' typical contexts of co-occurrence by positing spreading-activation links between semantically and thematically related words. It is assumed that these links are built up out of speakers' years of experience with words in diverse contexts. But how these links are obtained from experiential corpora has never been described. In the next section I suggest how the memory-based (or coarse-coding) view of the lexicon may be able to explain this.

Why don't words with many meanings, or one abstract meaning, pose a comprehension burden ?

If multiple senses of a polysemous word are represented in the lexicon, then the language listener is burdened with the task of selecting the current sense from all of those listed in the lexicon. On the other hand, if the lexicon contains only a maximally abstract encoding, along with abstract representations of allowable arguments, our challenge is to articulate the rules of contextual inference allowing the concept "steal" to be inferred from sentences such as The thief took the jewels (Jackendoff, 1982 ; Miller & Johnson-Laird, 1976). On both accounts, words which can potentially cover a large semantic territory should impose a comprehension burden, yet studies have failed to find that sentences containing these words are more difficult to understand than sentences containing words with more specific senses (Millis & Button 1989).

One explanation for this might be that the extra processing burden of matching the multiple senses of a polysemous word to that word's context is obscured by the processing advantage of being high in frequency, as the majority of polysemous words are (Gernsbacher 1984).
I agree that the high frequency of polysemous words is part of the reason for their continued use, but would like to add that in many cases (although not all), polysemy does not pose a comprehension burden because the unit that initiates lexical access includes disambiguating lexical neighbors. Cut doesn't activate all its possible senses, because the system begins lexical access with cut in, cut up, cut down or the like. (In addition, we don't have to propose additional machinery to explain how the appropriate sense of cut up is obtained from the listing of meanings for cut and up.) Many theorists recognize that the semantics of verb + particle combinations is such that these combinations may require lexical entry status. But as long observed by Fillmore (1988) and more recently pointed out by Jackendoff (1992), granting lexical entry status to verb + particle combinations will take care of these obvious cases, but does nothing for the myriad other noncompositional conventional collocations.
But my claim is more than the idea that verb + particle has the status of lexical entry. My view (following Langacker 1987) is that there is no predefined limit on what amount of phonological signal can be used to activate a stable interpretation. Instead, there are mappings from larger combinations (phrases, even sentences) to stable interpretations, with varying degrees of componentiality within the larger combinations. Frequency of occurrence, and reliability of the form-meaning mappings, are candidates for the factors that determine what parts of the speech stream come to be represented in a relatively context-free manner.

Computational realization

In what type of representational system could these ideas be computationally realized ? We desire a system with the following properties.

1. Stable (i.e. conventionalized) form-meaning mappings can exist over linguistic units that vary in size, from sub-word units (morphemes and phoneme clusters with meaning connotations, such as English umble) to multi-word combinations (including valence-match pairs, and idioms and other collocations).

2. The associations between forms and their meanings are sensitive to horizontal co-occurrence statistics as well as the variety of meanings that a word can evoke (MacDonald, 1992 ; Juliano, Trueswell & Tanenhaus 1992). Horizontal co-occurrence statistics are the frequent left and right neighbors of a word (as well as categorial abstractions over these neighbors).

I have recommended conceiving of the lexicon as a memory-based system in which an extremely large number of form-meaning associations is stored. Storing many examples of a word in its different contexts, each with its context-appropriate meanings, ensures that the system contains information about what types of contexts are paired with what types of meanings. Distributed representations give us the ability to abstract over these pairings.
If we idealize a language to be a distributed set of form-meaning pairs, we have a method for predicting what invariances will be extracted, and how the degree of specificity of an extracted invariance is related to the pool of utterances it summarizes. The extracted invariances will be those that were instrumental in learning the training corpus, and will thus be dependent on the type and token frequencies of pattern-set exemplars, and on the representational resources of the network (i.e., number of weights). These invariances will naturally include the semantic and thematic associations between words that psycholinguists have long observed in priming and reading-time experiments. In the next section, I illustrate some aspects of this proposal by describing a simple connectionist network of prepositional polysemy.
An Illustrative Model

In Harris (1994) distributed representations were used to model the mapping from a sentence containing polysemous prepositions to a representation of the sentence's meaning. Prepositional polysemy was selected as the example problem because the mapping from spatial expressions to their interpretation contains regularities which vary in their scope of application (Brugman 1988 ; Hawkins 1984). These regularities could be described by rules, although they would have to be rules that either have exceptions or have very specific conditions of application. Alternately, the regularities in mapping could be described by a constraint-satisfaction system. I took the approach that the constraints emerge from the matrix of stored utterance-meaning pairs (Langacker 1987).

Construction of corpus and training

2617 sentences of the form subject verb {over, across, through, around, above, under} object were constructed using 81 vocabulary items. The training corpus consisted of these sentences paired with hand-coded feature vectors identifying salient semantic properties of the sentence's gestalt meaning. These features included domain features (that is, in which cognitive domain does the profiled relation exist : the domain of space, of time, of money, of interpersonal power, of mental concepts), dimensionality and other salient properties of the primary figures (that is, the sentence subject and the object of the preposition, also called the figure and ground, or trajector and landmark), and type of path (curved, end-point-focus). Figure 4 depicts the network architecture, while Table 2 lists some of the sentence templates (word combinations associated with semantic features) that were used to generate the corpus.

Table 2
Sentence Templates | Types of Features Assigned
{road, fence, wires, river} stretched around {river, tunnel, road, corner, building, hill} | space curved-path 1-D static obstructing-object
snow lay across {blanket, bridge, river, tunnel, road} | space static 2-D extension
{tree, building} stood across {field, road} | space path static end-point-focus
{hiker, children, conspirators} lived around corner | space curved path 1-D static end-point focus
{hiker, children, conspirators} spent over {$100, $1000} | money below
{hiker, children, soldier, conspirators} was under captain | power below
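How templates of this kind generate utterance-meaning pairs can be sketched by taking the cross product of the subject and object slots and attaching the template's features to every resulting sentence. The two templates below are transcribed from Table 2; the dictionary layout, the function `expand`, and the way the feature strings are split into separate features are assumptions made for illustration. The sketch also keeps every subject-object combination, since the text does not say whether any were filtered out.

```python
from itertools import product

# Two templates transcribed from Table 2; the dict layout is an assumption.
TEMPLATES = [
    {
        "subjects": ["road", "fence", "wires", "river"],
        "verb": "stretched",
        "preposition": "around",
        "objects": ["river", "tunnel", "road", "corner", "building", "hill"],
        "features": ["space", "curved-path", "1-D", "static",
                     "obstructing-object"],
    },
    {
        "subjects": ["snow"],
        "verb": "lay",
        "preposition": "across",
        "objects": ["blanket", "bridge", "river", "tunnel", "road"],
        "features": ["space", "static", "2-D", "extension"],
    },
]


def expand(template):
    """Yield (sentence, features) pairs for every subject/object combination."""
    for subj, obj in product(template["subjects"], template["objects"]):
        sentence = (subj, template["verb"], template["preposition"], obj)
        yield sentence, template["features"]


corpus = [pair for t in TEMPLATES for pair in expand(t)]

if __name__ == "__main__":
    print(len(corpus), "utterance-meaning pairs")
    print(corpus[0])
```

Every sentence generated from one template shares the same feature set, so the regularities the network must learn are distributed over the word combinations rather than tied to any single word.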
A goal in constructing the corpus of utterance-meaning pairs was to include words which fall on a continuum of polysemy and which vary in the predictability of their left and right neighbors. I chose 4 prepositions which are relatively highly polysemous (over, across, through, around) and 2 that intuitively have fewer different senses (above, under). The corpus was also constructed to contain verbs that had either little polysemy (1 to 4 senses), medium polysemy (6 to 10 senses), or high polysemy (13 to 20 senses). In this corpus, for example, the verbs cost and spent had only three senses, corresponding to whether they co-occurred with the prepositions over, under or around. At the other extreme, verbs such as ran and lay occurred with diverse polysemes of all the prepositions.
Figure 4. The network was trained to associate vectors representing word combinations of the form Subject Verb Preposition Object with vectors representing salient semantic features of the meaning of these word combinations. A relatively large number of vocabulary items was included to incorporate into the model both spatial and non-spatial senses of the prepositions.

The network was trained on the corpus using back-propagation (Rumelhart, Hinton & Williams, 1986) until error asymptoted (at 25,000 cycles, roughly 10 training cycles per input-output item).

Network behavior

Category abstraction. In previous work (Hinton 1986 ; Harris 1990 and others) it has been observed that the hidden units of networks trained by back-propagation self-organize to categorize aspects of the input vector which participate in similar relations with other parts of the input, or which are paired with similar outputs. For example, some of the hidden units may evolve to have
identical activations for the items tunnel, woods and field, to capture the regularities in sentences differing only by this word, such as hiker walked through tunnel, hiker walked through woods, and hiker walked through field. Because of the large number of patterns in the corpus described above, the network can best decrease error by evolving hidden units that are selectively activated by items that participate in distributional regularities. For convenience, I'll refer to these hidden-unit organizations as categories. The categories formed by the network during training will vary in their specificity according to the demands of the regularities in the input-output patterns. For example, one of the main distinctions in verbs was whether they participated in spatial or mental relations. The mental verbs read, thought, argued and talked participated in very similar vectors and were thus categorized by the network without further subdivisions. In contrast, the spatial verbs (stood, is, lived, arrived, came, got, flew, moved, walked, ran, lay, and stretched) were similar and dissimilar to each other depending on the other words in the sentence. The verb ran behaved similarly to flew, moved, and walked (in denoting motion) when these words occurred with agentive subjects (soldier, conspirators, children, hiker, birds). But ran behaved similarly to another set of verbs (stretched, lay and was) when it occurred with non-agentive subjects such as road, river and fence. Context dependency. As just described, the network's hidden units did function as abstractions over items that fall in specific sentence positions. However, a more striking feature of the hidden-layer organization was that hidden units were always jointly activated by words in different sentence positions. 
For example, the hidden units that became strongly activated by the input node for walk were also activated both by items such as run and move as well as input items which commonly occurred with these motion verbs, such as agentive subjects, the path prepositions over, across, and prepositional objects such as hill and yard. The network appeared to take two solutions to the problem of polysemy. It created internal categories corresponding to sentence-size templates, and it evolved hidden units which conflated semantic attributes of various senses of the polysemous words with their typical contexts of occurrence.

To examine the extent to which the hidden units illustrate the "continuum of context dependency" I analyzed all weights extending from the inputs to the hidden layer in the following manner. An input unit was classified as activating a given hidden unit if the weight from the input to the hidden unit was greater than the hidden unit's bias weight. In this network, a word's degree of context-dependency is encoded by the extent to which the word activates hidden units which are also activated by its frequent left and right neighbors. To quantify this, the number of hidden units which were activated both by a preposition and each of the preposition's possible direct objects was calculated. (Keep in mind that the
activation of hidden units is akin to the network's encoding of the meaning of each word : the regularities in its co-occurrences with the semantic feature vector.) The graphs in Figure 5 plot the number of hidden units activated by both prepositions and 17 selected direct objects. We can see that items which frequently occur together (such as frequent direct objects of over and across, the items building, hill and bridge) jointly activate more hidden units than items which don't occur together, such as the non-occurring combinations over the book and across the contract. Note that above doesn't share strong encoding with any of the direct objects. This is consonant with above's relative context-independence.
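The weight analysis just described can be sketched in a few lines. The criterion follows the text literally (an input activates a hidden unit when its weight to that unit exceeds the unit's bias weight), but the weights, biases, and function names below are invented for illustration and are not values from the trained network.

```python
def activated_hidden(weights, biases, word):
    """Hidden units h for which weight(word -> h) exceeds h's bias weight."""
    return {h for h, bias in enumerate(biases) if weights[word][h] > bias}


def shared_hidden_count(weights, biases, word_a, word_b):
    """Number of hidden units activated by both words."""
    return len(activated_hidden(weights, biases, word_a)
               & activated_hidden(weights, biases, word_b))


# Invented input-to-hidden weights for a network with 4 hidden units.
BIASES = [0.5, 0.2, 0.1, 0.4]
WEIGHTS = {
    "over":  [0.9, 0.6, 0.0, 0.7],
    "hill":  [0.8, 0.5, 0.3, 0.1],
    "book":  [0.1, 0.0, 0.4, 0.2],
    "above": [0.2, 0.3, 0.0, 0.1],
}

if __name__ == "__main__":
    for prep, obj in (("over", "hill"), ("over", "book"), ("above", "hill")):
        n = shared_hidden_count(WEIGHTS, BIASES, prep, obj)
        print(f"{prep} + {obj}: {n} shared hidden unit(s)")
```

In these made-up numbers a frequent pairing such as over and hill shares more hidden units than over and book, mirroring the kind of contrast the analysis is designed to expose.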
Figure 5. Illustration of the continuum of valence-match encoding for four of the prepositions and 17 of the prepositional direct objects. Graphs plot the number of hidden units which were activated by both a particular preposition and the listed prepositional object. Note that frequent direct objects of over and across, such as hill and river, tend to activate the same hidden units that are activated by these prepositions. The input node for above can be viewed as less context-dependent than the other graphed prepositions in that there are fewer hidden units which are simultaneously activated by above and a prepositional object. The relatively high number of hidden units simultaneously activated by the inputs around and problem is due to the presence in the training corpus of the quasi-idiom got around the (problem, contract).
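The weight analysis behind Figure 5 can be sketched in a few lines. This is only an illustration of the classification criterion stated in the text (an input activates a hidden unit when its weight exceeds that unit's bias weight), not the original simulation code: the weight values, the four-unit hidden layer, and the function names `activates` and `shared_hidden_units` are all hypothetical.

```python
# Hypothetical input-to-hidden weights: weights[word][j] is the weight from
# the input node for `word` to hidden unit j. All values are invented.
weights = {
    "over":  [1.2, 0.9, -0.3, 0.1],
    "above": [0.1, -0.2, 0.8, -0.5],
    "hill":  [1.0, 0.7, -0.1, 0.0],
    "book":  [-0.4, 0.2, 0.1, 0.9],
}

# Hypothetical bias weight of each hidden unit.
bias = [0.5, 0.5, 0.5, 0.5]


def activates(word):
    """Return the hidden units whose incoming weight from `word` is greater
    than that hidden unit's bias weight (the criterion given in the text)."""
    return {j for j, w in enumerate(weights[word]) if w > bias[j]}


def shared_hidden_units(preposition, obj):
    """Count the hidden units activated by both the preposition and the object."""
    return len(activates(preposition) & activates(obj))


for prep in ("over", "above"):
    for obj in ("hill", "book"):
        print(prep, obj, shared_hidden_units(prep, obj))
```

With these invented weights, over and hill share hidden units while over and book do not, which is the kind of contrast the figure reports for frequently co-occurring versus non-occurring combinations.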
Varied size of receptive field. The number of hidden units activated by an input unit can be called that input's "coarse-coding count" and viewed as the size of that item's "receptive field". The coarse-coding counts for the 81 input nodes varied from 0 to 12. The four "high polysemy" prepositions had coarse-coding counts of 5 to 6, while the "low polysemy" prepositions (above and under) activated only 2 hidden units each. The three inputs which activated no hidden units were cost, spent and had_authority.
Why would a word activate no hidden units? A word need not activate any hidden units if the output vector that occurs in all the word's contexts is totally predictable from the other items in the input vector. In natural language this is seldom, if ever, the case, but in the relatively artificial data set constructed for this simulation, the verbs cost, spent and had_authority added no information to the other words in the vector. All input vectors containing cost or spent also had either $100 or $1000 as the preposition's direct object. The verb had_authority always occurred in the context of person had_authority over person, a pattern which always activated the feature specifying the power domain.
The correlation between the number of distinct senses of a word in the training corpus and that item's coarse-coding count was only 0.29. The sheer diversity of environments, independent of the question of number of senses, appeared to be the crucial factor for an item to activate a large number of hidden units. For example, the verbs came and got were in the medium polysemy group, yet after training ended up with a high coarse-coding count. Although I did not construct the training set to incorporate distinct senses for came and got, the fact that they could occur with the four highly polysemous prepositions meant that they ended up associated with a large number of meaning vectors.
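As a rough illustration of the two quantities discussed above, the following sketch computes a coarse-coding count per word and a Pearson correlation between those counts and the number of senses. The sets of activated units and the sense counts are invented toy data, so the resulting coefficient is not expected to reproduce the 0.29 reported in the text; the helper names are hypothetical.

```python
import math

# Hypothetical data: the hidden units each input word activates, and the
# number of distinct senses that word had in the training corpus.
activated_units = {
    "over": {0, 1, 2, 3, 4},
    "above": {5, 6},
    "came": {0, 2, 3, 4, 7, 8},
    "cost": set(),
}
num_senses = {"over": 6, "above": 1, "came": 2, "cost": 1}


def coarse_coding_count(word):
    """Size of the word's 'receptive field': how many hidden units it activates."""
    return len(activated_units[word])


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


words = sorted(activated_units)
counts = [coarse_coding_count(w) for w in words]
senses = [num_senses[w] for w in words]
r = pearson(senses, counts)

print(dict(zip(words, counts)))
print(round(r, 2))
```

In this toy data the word came has few senses but the largest count, mirroring the observation that diversity of contexts, rather than number of senses, can drive the size of the receptive field.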
It would be helpful to control more rigorously the implementation of "number of distinct senses" and "diversity of contexts", and to think about whether it is important for our theories of the human lexicon to be sensitive to this difference.
Assessment of the implementation of prepositional polysemy
A drawback of the implementation just described is that words are identified with specific input nodes. A superior design would pair a phonological representation with semantic-feature vectors. This would allow word-sized phonological chunks to come to activate a distinct hidden-unit activation pattern to the extent that these word-sized chunks were predictive of meaning independent of their context.
The positive points of the current implementation are that it illustrates several aspects of the theoretical proposal described in the first section of this chapter.
- Implements the idealization of language as a set of associations between form and meaning, where "forms" are grammatical word combinations rather than words. Words in this model were not directly associated with specific meanings (except for the trajectors and landmarks, which were always paired with a few features in the output vector regardless of the other items in the input vector). Instead, the network was trained to associate an entire "sentence" of the form subject verb preposition object with a semantic feature vector which encoded the relationship holding between the subject and object.
- Exhibits a continuum of context-dependency which reflects co-occurrence statistics, as shown by the degree to which the hidden units activated by a particular word are the same as those activated by the words with which it typically co-occurs.
- Presents a metaphor for conceptualizing the differences among, and the connection between, linguistic and non-linguistic aspects of meaning. The input nodes play the role of information about the form (sound or inscription) of words. The output nodes can be analogized to nonlinguistic meaning, including synapses to neurons that activate long-term memories and motor outputs. The weight matrices interposed between these two can be conceptualized as the linguistic aspects of meaning: the categories, rules and mappings that mediate between completely arbitrary representations (the forms of words) and the conceptual structures they ultimately evoke.
Processing Factors
According to the coarse-coding proposal presented in this chapter, the problem of sense selection is minimized in natural language comprehension because processing units which encode a word's diverse senses also encode the word's typical linguistic contexts. Once we have removed the problem of why polysemy does not tax comprehension, we are left with reasons for why polysemy is commonsensical. Intuitively, words that cover a large semantic territory will be used often because they fit more communicative contexts: high frequency derives from applicability. An additional factor is the correlation between frequency and ease of lexical access.
Usage frequency of a word, as measured by word counts in written texts (or by speakers' rating of familiarity), is the most consistent predictor of reaction time in naming and lexical decision tasks (i.e., deciding if a letter string is a word; Forster 1981; Gernsbacher 1984).
One significant processing cost is the difficulty of finding the phonological representation of a word or words given the intent to communicate a specific concept or set of concepts. This link between meaning and sound in producing a sentence has been hypothesized to be the weak link in the chain of linguistic processing (Bates & Wulfeck 1989). If this is indeed a weak link, then words that are easily accessible will be maintained in the lexicon. The characteristics of words that facilitate access are likely to include high frequency, high valence match with adjoining words, and high routinization of word combinations.
Two types of evidence suggest that the link between meaning and sound is likely to be the weakest link in sentence processing. Cognitive psychologists have shown that the more arbitrary an association, the more difficult it is to learn and the more vulnerable it is to forgetting, and the more its access is facilitated by frequency of exposure (Anderson 1983). Aphasiologists have suggested that anomia (word finding difficulties) may be the common ingredient in the diversity of types of aphasia (Bates & Wulfeck 1989).
One method for protecting the weak link between meaning and access to the sound pattern is to increase the frequency of access. Words that are repeatedly used will be more likely to remain in the language. Polysemy is thus a feature that decreases speakers' processing costs by decreasing the effort necessary for access.
Above we noted that generality of meaning, or high number of distinct meanings ("wide receptive field"), leads to high usage frequency. But causality may flow in the other direction. Words that are highly frequent are those that will be easiest to access, and thus are likely to be extended into new semantic territory, either to fill a new semantic niche that has appeared due to technological or cultural innovation, or to supplant existing words that may be harder to access because of their lower frequency.
Pressures opposing polysemy. Not all words are highly polysemous. One force at work is likely to be speakers' goal of maximizing communicative impact. All animal species grow accustomed to the commonplace and dishabituate to novelty (Lehmann 1985). Historical linguists refer to the "semantic bleaching" that accompanies the extension of a word into new semantic territories (Sweetser 1988). To increase impact, speakers are motivated to recruit old words to new uses, or to coin new lexical items.
A second force for new coinages and for words with a restricted semantic range is that coarse-coding schemes have disadvantages when fine semantic distinctions are required. Many "overlapping receptive fields" need to be present to pinpoint a very precise meaning.
If each of these receptive fields is a word, then the string of words necessary to convey a specific meaning would place processing burdens on speaker and hearer. Speakers will be motivated to use less phonological material, a motivation which may lead to the coining of new words.
Closing Remarks, Related Approaches
The short form of my proposal is that both language form and meaning are stored in chunks larger than a word, and that therefore the meaning of words is usually tightly linked to their typical contexts of occurrence. If we think it probable that a similar "microstructure of cognition" underlies linguistic as well as nonlinguistic abilities, then the continuum of context-dependency one observes in looking at word meaning is just a linguistic manifestation of a continuum that is omnipresent in other cognitive domains.
I've proposed that the basic representational structure underlying linguistic knowledge is an associative pairing between actual sentences and their meanings. To the extent that humans possess templates, schemas, categories and rules, these are abstractions over stored utterance-meaning pairs. Among readers sympathetic to this proposal, I anticipate there will be those who find it so commonsensical as to render exposition unnecessary, while others may find much of it a restatement of ideas already thriving in the literature. I happily grant the latter point and will identify below at least a few of the theorists with whom these ideas originated. But it is worth emphasizing that the memory-based approach is strongly at odds with conventional wisdom about the nature of language.
A succinct statement of this conventional wisdom is Aitchison's (1991) explanation (to a general audience) that generalizations, not utterance-meaning pairs, are the fundamental representational structure of linguistic knowledge. "A language such as English does not have, say, 7,123,541 possible sentences which people gradually learn, one by one. Instead, the speakers of a language have a finite number of principles or 'rules' which enable them to understand and put together a potentially infinite number of sentences" (p. 14). After presenting examples of phonological and morphological rules, Aitchison concludes, "In brief, humans do not learn lists of utterances. Instead, they learn a number of principles or rules which they follow subconsciously."
I have proposed that utterances are precisely what humans learn. Generalizations are hard-won, and are extracted only to the extent that they are licensed by statistical regularities, and by existing abstractions over statistical regularities.
The importance of people's store of expressions has been expressed by Langacker (1987) as follows:
The grammar lists the full set of particular statements representing a speaker's grasp of linguistic convention, including those subsumed by general statements. Rather than thinking them an embarrassment, cognitive linguistics regard particular statements as the matrix from which general statements (rules) are extracted (p. 46).
The present proposal can be viewed as a logical extension of the approach to morphology championed by Joan Bybee (1985). Bybee notes that the traditional concern of the field of morphology, dividing words into parts and assigning meaning to the parts, fails because it is impossible to find boundaries between morphemes, and because morphemes change shape in different environments. If the field of morphology cannot aspire to finding a one-to-one relation between semantic units and their phonological expression, what should be morphologists' goal? Bybee takes on the task of explaining deviations from one-to-one correspondence in terms of general cognitive characteristics of human language users. One of these is humans' ability to use rote processing even for forms that could be morphologically decomposed, and the tendency for frequency of occurrence to be the key characteristic that supports rote processing.
I am not the first to import the coarse coding metaphor from visual processing to the domain of word meaning. Drawing on findings that patients with right hemisphere lesions have problems understanding nonliteral and pragmatic implications of linguistic expressions, cognitive neuroscientists Beeman, Friedman, Grafman, Perez, Diamond & Lindsay (1992) hypothesized that the right hemisphere, more than the left hemisphere, may coarsely code semantic information. By "greater coarse coding" in the right hemisphere the authors mean that large semantic fields are weakly activated, allowing concepts that are more distantly related to the input word to become activated. This "long distance" activation could be what mediates the RH's ability to obtain more than one interpretation of a word or phrase, or to obtain the pragmatic point implicit in the meaning of several words.
To test this hypothesis, Beeman et al. conducted a hemifield priming experiment. Subjects either read target words preceded by three weakly related primes, or they read target words preceded by one strong prime (flanked by two unrelated words). When naming targets presented to the right hemisphere, subjects benefited equally from both prime types, but benefited more from the one strong prime when naming targets presented to the left hemisphere.
Why the two hemispheres differ in their degree of sensitivity to semantic overlap of multiple words remains to be addressed. Finding evidence that it reflects the conflicting demands of precision and creativity would do much to legitimize the notion that the structure of language is an exquisite interplay of speakers' diverse communicative needs. Philosophers of science have long noted that the right theoretical metaphor can do much to invigorate even an ancient field.
It is too early to know if the coarse coding metaphor will propel the field of lexical semantics, but an increased understanding of the nature of semantic continuums will do much to change polysemy from problem to delight for students of language.
REFERENCES
Aitchison, J. 1991. Language change: Progress or Decay? 2nd Edition, Cambridge: Cambridge University Press.
Anderson, J. 1983. The architecture of cognition, Cambridge, MA: Harvard University Press.
Bates, E.A. & B. Wulfeck. 1989. Crosslinguistic studies of aphasia. In B. MacWhinney & E.A. Bates (eds.), The crosslinguistic study of sentence processing, New York: Cambridge University Press.
Bates, E.; I. Bretherton & L. Snyder. 1988. From first words to grammar: Individual differences and dissociable mechanisms, Cambridge: Cambridge University Press.
Beeman, M.; R.B. Friedman, J. Grafman, E. Perez, S. Diamond & M.B. Lindsay. 1992. Summation priming and coarse coding in the right hemisphere. Paper presented at the 33rd Annual Meeting of the Psychonomic Society, St. Louis, MO.
Bolinger, D. 1976. Meaning and memory. Forum Linguisticum, 1, 1-14.
Brugman, C. 1988. The story of 'over': Polysemy, semantics and the structure of the lexicon, New York: Garland Publishing.
Clark, E.V. 1983. Meanings and concepts. In J.H. Flavell & E.M. Markman (eds.), Cognitive Development, Volume III of the Handbook of Child Psychology, New York: John Wiley.
Deane, P.D. 1988. Polysemy and cognition. Lingua, 75, 325-361.
Deane, P.D. 1992. Multimodal semantic representation: on the semantic unity of over and other polysemous prepositions. Ninth Annual Meeting of the Eastern States Conference on Language, SUNY, Buffalo.
Fillmore, C.J. 1988. On grammatical constructions. Unpublished manuscript, Department of Linguistics, University of California, Berkeley.
Forster, K.I. 1981. Lexical access and lexical decision: Mechanisms of frequency sensitivity. Journal of Verbal Learning and Verbal Behavior, 22, 22-44.
Francis, W.N. & H. Kucera. 1982. Frequency analysis of English usage: lexicon and grammar, Boston: Houghton Mifflin.
Gernsbacher, M.A. 1984. Resolving 20 years of inconsistent interactions between lexical familiarity and orthography, concreteness and polysemy. Journal of Experimental Psychology: General, 113, 256-281.
Givon, T. 1989. Mind, code and context: Essays in pragmatics, Hillsdale, NJ: Erlbaum.
Grimshaw, J. 1990. Argument structure, Cambridge, MA: MIT Press.
Harris, C.L. 1990. Connectionism and cognitive linguistics. Connection Science, 2, 7-34.
Harris, C.L. 1991. Parallel Distributed Processing Models and Metaphors for Language and Development. Ph.D. Dissertation, University of California, San Diego.
Harris, C.L. 1994. Back-propagation representations for the rule-analogy continuum. In J. Barnden & K. Holyoak (eds.), Analogical Connections, Norwood, NJ: Ablex. Vol. II, pp. 282-326.
Harris, C.L. & D.S. Touretzky. 1991. Verbal polysemy as a knowledge representation problem. Paper presented to the Second International Cognitive Linguistics Conference, Santa Cruz, CA.
Hawkins, B. 1984. The semantics of English spatial prepositions. Ph.D. Dissertation, University of California, San Diego.
Hinton, G.E.; J.L. McClelland & D.E. Rumelhart. 1986. Distributed representations. In D.E. Rumelhart & J.L. McClelland (eds.), Parallel distributed processing: Explorations in the microstructure of cognition, vol. 1, Cambridge, MA: MIT Press.
Hinton, G.E. & T. Shallice. 1991. Lesioning an attractor network: Investigations of acquired dyslexia. Psychological Review, 98, 74-95.
Jackendoff, R.S. 1983. Semantics and cognition, Cambridge, MA: MIT Press.
Jackendoff, R.S. 1992. The boundaries of the lexicon, or, if it isn't lexical, what is it? Ninth Annual Meeting of the Eastern States Conference on Language, SUNY, Buffalo.
Juliano, C.; J.C. Trueswell & M.K. Tanenhaus. 1992. What can we learn from "That"? Paper presented at the 33rd Annual Meeting of the Psychonomic Society, St. Louis, Missouri.
Lakoff, G. 1987. Women, fire, and dangerous things: What categories reveal about the mind, Chicago: Chicago University Press.
Lakoff, G. & M. Johnson. 1980. Metaphors we live by, Chicago: Chicago University Press.
Langacker, R.W. 1987. Foundations of cognitive grammar, vol. I: Theoretical prerequisites, Stanford, CA: Stanford University Press.
Leech, G. & R. Leonard. 1974. A computer corpus of British English. Hamburger Phonetische Beitrage 13, 41-57.
Lehmann, C. 1985. Grammaticalization: synchronic variation and diachronic change. Lingua e Stile 20.
MacDonald, M.C. 1992. Multiple constraints on lexical category ambiguity resolution. Paper presented at the 33rd Annual Meeting of the Psychonomic Society, St. Louis, Missouri.
MacWhinney, B. 1989. Competition and lexical categorization. In R. Corrigan, F. Eckman & M. Noonan (eds.), Linguistic Categorization, Amsterdam: Benjamins.
McClelland, J.L.; D.E. Rumelhart & G.E. Hinton. 1986. The appeal of parallel distributed processing. In D.E. Rumelhart & J.L. McClelland (eds.), Parallel distributed processing: Explorations in the microstructure of cognition, vol. 1, Cambridge, MA: MIT Press.
McClelland, J.L. & A.H. Kawamoto. 1986. Mechanisms of sentence processing: Assigning roles to constituents. In J.L. McClelland & D.E. Rumelhart (eds.), Parallel distributed processing: Explorations in the microstructure of cognition, vol. 2, Cambridge, MA: MIT Press.
Meillet, A. 1948. L'evolution des formes grammaticales. In Linguistique historique et linguistique générale, Paris: Champion.
Miller, G. & P.N. Johnson-Laird. 1976. Language and perception, Cambridge, MA: Harvard University Press.
Miller, G. & C. Fellbaum. 1991. Semantic networks of English. Cognition, 41, 197-229.
Millis, M.L. & S.B. Button. 1989. The effect of polysemy on lexical decision time: Now you see it, now you don't. Memory and Cognition, 17, 141-147.
Pinker, S. 1989. Learnability and cognition, Cambridge, MA: MIT Press.
Pustejovsky, J. 1992. In B. Levin & S. Pinker (eds.), Lexical and conceptual semantics, Cambridge, MA: Blackwell.
Reddy, M.J. 1979. The conduit metaphor: A case study of frame-conflict in our language about language. In A. Ortony (ed.), Metaphor and thought, Cambridge: Cambridge University Press.
Ruhl, C. 1989. On monosemy, Albany, NY: SUNY.
Rumelhart, D.E.; G.E. Hinton & R.J. Williams. 1986. Learning internal representations by error propagation. In D.E. Rumelhart & J.L. McClelland (eds.), Parallel distributed processing: Explorations in the microstructure of cognition, vol. 1, Cambridge, MA: MIT Press.
Touretzky, D.S. & G.E. Hinton. 1988. A distributed connectionist production system. Cognitive Science, 12, 423-466.
St. John, M.F. & J.L. McClelland. 1988. Applying contextual constraints in sentence comprehension. Proceedings of the Tenth Annual Conference of the Cognitive Science Society, Hillsdale, NJ: Erlbaum.
Sweetser, E. 1990. From etymology to pragmatics, Cambridge: Cambridge University Press.
Van Petten, C. & M. Kutas. 1991. Electrophysiological evidence for the flexibility of lexical processing. In G. Simpson (ed.), Understanding word and sentence, Amsterdam: North-Holland, pp. 129-184.
CONTINUITY, POLYSEMY, AND REPRESENTATION: UNDERSTANDING THE VERB CUT
DAVID S. TOURETZKY
Carnegie Mellon University (School of Computer Science), USA
Introduction
In this chapter I would like to outline two senses of continuity, at entirely different levels of analysis, that one may encounter in the context of language understanding problems. Specifically I will look at the English verb cut, which is polysemous, imagery-laden, and open to numerous metaphorical extensions. The first hypothesis of this chapter is that while there may well be a small set of distinct senses of cut, when encountering the word in context we activate a blend of these senses, with some more primary than others. Even subliminally active senses may contribute to our understanding of an utterance, for example by priming future inferences.
A second hypothesis has to do with representation in connectionist networks, which are often touted as having a continuous flavor. The essential properties of localist and distributed representations are reviewed, and shown to be fundamentally at odds. The holy grail of connectionist knowledge representation, in my view, is to reconcile these two approaches. Until that occurs, connectionist representations will not be very brain-like, nor are they likely to contribute much to our understanding of polysemy.
1. The Verb cut
The principal senses of cut include sever, section, slice, excise, incise, dilute, diminish, terminate, traverse, and move quickly. Senses are frequently associated with syntactic particles, and their meanings often invoke image schemas (Langacker, 1987), as shown below.¹
- sever a one-dimensional object: cut; or sever the distal end of a one-dimensional object: cut off.
- excise a two- or three-dimensional part from a whole: cut out.
- incise into a two- or three-dimensional whole: cut into.
- slice an object by dividing it into parallel sections perpendicular to its major axis or axis of symmetry, e.g., cut up a salami.
- section an object by dividing it radially through its axis of symmetry, e.g., cut a cake.
- dismember an object by dividing it into irregularly-shaped parts.
- diminish some quantity, e.g., cut the amount of sugar in a recipe.
- dilute a substance by mixing it with other substances, e.g., cut whiskey with water.
- terminate some action or process, e.g., cut off the flow of water from a pipe.
- traverse an area, e.g., the runner cut across the field.
- move quickly, e.g., the quarterback cut left to avoid a tackle.
1) This analysis was largely done by Catherine Harris, who has collaborated with me in investigating the meanings of cut.
The particle structure of cut is far richer than the above listing indicates. For example, cut up can mean slice, section, or dismember, depending on the direct object. The particle up can also invoke an image schema implying repetitive action over an area (Lakoff 1987; Brugman 1988), so that cut up can mean "multiply-incised", as in The accident left him bruised and cut up. Similarly, cut down can either mean "sever" (as in cutting down a tree) or "diminish" (cutting down on salt in one's diet).
Cut also has many metaphoric uses, often involving image schemas. For example, That car cut me off evokes the "sever" sense of cut, with paths viewed metaphorically as one-dimensional objects. To be cut from the team means "excised", with the team (a social structure) viewed as a physical object one of whose components was removed. When someone's expertise cuts across several disciplines, we have a more complicated metaphor in which intellectual domains are viewed as physical regions, and spanning a domain is metaphorically described as motion through it; thus, the expertise "traverses" a broad region of intellectual space.
In many instances of the use of cut, a complete understanding of the meaning relies on multiple senses interacting synergistically. For example, in (1) below, the primary sense might be "excise", while in (2) it is "sever".
(1) John cut the apple from the tree
(2) John cut the boat from the dock
If asked to visualize the cutting in (1) and describe the literal object of the cut action, people generally report that what is being cut is not the apple, but rather the stem that binds the apple to the tree. This focuses on a secondary sense of cut, "sever", which applies to objects like stems that have one-dimensional extent.
In (2), "sever" is the most salient sense. People know that boats are typically connected to docks by mooring ropes, and ropes are one-dimensional (hence severable) objects. The use of a particle also provides an important cue: cut x from y is associated only with certain senses of cut. It is never used to express "incise" or "terminate", for example. Thus we see that the meaning of cut in context is determined by a combination of syntactic cues (particles), world knowledge (apples are connected to trees by stems, boats are connected to docks by ropes), and image schemas (things with one-dimensional extent are severable).
In the classical model of polysemy, the language understander picks one sense as the "correct" meaning of the word and discards the others. However, it's clear that many senses of cut contribute to the meaning of (3):
(3) John cut a piece from the cake
Here is a list of the relevant senses, roughly ordered by saliency:
- Sense 1: "excise" a part from the whole.
- Sense 2: "sever" the piece from the material it's attached to.
- Sense 3: "section" the cake.
- Sense 4: "incise". A knife (the default instrument for cutting cakes) enters the cake as part of the cut action.
- Sense 5: "traverse". The knife traverses the surface of the cake as part of the cut action.
- Sense 6: "diminish". The cake is diminished by having a piece removed.
The meaning of cut in this context can be regarded as a blend of the above senses, with different degrees of participation based on their respective salience. This blending operation is not some blind mechanical superposition; we don't confuse roles between the senses and think that the piece is being traversed or the knife is being diminished, nor do we confuse the construal of cake as "substance" to be severed with its construal as "radially symmetric object" to be sectioned.
Rather, the effect of the blending is to generate a collection of inferences, or a potential for inferences, based on all of the above ways of understanding the action. So we infer that there is motion of a knife because of the incise and traverse aspects; we infer that there is now less cake because of the excise and diminish aspects; we view the action as irreversible because of the nature of the severing; and so on. By asking questions about the sentence, or using it to prime the understanding of a following sentence, we can demonstrate that these inferences do in fact take place.
The notion of semantic continuity I am suggesting is that a broad range of shades of meaning may be derived from a polysemous word in context, by mixing multiple senses weighted by salience. This weighting is a function of both world knowledge, e.g., knowledge about the properties of objects like apples, boats, and cakes, and contextual priming. The latter causes particular senses to figure more prominently in the conglomeration of meanings without necessarily eliminating any of the lesser senses. Compare (4), which emphasises "incise" and "traverse", with (5), which emphasises "excise" and "diminish":
(4) With a bent, rusty knife, John cut a piece from the cake
(5) Prompted by jealousy, John cut a piece from the cake
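One way to make the idea of a salience-weighted blend concrete is the sketch below. It is purely illustrative and rests on several assumptions not found in the chapter: the numeric salience values, the multiplicative `prime` function, and the mapping from senses to inferences are all hypothetical, chosen only to respect the ordering of senses and the inferences mentioned in the text.

```python
# Hypothetical baseline saliences for "John cut a piece from the cake",
# respecting the ordering given in the text (excise most salient).
baseline = {
    "excise": 6.0, "sever": 5.0, "section": 4.0,
    "incise": 3.0, "traverse": 2.0, "diminish": 1.0,
}

# Inferences the text attributes to particular senses.
INFERENCES = {
    "incise": "a knife moves",
    "traverse": "a knife moves",
    "excise": "there is less cake",
    "diminish": "there is less cake",
    "sever": "the action is irreversible",
}


def normalize(weights):
    """Scale the weights so that they sum to 1."""
    total = sum(weights.values())
    return {sense: w / total for sense, w in weights.items()}


def prime(weights, boosts):
    """Boost the primed senses multiplicatively; no sense is eliminated."""
    return normalize({s: w * boosts.get(s, 1.0) for s, w in weights.items()})


def inference_strengths(weights):
    """Sum the blend weights of all senses that support each inference."""
    strengths = {}
    for sense, w in weights.items():
        inference = INFERENCES.get(sense)
        if inference is not None:
            strengths[inference] = strengths.get(inference, 0.0) + w
    return strengths


# Example (4): the knife description primes "incise" and "traverse".
knife_context = prime(baseline, {"incise": 3.0, "traverse": 3.0})
# Example (5): the motive primes "excise" and "diminish".
jealousy_context = prime(baseline, {"excise": 3.0, "diminish": 3.0})

print(inference_strengths(knife_context))
print(inference_strengths(jealousy_context))
```

Under these assumptions every sense keeps a nonzero weight in both contexts, while the inference about knife motion is stronger in (4) and the inference about there being less cake is stronger in (5).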
This view of word meaning does not require continuity in the technical mathematical sense, in which there are an infinite number of points between any two points in the semantic space. A discrete semantic space would certainly suffice, provided only that the grain was fine enough to accommodate a sufficiently large number of semantic distinctions.
Finally, as Norvig (1988) has observed, texts can admit multiple simultaneous interpretations with distinct meanings. An example is:
(6) John cut the engine
For some readers, this sentence invokes both the "terminate" and "sever" senses of cut. The former is based on John's performing some unspecified act to terminate the engine's running. (Here, the engine is used metonymically to refer to its operation.) Processes like the running of an engine have time lines that can be viewed metaphorically as one-dimensional objects, and hence cut applied to a process implies interruption or termination via "sever". But a second interpretation of (6) is that John literally severed something such as a fuel line, which either prevented the engine from running or stopped it if it was already running. These two readings of (6), one metaphorical and one literal, are not the sort of blending of senses I was discussing earlier; they are distinct interpretations. But they are not incompatible interpretations. Some readers produce both simultaneously.
Thus we see that multiple senses may contribute to an interpretation, and on another scale, multiple interpretations may exist simultaneously. The shape of "meaning space", whether continuous or not, is certainly complex.
2. Inference in Localist Networks
In earlier work, Joseph Sawyer and I built a system for understanding simple usages of cut. The system combined syntactic cues, semantic features, and world knowledge to select the dominant sense of an instance. The syntactic cues came from particles and prepositions ; the system accepted verb phrases of a small number of types, such as cut the $x$from the $y$, cut off the $x$9 cut into the $x$9 etc. The nouns that could fill the $x$ and $y$ slots were tagged with a va riety of semantic features, such as "physob" for things that were physical objects, or "1-dim" for things that could be viewed as essentially one-dimensional. Some
nouns offer a choice of construals, e.g., team can mean either an abstract set of players or a physical collection of players. To be cut from the team means to have set membership revoked, but The team boarded the bus is a statement about physical objects. Multiple construals were represented by multiple nodes linked to the same word node. The various senses of cut were also represented by nodes linked into this syntactic/semantic network. No one piece of evidence was sufficient to determine the meaning of an instance, but by accumulating bits of syntactic and semantic support, a sense node could increase its activity level. The most active node would eventually win the competition. Thus, rather than using a hard deductive procedure to grind out an interpretation of a cut instance, the system took a softer approach in which multiple competing interpretations would be partially active, but in the end only the strongest remained.¹
There is a sort of continuity of representation here, in that the activation levels of nodes are continuous values, giving the network an infinite number of potential states. This is a property shared by all localist, spreading activation networks. However, this particular type of continuity is not very interesting, since there is no blending going on. The system is still built from a finite number of discrete nodes, and its goal is to select the "best" sense node based on the available evidence. Actually, one of the strengths of this sort of representation is that it is able to entertain competing hypotheses without blending them together into an incoherent mush.
But pure spreading activation is not sufficient to solve even the simple version of the cut understanding problem, because it doesn't address the issue of how world knowledge is brought in. For example, the preferred sense of (1) is "excise", but if we want to focus on the physical action we will have to switch to "sever".
Apples have the semantic feature "3-dim" while "sever" requires "1-dim", so a literal match is not very satisfactory. Section or slice would be compatible with a three-dimensional direct object, but they don't fit the cut $x$ from $y$ syntactic form of the input, whereas one sub-sense of "sever" does. Since no sense offers an acceptable syntactic/semantic match at this point, the system must try a less literal reading. A parallel search of the knowledge base produces the fact that apples are connected to trees by stems, and stems have the semantic feature "1-dim".
1) Although this approach is frequently associated with connectionist-style computation, in contrast with the "classical AI" deductive approach, in truth the former is also part of classical AI, which includes such things as parallel relaxation, heuristic evaluation functions, and production rules with numerical certainty factors.
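The evidence-accumulation scheme described in this section can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the system built with Joseph Sawyer: the sense names and the "physob" and "1-dim" features come from the text, while the cue weights, the update rule, the constants, and the function name `settle` are hypothetical choices made for the example.

```python
# Hypothetical sketch of soft competition among localist sense nodes.
# Cue weights and constants are illustrative assumptions, not values
# taken from the original system.

SENSES = ["sever", "excise", "terminate"]

# Support that each cue lends to each sense (assumed values).
CUE_SUPPORT = {
    "syntax:cut-x-from-y": {"sever": 0.3, "excise": 0.5},
    "feature:physob": {"sever": 0.3, "excise": 0.3},
    "feature:1-dim": {"sever": 0.6},
    "feature:process": {"terminate": 0.7},
}


def settle(cues, steps=50, rate=0.2, inhibition=1.2):
    """Accumulate evidence, then let sense nodes inhibit one another."""
    # Total external input per sense from the active cues.
    evidence = {s: 0.0 for s in SENSES}
    for cue in cues:
        for sense, weight in CUE_SUPPORT.get(cue, {}).items():
            evidence[sense] += weight

    activation = {s: 0.0 for s in SENSES}
    for _ in range(steps):
        updated = {}
        for s in SENSES:
            rivals = sum(activation[r] for r in SENSES if r != s)
            net = evidence[s] - inhibition * rivals
            # Move towards the net input, clipped to the range [0, 1].
            value = activation[s] + rate * (net - activation[s])
            updated[s] = min(1.0, max(0.0, value))
        activation = updated
    return activation


literal = settle(["syntax:cut-x-from-y", "feature:physob"])
with_stem = settle(["syntax:cut-x-from-y", "feature:physob", "feature:1-dim"])
print(max(literal, key=literal.get), max(with_stem, key=with_stem.get))
```

In this toy setting both candidate senses are partially active early on, and mutual inhibition eventually leaves a single winner. Adding the "1-dim" cue, which in the text has to be recovered from world knowledge about stems, is what tips the competition towards "sever"; the sketch says nothing about how that cue would be found, which is precisely the inference problem discussed next.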
So, since one of the sub-senses of cut is to sever a binding between an object and an anchor, and there is the right kind of binding relationship between apples and trees in the knowledge base (i.e., apple fills the "bound" role, and the object of from fills the "anchor" role), this provides much stronger support for the "sever" interpretation, and furthermore allows us to infer that the deep object of cut in this case is "stem", which wasn't even mentioned in the input.
This type of reasoning, in which unconnected bits of knowledge are brought together to solve a problem and give rise to new bits of knowledge as a result, I have called dynamic inference (Touretzky 1991). It is notoriously difficult for connectionist systems, and well beyond the power of spreading activation models. In our toy cut system, we tackled the dynamic inference problem by allowing nodes to propagate messages to each other rather than scalar activation values; we used elementary artificial intelligence search techniques to control the generation of these messages. The resulting system did not look at all connectionist. It was successful in a limited way, but suffered from the combinatorial explosion and brittleness problems that have long plagued classical artificial intelligence.
3. Continuity and Distributed Networks
An entirely different approach to representation is that of distributed connectionist networks, where concepts are represented not by a single node, but by patterns of activity distributed over a collection of nodes, as in McClelland and Kawamoto's verb sense disambiguation model (McClelland & Kawamoto 1986). The nodes represent semantic features¹ that collectively encode meaning. For example, some of the features that could make up cut senses are:
1-d object, 2-d object, 3-d object
penetration, separation, termination
physical motion, metaphorical motion, no motion
continuous motion, repeated motion, random motion
radial symmetry, parallel symmetry, no symmetry
single incision, multiple incisions, no incision
proximal/distal, part/whole, single mass
1) It is fashionable to refer to these as "microfeatures", implying that they are autonomously-constructed features encoding subtle statistical regularities of the domain, rather than gross semantic features that would be easily interpretable by human observers. However, in practice most models do in fact use gross semantic features, constructed by the experimenter. Calling these "microfeatures" is just wishful thinking.
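As a rough illustration of how such a feature inventory can encode senses, the sketch below represents three senses of cut as binary vectors over a handful of the features listed above and compares them with a dot product. The feature assignments for "sever", "terminate" and "slice" are assumptions made for the example, not an analysis taken from the text, and the helper names `encode` and `dot` are hypothetical.

```python
# Illustrative feature vectors for senses of "cut".
# The feature assignments are assumptions made for this sketch only.

FEATURES = [
    "1-d object", "3-d object",
    "penetration", "separation", "termination",
    "physical motion", "metaphorical motion",
    "single incision", "no incision",
]


def encode(active):
    """Return a vector with 1.0 for each active feature, 0.0 otherwise."""
    unknown = set(active) - set(FEATURES)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return [1.0 if f in active else 0.0 for f in FEATURES]


def dot(u, v):
    """Dot product, used here as a crude similarity measure."""
    return sum(a * b for a, b in zip(u, v))


sever = encode({"1-d object", "separation", "physical motion", "single incision"})
slice_ = encode({"3-d object", "separation", "physical motion", "single incision"})
terminate = encode({"1-d object", "termination", "metaphorical motion", "no incision"})

# Averaging superimposes two patterns on the same set of feature units.
blend = [(a + b) / 2 for a, b in zip(sever, terminate)]

print(dot(sever, slice_), dot(sever, terminate))
print(dict(zip(FEATURES, blend)))
```

Under these assumed assignments "sever" overlaps more with "slice" than with "terminate". The blended vector has "single incision" and "no incision" both half active, which is the kind of pattern a single bank of feature units produces when two senses are held at once.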
A particular sense of cut, such as "sever a physical binding", would be encoded as a set of these features. In other words, some of the feature nodes in a connectionist network would be active and some inactive. The obvious advantage of this encoding is that we are not restricted to a small number of pre-defined verb senses. Instead, assuming we start with a rich feature inventory (a few dozen or perhaps a few hundred features), we can encode many thousands of subtly different cut senses, and have a built-in similarity metric (the dot product) for semantically related senses. And if we allow nodes to have real-valued rather than binary activation levels, we have a continuous $n$-dimensional semantic feature space.
Although this encoding has great representational power, it also has a fundamental weakness. It is impossible to represent multiple competing hypotheses in a distributed semantic representation¹, since there is only one set of semantic features. Thus the system cannot represent the competition "incise vs. terminate", but only some novel and perhaps nonsensical pattern derived from a mixture of the two competitors. A common solution to this problem is to use an associative memory architecture, such as a Hopfield network, to "clean up" a pattern by settling into one of a set of learned stable states. The problem here, though, is that now we're back to a small set of canonical meanings that have been set up as stable states; we have lost much of the richness of a combinatorial representation. This cleanup difficulty can in theory be addressed by introducing hidden units to enforce semantic constraints among features, but this still does not allow competing meanings to (a) coherently coexist in the network, and (b) collect evidence and trigger supporting inferences to determine an eventual winner of the competition.
4. The Holy Grail
What people appear to do, and connectionist networks at present cannot do, is represent multiple competing hypotheses each of which has a fine-grain semantic structure and can potentially give rise to significant inferences. Localist spreading-activation representations support incremental evidence accumulation and soft competition (one form of continuity), but can't handle structured inferences,
1) It is however possible to represent competing hypotheses in a sparse, non-semantic distributed representation, as in (Touretzky & Hinton, 1988).
e.g., keeping track of which objects fill which slots in which schemas.¹ Distributed encodings promise subtle and fluid representations (another type of continuity), but don't easily support representation of multiple entities simultaneously, and also have problems with structure. Artificial intelligence-style semantic nets are perfectly suited to representing structured information, but fall short in the other areas.
I can't predict when we will find a representation that supports all the properties we desire in an inference architecture. This is the "holy grail" of connectionist knowledge representation, and a topic of much ongoing research. At present, representing concepts like cut, including the subtle contributions that multiple senses make in understanding instances such as (3), is simply beyond the state of the art of connectionist systems. Classical artificial intelligence approaches based on semantic nets at least provide a notation for formalizing our knowledge, and a means, albeit a slow and brittle one, for producing inferences based on it. Hybrid systems, such as Hofstadter and Mitchell's Copycat (Mitchell 1990), combine spreading activation, stochastic search, and demons that dynamically create and modify network structure. These may prove a useful stepping stone toward more brain-like reasoning architectures.
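To make the "clean up" step mentioned in Section 3 concrete, here is a minimal Hopfield-style associative memory in pure Python. It is a sketch under stated assumptions: the two stored patterns are arbitrary +1/-1 codes standing in for two canonical senses, and the function names `train` and `clean_up` are hypothetical. It uses the textbook Hebbian outer-product rule with one-unit-at-a-time updates, not any particular model cited in this paper.

```python
# Minimal Hopfield-style clean-up memory (pure Python).
# The stored patterns are arbitrary +1/-1 codes standing in for two
# canonical senses; they are assumptions made for this sketch.


def train(patterns):
    """Hebbian weights: sum of outer products, with a zero diagonal."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights


def clean_up(weights, state, max_sweeps=10):
    """Update units one at a time until a full sweep changes nothing."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i, row in enumerate(weights):
            field = sum(w * s for w, s in zip(row, state))
            # A zero field leaves the unit as it is.
            new = state[i] if field == 0 else (1 if field > 0 else -1)
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state


SENSE_A = [1, 1, 1, 1, -1, -1, -1, -1]
SENSE_B = [1, -1, 1, -1, 1, -1, 1, -1]
WEIGHTS = train([SENSE_A, SENSE_B])

# A novel pattern: SENSE_A with its last two features flipped.
novel = [1, 1, 1, 1, -1, -1, 1, 1]
settled = clean_up(WEIGHTS, novel)
print(settled == SENSE_A)
```

The novel pattern is pulled back onto one of the stored patterns, so the distinction it carried is erased. That is the trade-off noted above: the memory removes incoherent mixtures, but only by collapsing the combinatorial space onto a small set of canonical states.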
Acknowledgements: The discussion of "cut" in this paper owes much to conversations I've had with Catherine Harris over the past five years. I thank Joseph Sawyer for his work on the computer simulation.
1) There have been some attempts to deal with schema-like representations in spreading activation or marker propagation networks, but these either get bogged down in combinatorial problems, require excessive machinery to maintain parallelism, or else cannot deal with competition among schemas.
REFERENCES
Brugman, C. 1988. The Story of 'Over': Polysemy, Semantics, and the Structure of the Lexicon. New York: Garland Press.
Lakoff, G. 1987. Women, Fire, and Dangerous Things: What Categories Reveal About the Mind. University of Chicago Press.
Langacker, R.W. 1987. Foundations of Cognitive Grammar, vol. I: Theoretical Prerequisites. Stanford, CA: Stanford University Press.
McClelland, J.L. and A.H. Kawamoto. 1986. Mechanisms of sentence processing: assigning case roles to constituents of sentences. In J.L. McClelland and D.E. Rumelhart (eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 2. Cambridge, MA: MIT Press.
Mitchell, M. 1990. Copycat: A Computer Model of High-Level Perception and Conceptual Slippage in Analogy Making. Doctoral dissertation, University of Michigan.
Norvig, P. 1988. Multiple simultaneous interpretations of ambiguous sentences. Proceedings of the Tenth Annual Conference of the Cognitive Science Society. Hillsdale, NJ: Erlbaum, pp. 291-297.
Touretzky, D.S. and G.E. Hinton. 1988. A distributed connectionist production system. Cognitive Science, vol. 12, n° 3, pp. 423-466.
Touretzky, D.S. 1991. Connectionism and compositional semantics. In J.A. Barnden and J.B. Pollack (eds.), Advances in Connectionist and Neurally Oriented Computation, vol. 1: High-Level Connectionist Models. Norwood, NJ: Ablex, pp. 17-31.
THE USE OF CONTINUITY IN MODELLING SEMANTIC PHENOMENA
BERNARD VICTORRI
CNRS (URA 1234, University of Caen), France
Introduction
Why should we use continuous models in semantics? At first glance, this question seems simple: we have to use continuous models if and only if semantic phenomena are continuous. But this last statement is wrong for at least two reasons. First, continuity and discreteness are not properties of phenomena; they are characterizations of theories about phenomena. Second, as stressed by D. Kayser in this volume, one can use discrete models to represent continuous concepts, and the other way round. So, our first question must be split into two questions: (1) What kinds of linguistic theories concerning semantic phenomena need concepts related to continuity? (2) What kinds of mathematical and computer tools can deal with these concepts?
1. Linguistic issues
1.1. Categorisation
At every level of linguistic description, we are confronted with the problem of classification. The scenario is always the same: in order to reach some degree of generality in the description of any phenomenon, one must define classes of linguistic expressions or relations and state rules in terms of these classes. But how does one decide whether a given expression or relation belongs to a given class? In most cases, a single criterion proves to be insufficient, since linguistic data show great variability. So linguists tend to use sets of criteria to define classes. As a consequence, some graduality is obtained: expressions satisfying the whole set of criteria can be said to be typical elements of the corresponding class, whereas other expressions can be viewed as more peripheral, further from the center of the class, as they satisfy a smaller number of criteria.
As we said, this situation prevails in every domain of linguistics. Even in syntax, we can find some sort of graduality and typicality in the definition of syntactical categories, functional relations, classes of transformations, and so on
(cf. P. Le Goffic, in this volume). But the domain where these notions most obviously apply is semantics. Examples are numerous: semantic lexical features (like 'animate' versus 'inanimate'), types of process ('activity', 'accomplishment', 'achievement', ...), aspect and modal classifications. In each case, it is relatively easy to exhibit typical examples showing all the features which can characterize a class. But it is also easy to find examples on the border of two classes, where we need to refine the classification, to distinguish between different characterizations, and to introduce new factors that can take these differences into account.
As the number of factors grows, and their interrelations become more complex, an alternative to the classical combinatorial representation emerges: the construction of a space representation, whose dimensions can combine the effects of different factors, each one with a specific strength. The advantage is to preserve the relation of proximity between elements (from the same class as well as from different classes) within the frame of a tractable low-dimensional space, by means of a distance on this space. Every element can be assigned a place in this representation, and the notions of center and borders of classes are fully accounted for by their geometrical counterparts. Obviously, some details will be lost from the original complexity, but with a judicious choice of the dimensions, such a compromise can be the best solution to represent the main tendencies in an efficient way, without completely eliminating any relevant factor.
It is important to note that the choice of a space representation by no means implies that the phenomena are considered as continuous. Only graduality and typicality are assumed: in other words, here continuity is nothing but an efficient tool to deal with a multiplicity of interrelating discrete factors.
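The space representation described above can be sketched as follows. This is only an illustration under stated assumptions: the factor names, their strengths, the two classes "X" and "Y", and the helpers `place` and `nearest_class` are all hypothetical, chosen to show how a few discrete factors can be folded into a low-dimensional space in which centers and borders of classes become geometrical notions.

```python
import math

# Each dimension combines several discrete factors, each with its own
# strength. Factor names and strengths are hypothetical.
DIMENSIONS = [
    {"factor_a": 1.0, "factor_b": 0.5},
    {"factor_c": 1.0, "factor_d": 0.5, "factor_b": 0.25},
]


def place(factors):
    """Map the set of factors an expression exhibits to a point in the space."""
    return tuple(
        sum(weight for name, weight in dim.items() if name in factors)
        for dim in DIMENSIONS
    )


# Typical elements exhibit all the factors characterizing their class.
CENTERS = {
    "X": place({"factor_a", "factor_b"}),
    "Y": place({"factor_c", "factor_d"}),
}


def nearest_class(point):
    """Assign a point to the class whose center is closest."""
    return min(CENTERS, key=lambda name: math.dist(point, CENTERS[name]))


# A borderline expression exhibiting one factor from each class.
borderline = place({"factor_a", "factor_c"})
print(borderline, nearest_class(borderline))
```

The borderline expression lies between the two centers rather than at either of them, so its peripheral status can be read directly from the distances. Nothing in the sketch requires the underlying factors to be anything other than discrete.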
1.2. Compositionality
One of the main issues for any semantic theory consists in explaining how the meaning of a whole (phrase, sentence, ...) can be computed from the meaning of its components. The classical approach to this problem is compositionality. The starting point is the syntactical structure: at each node of the syntactic tree, meaning is computed bottom-up by applying rules that give the meaning of the current node as a function of the meaning of its directly dependent nodes.
Two main difficulties arise in this approach. First, local rules cannot be sufficient: very often, an element far away in the syntactical tree exerts a definite influence over the meaning of a given part of the sentence. The second point concerns the pervasive phenomenon of polysemy. It is well known that many words, and especially the most frequent ones, are highly polysemous. Their precise meanings depend upon the rest of the sentence and have to be computed during the process, so they cannot be taken as a basis for a bottom-up computation.
These difficulties are tackled by the classical approach. For instance, the so-called mechanism of 'recategorization' enables a node value to be changed in accordance with some operation on an upper node in the tree. But here again, the combinatorial complexity of these mechanisms grows very quickly, and one can be sceptical about their capacity to handle the whole set of semantic interrelations in non-trivial sentences (for a discussion about the limits of recategorization, see Fuchs et al. 1991, pp. 157-162).
The alternative is to consider a sentence as a 'Gestalt' where relations between whole and parts are fully bi-directional. In this view, each component of the sentence interacts with each of the others, in no precise predefined order. What is important is the relative strength of each interaction, which acts as a constraint upon the potential of meanings carried by each polysemous element. Construction of meaning can be seen as the result of a dynamical global process in which a stable solution is obtained when the maximum number of constraints is satisfied (Victorri 1992).
Related to this issue is the question of 'degrees of analysability' pointed out by R. Langacker in this volume. The compositional approach imposes a dichotomous choice: a phrase must be fully analysable (computable from its components) or fully idiomatic (considered as a new genuine element). In the gestaltist approach, these configurations are only two extreme cases of a more general situation where meaning is partly due to each component and partly an irreducible quality acquired by their interaction.
The need for continuity in gestaltist approaches is obvious. Claiming that construction of meaning is a dynamical process implies defining a space where this process can take place, where stable states can be defined, and so on. We must be aware that precise quantification is not necessary: the point of interest is the qualitative behavior of the process.
A continuous space is the natural frame in which qualitative properties of dynamical systems can be handled, and nothing more.
1.3. Representation of meaning
Another major reason for invoking continuity comes from the representation of meaning. The main trend is to use the apparatus of logic, in one way or another, for this purpose. But a significant number of linguists challenge the prevailing views by advocating the use of topological representation. This is a very attractive idea: many lexical and almost all grammatical units can be associated with small graphic configurations that outline the kernel of their semantic value, whereas logical representations tend to split the different precise meanings into as many different representations. Moreover, topological concepts and their perceptual counterparts seem to be more efficient than logical tools to explain the functional properties of these units in a cognitive perspective. The works of
A. Culioli (1991), R. Langacker (1987), G. Lakoff (1987) or L. Talmy (1988) are representative of the diversity and creativity one can find in this area.
In these theories, continuity and other related topological, geometrical and dynamical notions are used as a conceptual framework. The abstract properties of, say, open intervals, boundaries, attractors, ... are the basic elements of the representation. Here quantification is completely irrelevant. As a matter of fact, one can say that the deep reason to use these concepts is their capacity to group in a single class situations which differ in their quantitative and domain-reference aspects.
2. Mathematical definitions
To discuss the role continuity can play in semantic models, we first have to define this term. Most often, continuity is seen as a property of variables for which there is always an intermediate value between any two given values. Another frequent formulation uses a notion of distance: around any point in a continuous space one can always find other points as close as one wants. But from a mathematical point of view, at least three different notions related to continuity can be defined, and none of them corresponds precisely to these intuitive definitions.
2.1. Continuity versus discontinuity
The opposition continuity/discontinuity applies to functions. It was first defined for numeric functions. Technically speaking, a real function $f$ of one real variable is continuous at a given point $x$ if, for any open interval $J$ containing $f(x)$, one can find an open interval $I$ containing $x$ such that $f(I)$ is included in $J$. In other words, there is no sudden "jump" in the value of the function when the variable passes through $x$. In terms of distance, one can say that a continuous function preserves 'closeness': as the variable gets closer to $x$, the value of the function gets closer to $f(x)$. This definition can be easily extended to real functions of several variables, and more generally to any function from one multi-dimensional real space to another. When using such a function in a model, points where the function is discontinuous are most often the most interesting ones, because they correspond to situations where the phenomenon under study changes its behaviour in an observable way.
2.2. Discreteness
The definition of continuity is by no means limited to functions of real variables. This property can be defined for functions from any set to any other, as soon as these sets have been provided with a topological structure.
To provide a set with a topology, one must define a family of subsets, called open subsets, satisfying a small number of rather simple axioms: the entire set and the empty
set must belong to the family, the intersection of a finite number of members of the family must belong to the family, as well as the union of any (possibly infinite) number of members. Any set can be provided with several more or less interesting topological structures. One of them is the discrete topology, for which the family of open subsets is constituted by all the subsets of the set. As a matter of fact, this topology is not very helpful because every function defined on a discrete space is continuous!
Whenever a distance is provided on a space, a corresponding 'natural' topology can be derived for which continuous functions preserve closeness. But this topology can be the discrete one, as is the case, for instance, for the set of integers with the standard distance. In fact, the distance yields nothing more interesting than the discrete topology on any set where the distance between any two distinct points is greater than a constant value.
2.3. Continuum
The third definition to be introduced is the notion of a continuum, sometimes called a continuous space. A continuum is a non-discrete space, but the converse is not true. For instance, the set of rational numbers with the standard distance is neither a discrete space nor a continuum. Though one can find another rational number as close as one wants to any given one, there are nevertheless "holes" in this set. In a sense, filling these holes is equivalent to the axiomatic construction of the set of real numbers, which is a continuum. Many simple geometrical properties expected from continuous spaces depend crucially on topological properties of real numbers. To give only one example, given any closed curve in the plane, one expects that a line joining a point inside the curve to a point outside must intersect the curve: such a property would not be true if one considers only points with rational coordinates in the plane.
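For reference, the notions of Sections 2.1 to 2.3 can be written out in standard textbook form. The notation ($E$, $F$, $\mathcal{T}$, $\mathcal{P}(E)$, $B(n, r)$ for the open ball of radius $r$ around $n$) is chosen for this note and does not appear in the original text.

```latex
% 2.1  Continuity of f : R -> R at a point x (open-interval form)
\[
  \forall J \ni f(x)\ \text{open interval},\quad
  \exists I \ni x\ \text{open interval}\ :\quad f(I) \subseteq J .
\]

% 2.2  A topology on a set E is a family T of subsets of E such that
\[
  \emptyset \in \mathcal{T},\quad E \in \mathcal{T},\qquad
  U_1,\dots,U_n \in \mathcal{T} \Rightarrow U_1 \cap \dots \cap U_n \in \mathcal{T},\qquad
  (U_i)_{i \in A} \subseteq \mathcal{T} \Rightarrow \bigcup_{i \in A} U_i \in \mathcal{T}.
\]

% Continuity between topological spaces
\[
  f : (E,\mathcal{T}_E) \to (F,\mathcal{T}_F)\ \text{is continuous}
  \iff \forall V \in \mathcal{T}_F,\quad f^{-1}(V) \in \mathcal{T}_E .
\]

% Discrete topology: every subset is open, so every preimage is open
\[
  \mathcal{T}_E = \mathcal{P}(E)
  \ \Longrightarrow\ f^{-1}(V) \in \mathcal{T}_E\ \text{for every } V \subseteq F .
\]

% Integers with d(m,n) = |m - n|: singletons are open balls
\[
  B\!\left(n, \tfrac{1}{2}\right) = \{\, m \in \mathbb{Z} : |m - n| < \tfrac{1}{2} \,\} = \{n\} .
\]

% 2.3  A "hole" in the rationals: a bounded set with no least upper bound in Q
\[
  S = \{\, q \in \mathbb{Q} : q^2 < 2 \,\},\qquad
  \sup S = \sqrt{2} \notin \mathbb{Q} .
\]
```

The fourth line is why the discrete topology is uninformative: the preimage of any set is a subset of $E$ and is therefore open, whatever the function. The fifth line shows why the integers fall under this case, since every singleton is an open ball and every subset is a union of singletons.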
Therefore, the whole complexity of real numbers is required to build an adequate framework for most mathematical models using continuity in a geometrical sense. In particular, dynamical systems are defined on so-called differential manifolds, which are generalizations of curves and surfaces, and the topological properties of these manifolds play a central role in the theory.
3. Modelling considerations
3.1. Qualitative modelling
Continuity is generally associated with quantitative modelling. We have just seen why one needs real numbers to get the advantages of the topological properties of a continuum. But this does not mean that the model must be quantitative. Once a continuum framework has been built up, one can use it to represent a phenomenon, and very often only its qualitative features are of interest. For instance, to
capture the notion of graduality, the best solution is to adopt a continuous representation where graduality can be differentiated from sudden jumps by means of continuity and discontinuities in functions, even if one knows that not every point in the space of parameters corresponds to an observable value. To have a discontinuity at one point of the space is a qualitative property of a function, and we do not need to specify the exact position of the point nor the exact value of the jump at this point to characterize the class of functions presenting this outstanding feature. On the contrary, if one adopts a discrete representation, the only way to distinguish this feature from graduality is some kind of threshold: one must define "small" jumps and "big" ones, since there is in any case a jump when passing from one point to the next one.
As shown by this last example, in a qualitative model, the focus is not on one particular numeric function but on a class of functions exhibiting a common behavior. Nevertheless, a qualitative model can be predictive. Many qualitative relationships between data can be tested to ascertain its validity. For instance, a model may imply that data must respect a given order relative to the importance of a gradual phenomenon, that some "jump" in its behavior must be observed during the combined variations of a set of parameters, and so on. Such predictions are as useful as quantitative ones to validate or invalidate a model.
3.2. Dynamical systems
As shown by our earlier discussion about linguistic issues, dynamical systems theory looks likely to play a central part in linguistic continuous models. It can be used to represent the meaning of units, and also to model the interactions of units in a sentence. In both cases, qualitative modelling is needed. Actually, dynamical systems theory lends itself remarkably well to qualitative modelling. One can define classes of equivalent systems, characterized by a similar behavior, including in the same class systems on spaces of different dimensionality. This last point is very important in linguistics, where the same unit is used for a great variety of domain-reference spaces. One important example is the notion of bifurcation, discussed in this volume by R. Thom, and used as a central concept in Culioli's works on determination.
Moreover, dynamical systems can easily deal with the full range of linguistic phenomena known as ambiguity, indetermination and vagueness. In our work on polysemy (Fuchs & Victorri 1988; Victorri & Fuchs 1992), we built a mathematical model in which the precise meaning of a polysemous unit in any given sentence is represented by a dynamical system on a semantic space. The dynamics is parametrized by the other units present in the sentence. Each stable state (i.e. point attractor) of the dynamics corresponds to a possible semantic value of the polysemous unit, so that the number of attractors and the form of the
basins of attraction characterize the meaning of the unit in the given sentence. Thus the presence of two (or more) attractors is related to the existence of an actual ambiguity. A large shallow basin of attraction represents an indetermination, whereas a deep narrow one represents a specific precise meaning. These different cases can be classified, and the semantic behavior of the unit can be defined in terms of the relation between these classes of dynamics and the parameters depending on the other units present in the sentence, which are responsible for the form of the dynamics. For instance, one can observe how small modifications of the sentence induce qualitative changes in the dynamics, such as the appearance or disappearance of an ambiguity as an element of the sentence is replaced by another.
3.3. General framework
If we try to outline the general framework emerging from the preceding considerations, we can bring out a few principles which constitute a common basis for continuous modelling in semantics.
Two representations can be associated with each linguistic element. The first one is a representation of its kernel of meaning, sometimes called its 'iconic' representation, which specifies the constant contribution of this unit to the meaning of any sentence containing it. The second one is what we called here its 'semantic space', whose dimensions reflect the degrees of freedom corresponding to the variable precise meanings this unit can convey in different sentences.
The interactions between units in a given sentence are then twofold. On the one hand, kernel representations interact, bringing out the full representation of the meaning of the whole sentence. On the other hand, each unit receives from the others a set of constraints which defines its behavior in its semantic space. In both cases dynamical systems theory seems to be the appropriate tool to compute these interactions. It is at least the right tool to model these interactions as a gestaltist process.
As it stands, this framework is not actually an effective semantic model. Our claim is that it defines a research program in which most linguistic studies using continuity as a main ingredient can be included.
4. Computer tools
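As a small computational aside on the attractor picture of Section 3.2, the sketch below simulates a one-dimensional gradient system whose number of point attractors changes with its parameters. It is a toy under stated assumptions, not the polysemy model cited above: the potential $V(x) = x^4/4 - a x^2/2 - b x$, the parameter names `a` and `b` (standing loosely for the influence of the other units in the sentence), and the functions `flow` and `attractors` are all hypothetical choices made for illustration.

```python
# Toy gradient dynamics dx/dt = -V'(x) with V(x) = x**4/4 - a*x**2/2 - b*x.
# The potential and its parameters are assumptions made for this sketch.


def flow(x, a, b, dt=0.01, steps=5000):
    """Follow the dynamics from x with explicit Euler steps."""
    for _ in range(steps):
        x += dt * (-x ** 3 + a * x + b)
    return x


def attractors(a, b, tol=1e-3):
    """Collect the distinct end states reached from a grid of start points."""
    # Offset grid on [-1.95, 1.95]; it avoids x = 0, which is an
    # unstable equilibrium when a > 0 and b = 0.
    starts = [(i + 0.5) / 10 for i in range(-20, 20)]
    found = []
    for x0 in starts:
        end = flow(x0, a, b)
        if all(abs(end - known) > tol for known in found):
            found.append(end)
    return sorted(found)


ambiguous = attractors(a=1.0, b=0.0)     # two stable states
biased = attractors(a=1.0, b=1.0)        # one stable state
single_well = attractors(a=-1.0, b=0.0)  # one stable state
print(len(ambiguous), len(biased), len(single_well))
```

Only the count of stable states matters here, in keeping with the qualitative reading of the model: two attractors stand for an ambiguity, and changing a parameter makes one of them disappear. The exact numeric values of `a` and `b` carry no linguistic meaning.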
4.1. Continuity on a digital computer
If we turn now towards computer implementation, representing continuity on a machine is anything but a simple problem. From a rigorous mathematical point of view, continuity cannot be reached with a digital computer: even the so-called 'real' variables take their values, whatever the precision, in a discrete finite set of numbers. So, the best we can do is to approximate continuous functions by discrete gradual operations, and the subtle distinctions we made at the mathematical level no longer apply in this context. Nevertheless, in many domains, and especially in physics, computer simulations are used to study continuous mathematical models, and machine precision is sufficient to obtain reliable results. So it cannot be argued that a digital computer is not suitable for representing continuity.
But the problem is elsewhere. In domains like physics, the focus is on quantitative simulations, and computers are used for what they do best: numeric computations. In our case, we are most interested in qualitative behavior, and choosing numeric values is most of the time an irrelevant burden. As we have shown, quantification is the opposite of what is needed in the mathematical apparatus related to continuity. If computer implementation of a qualitative continuous mathematical model imposes arbitrary numeric coding, it will be devoid of interest.
4.2. Connectionism
Connectionist networks seem to provide an elegant solution to this problem. They are essentially numeric, of course, since the relations between units are defined by a 'real' number, i.e. the weight of their link. But these weights are not to be arbitrarily chosen by the designer of the system. They are adjusted as the result of a learning process. In concrete terms, what is needed is the encoding of a sample of input data and of the expected corresponding output results. Then the learning algorithm automatically computes weight values so that the system gives a correct response when presented with data close to the learned sample.
The simplest example is given by the so-called 'feed-forward' networks. They consist of an ordered set of layers of units, each unit of one layer being connected to all units of the next layer. The first and the last layers are respectively the input and output layers.
Mathematically speaking, these two layers implement two spaces, and the learning process is equivalent to computing the most regular function from the input space to the output space that satisfies the constraints given by the learning sample. To use such a system, one has to design the two basic spaces by specifying their dimensions and the rules for encoding data and results onto these spaces. Strictly speaking, encoding operations are also numeric, but here the situation is different: each unit must correspond to a linguistic criterion which is part of the model, and the choice of coding values, which can be limited to two or three values, is not arbitrary from a linguistic point of view.

The connectionist approach is often opposed to the symbolic approach, but in fact there is a symbolic aspect in any connectionist network, precisely because input and output units are inevitably given a symbolic meaning, even in so-called 'distributed representations'; otherwise the system could not be of any
use for modelling. The non-symbolic aspect of connectionist networks is limited to what happens strictly inside the network, in the correspondence between input and output, where the learning process takes place.

One of the most interesting classes of connectionist networks is the family of 'recurrent' networks. As opposed to feed-forward networks, they allow bi-directional links between units, and so they are direct implementations of dynamical systems, which we argued to be of cardinal importance in continuous semantic models. With these systems, one can capture the notions of attractors, bifurcations, and so on. As an example, we used a recurrent architecture to implement our model of polysemy, and it enabled us to differentiate phenomena of ambiguity, indetermination, ... by the form of the basins of attraction of the dynamics created inside the recurrent network associated with the polysemous unit in different sentences (Victorri et al. 1989; Gosselin et al. 1990).

4.3. Current limits of connectionism

So connectionism is already an essential tool for implementing continuous models. But it has a drawback that prevents it from playing in continuous modelling the same role that classical artificial intelligence tools play in discrete modelling. This flaw is related to an important notion developed in artificial intelligence: the notion of control. A connectionist network remains a "black box" which does not allow much reasoning about its functioning. Our experience with polysemy modelling showed us how frustrating it was to work with such a system. Even when it gave us roughly satisfactory results, we could not use it to answer the most important questions which motivated the implementation of the model: what were the decisive factors that explained the good performance? In which direction might we modify the system to improve it? How could we characterize the class of systems giving acceptable results?
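The notions of attractor and basin invoked in section 4.2 can be made concrete with a toy recurrent network. The sketch below is an assumption-laden illustration, not the architecture of Victorri et al. (1989): it uses a small Hopfield-style network with symmetric weights set by a Hebbian rule rather than by learning from a sample, and two arbitrary stored patterns of eight units. The names PATTERNS, WEIGHTS and settle are hypothetical.

```python
from collections import Counter
from itertools import product

# Two hypothetical stored patterns (+1/-1 unit states), chosen to be orthogonal.
PATTERNS = [
    (1, 1, 1, 1, -1, -1, -1, -1),
    (1, 1, -1, -1, 1, 1, -1, -1),
]
N = len(PATTERNS[0])

# Hebbian weights: symmetric (bi-directional links), no self-connection.
WEIGHTS = [[0 if i == j else sum(p[i] * p[j] for p in PATTERNS)
            for j in range(N)] for i in range(N)]


def settle(state):
    """Update units one at a time until none changes; return the fixed point."""
    state = list(state)
    changed = True
    while changed:
        changed = False
        for i in range(N):
            field = sum(WEIGHTS[i][j] * state[j] for j in range(N))
            # A unit aligns with the sign of its input; on a tie it stays put.
            new = state[i] if field == 0 else (1 if field > 0 else -1)
            if new != state[i]:
                state[i] = new
                changed = True
    return tuple(state)


# A state one unit away from the first pattern falls back into its attractor.
noisy = (-1,) + PATTERNS[0][1:]
recalled = settle(noisy)
print(recalled == PATTERNS[0])

# Basin sizes: how many of the 2**N initial states end in each attractor.
basins = Counter(settle(s) for s in product((-1, 1), repeat=N))
for attractor, size in basins.most_common():
    print(attractor, size)
```

Even in so small a system, the sizes of the basins are obtained here only by exhaustively probing the network from the outside, which is feasible for eight units and hopeless for a realistic model. This is one way of seeing why a qualitative theory of the internal dynamics is needed.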
To answer these questions, one must "open the black box", i.e. study the relation between the performance of the network and its internal configuration. Theoretical work is required to classify networks in terms of qualitative behavior. This direction of research constitutes a major challenge for connectionism. This work has already started, as shown for instance by the important theoretical results obtained by D. Amit on one family of recurrent systems, the Hopfield networks. The usefulness of continuous model implementations crucially depends upon progress in this area.

Conclusion

Continuous models look likely to play a more and more important role in current research in semantics. In this paper, we tried to delimit to what extent mathematical and computer tools are suited to this task. Clearly, many efforts
are still to be made in this field before continuous modelling becomes a real alternative to discrete formalisation. But an original theoretical framework is already emerging: it can shed new light on many well-known linguistic issues which cannot be accounted for by discrete modelling.
REFERENCES

Culioli, A. 1990. Pour une linguistique de l'énonciation. Opérations et représentations, t. 1. Paris: Ophrys.
Fuchs, C. and B. Victorri. 1988. Vers un traitement automatique de la polysémie grammaticale. T.A. Informations 29, Paris.
Fuchs, C., L. Gosselin and B. Victorri. 1991. Polysémie, glissements de sens et calcul des types de procès. Travaux de linguistique et de philologie XXIX:1, Strasbourg, pp. 137-169.
Gosselin, L., A. Konfé and J.P. Raysz. 1990. Un analyseur sémantique automatique d'adverbes aspectuels du français. Actes du 4è colloque de l'A.R.C, Paris.
Lakoff, G. 1987. Women, fire and dangerous things: what categories reveal about the mind. University of Chicago Press.
Langacker, R.W. 1987. Foundations of cognitive grammar. Stanford University Press.
Talmy, L. 1988. Force dynamics in language and cognition. Cognitive Science 9:1.
Victorri, B. and C. Fuchs. 1992. Construction de l'espace sémantique associé à un marqueur grammatical polysémique. Lingvisticae Investigationes 16:1.
Victorri, B. 1992. Un modèle opératoire de la construction dynamique de la signification. In La théorie d'Antoine Culioli. Ouvertures et incidences, Paris: Ophrys, pp. 185-201.
Victorri, B., J.P. Raysz and A. Konfé. 1989. Un modèle connexionniste de la polysémie. Actes de Neuro-Nîmes 89, EC2.
INDEX
ambiguity, ambiguous 4, 47, 101, 112, 195, 198, 199, 201, 202, 246, 247, 249
analyzability 18, 243
artificial intelligence 3, 111, 145, 172, 202, 235, 249
attractor 132-134, 160, 167-169, 171, 184, 244, 246-247, 249
bifurcation 159-160, 169, 184, 246, 249
boundary 10, 22, 49, 57, 96-98, 115, 138, 145, 156, 175, 224, 244
catastrophe theory 4, 156, 184
category, categorization 3, 13, 49, 59, 67, 86, 98, 103, 243
cognition, cognitive 3, 9-18, 58, 95, 104, 119, 127-151, 164, 170-175, 205, 222
compositionality, compositional 3, 18, 102, 167, 189, 209, 242
connectionism, connectionist 4, 119, 167, 189, 206, 216, 231, 237, 248
consistency 64, 122, 201
context, contextual 18, 37, 64, 74, 85, 90, 99, 101, 129, 161, 189, 197, 200, 205-216, 219, 233
contiguity 23, 27, 30, 99
corpus 57-61, 73, 101, 211, 216-221
cusp 136, 156
cut locus 175-177
deformability 23
degree 9, 16, 18, 47, 57, 60, 82, 85, 94, 97, 113, 119, 136, 209, 216, 219, 233, 241, 243
differential geometry, differential equation 4, 119, 127, 135, 176
discontinuity, discontinuous 5, 10, 23, 78, 97, 101, 120, 162, 177, 244
discrete, discreteness 3, 9-18, 33, 50, 54, 73, 93, 97, 99-101, 104, 111-124, 127, 134, 151, 156, 170, 190, 193, 198, 234, 235, 244, 247
distance 118, 142, 195, 242, 244
distortion 28, 99, 104
distributed 171, 206-208, 216, 217, 236
dynamics, dynamic, dynamical, dynamistic 3, 34, 37, 54, 83, 87, 97, 100, 131, 149, 157, 167, 172, 176, 193, 196, 236, 244, 246
filtering process 25, 46
frequency 62, 206, 211, 215, 222
fuzziness, fuzzy 10, 61, 72, 97, 120
gap 9, 25, 29, 34, 40, 155, 162
generality 94, 193, 223
geometry, geometrical 4, 98, 127, 136, 140-150, 168, 174, 244
Gestalt, gestaltist, gestaltic 3, 38, 103, 174, 217, 243
gradation, graduality, gradual, gradually 4, 17, 35, 37, 41, 54, 86, 96, 97, 98, 99, 101, 162, 241, 242, 246, 248
gradience, gradient 41, 57, 58, 59, 60, 61, 71, 72, 73, 74, 97
grain 189, 208, 234, 237
holism, holistic 35, 174, 184
homonymy, homonym, homonymic 48, 51, 54, 79, 83, 85, 99, 100, 190, 192
iconicity, iconic 39, 41, 164, 174, 184, 247
indeterminacy, indeterminate 10, 58, 95
jump 9-11, 244, 246
kinetism 83-85, 91, 93, 97, 101
lexical access 207, 215, 222
limit 22, 31
localist 207, 208, 231, 234, 237
logics, logic, logical 50, 57, 93, 94, 114, 116, 118, 121, 122, 173, 189, 198, 243
memory 132, 155, 171, 202, 206, 215, 237
morphodynamics 167-187
network 130-138, 167, 170, 171, 177, 178, 190, 192, 206, 216-219, 222, 231, 234, 236-238, 248-249
non-monotonic 121, 196
perception, perceptive 3, 94, 112-113, 128-130, 135, 149, 155, 160, 174, 184
polysemy, polyseme, polysemous 10, 77-92, 95, 96, 99-103, 150, 189-193, 196, 198, 201, 202, 205-229, 231-239, 242, 246, 249
probability 66-73, 94, 97
prototype, prototypicality, prototypical 10, 11, 39, 57, 86, 103, 136, 160, 161, 165
proximity 144, 208, 242
psychology, psychological, psychologically 3, 12, 14, 42, 57, 86, 89, 94, 111, 112, 133, 140, 171, 206, 222
qualitative 5, 50, 114, 119, 120, 121, 133, 136, 143, 161, 162, 243, 245-246, 248, 249
quantification, quantitative 5, 50, 59, 74, 95, 102, 161, 243, 245, 246, 248
salience, saliency, salient 11, 12, 13, 14, 17, 18, 54, 103, 156, 164, 197, 199, 200, 201, 210, 217, 218, 233
scale 11, 12, 15, 34, 57, 60, 67, 68, 72, 95, 97, 120, 210, 214
schema, schematicity 14, 142, 163, 174, 231
semantic potential 198-199
semantic space 30, 101, 115, 159, 208, 234, 246-247
separability, separation 27, 156
singularity 156, 158, 182
statistics, statistical 5, 57-76, 159, 206, 216, 224, 236
symbolic 3, 117-118, 122, 123, 124, 128-130, 167-174, 189, 191, 248
synonymy, synonym, synonymous 59, 74, 83, 190-191
topology, topological 22, 93, 96, 98, 136, 142-145, 148, 156, 159, 162, 172, 174-176, 198, 243-245
trajector 16, 217
transition, transitional 25, 28-30, 38-39, 41, 45-46, 50, 80, 82-84, 87, 96, 100, 102, 129, 133, 156, 158-159
typicality, typical 15, 47, 53, 57, 59, 86, 94, 103, 210, 214, 219, 222, 223, 241-242
undecidability, undecidable 97, 190
vagueness, vague 10, 13, 58, 85, 95, 96, 120-121, 201, 246
variable depth 194-196
vision, visual 15, 58, 162, 176, 184, 207-208, 225