Lexical-Semantic Relations
LINGVISTICÆ INVESTIGATIONES: SUPPLEMENTA
Studies in French & General Linguistics / Études en Linguistique Française et Générale This series has been established as a companion series to the periodical “LINGVISTICÆ INVESTIGATIONES”, which started publication in 1977.
Series Editors: Éric Laporte (Université Paris-Est Marne-la-Vallée & CNRS) Annibale Elia (Università di Salerno) Gaston Gross (Université Paris-Nord & CNRS) Elisabete Ranchhod (Universidade de Lisboa)
Volume 28 Petra Storjohann (ed.) Lexical-Semantic Relations. Theoretical and practical perspectives
Lexical-Semantic Relations Theoretical and practical perspectives Edited by
Petra Storjohann Institut für Deutsche Sprache, Mannheim
JOHN BENJAMINS PUBLISHING COMPANY AMSTERDAM/PHILADELPHIA
The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences — Permanence of Paper for Printed Library Materials, ANSI Z39.48-1984.
Library of Congress Cataloging-in-Publication Data

Lexical-semantic relations : theoretical and practical perspectives / edited by Petra Storjohann.
p. cm. -- (Linguisticae investigationes. Supplementa, ISSN 0165-7569; v. 28)
Includes bibliographical references and index.
1. Semantics. 2. Lexicology. 3. Relational grammar. I. Storjohann, Petra, 1972-
P325.L482 2010
401'.43--dc22
2010009954

ISBN 978 90 272 3138 3 (Hb: alk. paper)
ISBN 978 90 272 8816 5 (Eb)

© 2010 – John Benjamins B.V. No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher.
John Benjamins Publishing Co. • P.O. Box 36224 • 1020 ME Amsterdam • The Netherlands
John Benjamins North America • P.O. Box 27519 • Philadelphia PA 19118-0519 • USA
Table of contents

Preface vii
Introduction (Petra Storjohann) 1
Lexico-semantic relations in theory and practice (Petra Storjohann) 5
Swedish opposites: A multi-method approach to ‘goodness of antonymy’ (Caroline Willners and Carita Paradis) 15
Using web data to explore lexico-semantic relations (Steven Jones) 49
Synonyms in corpus texts: Conceptualisation and construction (Petra Storjohann) 69
Antonymy relations: Typical and atypical cases from the domain of speech act verbs (Kristel Proost) 95
An empiricist’s view of the ontology of lexical-semantic relations (Cyril Belica, Holger Keibel, Marc Kupietz and Rainer Perkuhn) 115
The consistency of sense-related items in dictionaries: Current status, proposals for modelling and applications in lexicographic practice (Carolin Müller-Spitzer) 145
Lexical-semantic and conceptual relations in GermaNet (Claudia Kunze and Lothar Lemnitzer) 163
Index 185
Preface
The availability of corpus-guided methods and the emergence of new semantic models, particularly cognitive and psycholinguistic frameworks, have prompted linguists to develop a range of immensely fruitful new approaches to sense relations. Not only does the field of sense relations have immediate relevance for the study of paradigmatic structures in lexicology, it is also a much discussed field for a variety of other application-oriented areas such as lexicography, Natural Language Processing and database engineering of lexical-semantic webs.

It was in this context that the Institut für Deutsche Sprache in Mannheim (Germany) held an international colloquium from 5th–6th June 2008 on the subject of “Lexical-Semantic Relations from Theoretical and Practical Perspectives”. This event brought together researchers with an interest in semantic theory and experts with a more practical, application-based background looking at different languages. The papers in this volume derive from the colloquium and address specific semantic, lexicographic, computational and technological approaches to a range of meaning relations, particularly those which have traditionally been classified as “paradigmatic” sense relations, as well as exploring the construction, representation, retrieval and documentation of relations of contrast and meaning equivalence in a variety of languages including German, English and Swedish.

This book provides specialists from different disciplines and areas with the opportunity to gain an insight into current cross-linguistic research in semantics, corpus and computational linguistics, lexicology, applied teaching and learning, and lexical typology as well as technological applications such as computational lexical-semantic wordnets.
The overall aim of this book is to make up for some of the shortcomings of more traditional and often non-empirical studies, by providing an overview of current theoretical perspectives on lexical-semantic relations and presenting recent application-oriented research. Above all, its aim is to stimulate dialogue and revive discussion on sense relations in general, a subject which requires reappraisal in the light of recent semantic theories and which merits application via contemporary linguistic/lexicographic methods and procedures.
I am grateful to all the authors who have contributed to this book and who have clearly demonstrated the tremendous scope of the field and the importance of current trends in the study of paradigmatic structures. I also wish to thank some colleagues and friends for their criticisms, their help and support. My gratitude also goes to that handful of very special people for their understanding, their sense and sensitivity when making this book. Beyond these, a sincere thanks also goes out to the Institut für Deutsche Sprache Mannheim for hosting the colloquium, thereby enabling semanticists, lexicographers, experts in Natural Language Processing and computational linguists to share their common interest in lexical-semantic relations.
Petra Storjohann (Institut für Deutsche Sprache, Mannheim)
Introduction Petra Storjohann
This collective volume focuses on what have traditionally been termed the “paradigmatics” or “sense relations” of a lexical unit. These include relations of contrast and opposition, meaning equivalence, hyponymy, hyperonymy etc., all of which have captured the interest of researchers from a range of disciplines. In the existing literature, studies on sense relations often just cover one specific phenomenon of one specific language, stressing specific semantic or methodological aspects. The present book covers different languages and different paradigmatic phenomena. It outlines the full complexity of the subject, combining linguistic and methodological elucidations with discussions on current practical and application-oriented research.

The papers in this volume which are concerned with lexicological questions examine a range of linguistic models and semantic modelling, and the use of data such as corpora or the Internet as a lexical resource for information retrieval. Various authors demonstrate that research on language and cognition calls for evidence from different sources. They explain the nature of different lexical resources and the working methods associated with them, and they suggest some theoretical implications of a larger semantic model. The lexicological papers have as a common theme contextualised and dynamically constructed structures and look at the phenomenon as it is governed by conventional, contextual and cognitive constraints which favour specific choices. The semantic approaches used here concentrate on questions of mental representation, linguistic conventionalisation, cognitive processes and ideas of constructions, and they respond to the opportunities presented by methodologies from psycholinguistics or corpus studies.

Moreover, the book explores recent developments in building large lexical resources as well as some lexicographic and text-technological aspects.
These include for example elucidations on the structure and possible applications of reference databases such as GermaNet in natural language processing and computational linguistics. This database has been constructed in the style of its English counterpart WordNet and it is an integral part of EuroWordNet. This volume is also concerned with sense-related items in dictionaries. Recent insights into the paradigmatics of a word from a lexical-semantic point of view make a compelling case for dictionary makers to include more appropriate and innovative descriptions of sense-related items in reference works. Additionally, new technical standards and text-technological facilities are still largely being ignored by lexicographers although they offer opportunities to enhance dictionaries and to make them more consistent. These too are some of the concerns of this book.

Together, these papers not only give an impression of the scope of the field by looking at lexico-semantic relations from a range of positions and for different purposes, but they also demonstrate how cross-linguistic examinations benefit each other, how research areas fertilise and complement each other and how results in one field have an impact on research in other disciplines, each enriching the insights and developments of the others.

In “Lexico-semantic relations in theory and practice” Petra Storjohann provides an overview of the subject of sense relations in different linguistic fields such as lexicology, corpus studies, lexicography and computational linguistics. The paper focuses particularly on the shift of approaches and methodologies to lexico-semantic relations in lexical semantics and concentrates on perspectives taken in German and in English linguistics. The paper is intended as a general discussion of the subject and it reveals some of the open questions and the challenges that need to be addressed in the future.

Both psycholinguistic and corpus-linguistic approaches are taken in the paper “Swedish opposites: A multi-method approach to goodness of antonymy” by Caroline Willners and Carita Paradis, in which the nature of English and Swedish antonymy and the degree of conventionalisation of antonymic word pairs in language and in memory are examined.
Methodologically, their analyses are conducted on the basis of data from textual co-occurrence, and from judgement and elicitation experiments. Both types of evidence are used as a means of substantiating semantic theories. The paper not only examines differences in applying various methods, but also addresses the meanings of conventionalised canonical antonym pairings, including issues such as dimensional clarity, symmetry and contextual range. In terms of theoretical implications, it is argued that opposite meaning is construed, and the authors show this with the help of both highly conventionalised and less conventionalised binary opposites.

Whether the Internet can be used as a valid corpus source for linguistic analyses is a question addressed by Steven Jones in “Using web data to explore lexico-semantic relations”. Taking English antonymy as an example, he explores how the web can be used as a lexical resource to reveal and quantify relational structures, and raises the question of whether prior semantic statements on canonicity can be based on such a methodology. He starts out from the assumption that specific
theory and semantic models, they also propose an empirically-driven methodology specifically for the explorative examination of lexical-semantic relations, and they support the development of methods for empirical work with corpora and the detection of language structures in comprehensive linguistic data in more general terms.

An interesting insight into the problems of inconsistent lexicographic information is provided in Carolin Müller-Spitzer’s paper “The consistency of sense-related items in dictionaries: Current status, proposals for modelling and applications in lexicographic practice”. The paper reveals how inconsistent reference structures (e.g. ‘non-reversed reference’) are in German dictionaries of synonymy and antonymy. Problems such as bidirectional links being realised as unidirectional references pose challenges for lexicographers as well as for dictionary users. Although computational procedures are available to solve the problem, these have not been implemented so far. As Müller-Spitzer argues, a coherent lexicographic XML-based modelling architecture is a prerequisite for an effective data structure. With the help of elexiko, a specific corpus-based, electronic reference work, she illustrates how text-technological methods can provide support for the overall consistency of sense-related pairings during the process of compiling a dictionary. Her discussion also outlines the technical requirements for achieving consistency in data management and data linking.

In their article “Lexical-semantic and conceptual relations in GermaNet”, Claudia Kunze and Lothar Lemnitzer discuss the relevance of some lexical as well as conceptual relations for a lexical resource of German. An overview of how these relations have been integrated in the construction and maintenance of GermaNet is given, and recent developments and their repercussions in terms of theoretical and application-oriented research are discussed.
Other practical perspectives on the subject of meaning relations include the possible innovative and beneficial applications of GermaNet.

This book undoubtedly remains a comparatively brief account of a complex field, and a number of issues are not even touched upon. Nevertheless, the intention is to stimulate further discussion and promote closer collaboration between the different fields. This collective volume attempts to show what research on lexical-semantic relations has to offer and to demonstrate, as Alan Cruse (2004: 141) asserts, “that sense relations are a worthwhile object of study”. As such, it is an invitation to scholars whose interests involve words, meaning, the mind and/or language technology, who share an interest in lexical-semantic relations, and who approach the subject from various positions in philosophy, psychology, neuroscience, linguistics, lexicography, computer science, early childhood language acquisition and second language education, to name but a few.
Lexico-semantic relations in theory and practice Petra Storjohann
This paper provides a general overview of the treatment of lexico-semantic relations in different fields of research including theoretical and application-oriented disciplines. At the same time, it sketches the development of the descriptions and explanations of sense relations in various approaches as well as some methodologies which have been used to retrieve and analyse paradigmatic patterns.
1. Lexicology: From structural to cognitive approaches

1.1 Structuralist approaches
From a lexicological point of view, the subject of sense relations has long been closely linked with several traditions of structural semantics and lexical field analysis, particularly within German linguistics. For decades, the theory of lexical field analysis was a very popular area of research, reaching its peak in the 1970s and 80s. Hence, it is automatically associated with the classical notion of the study of a language system, with atomised and isolated approaches, and the semantics of lexemes in terms of distinctive features. The emphasis is simultaneously on fixed and inherent semantic properties, componential meaning analysis and the idea that meaning can be neatly decomposed and described. The view was held that language is an “externalized object” (Paradis 2009) with clearly recognisable structures. Sense relations were of particular interest since the basic assumption was that lexical meaning is constituted by the relations a lexeme holds with other lexemes in the same lexical-semantic paradigm. Structuralists not only treated language as a system but also referred to lexical relations in terms of paradigmatic and syntagmatic structures, implying strict distinctions between them.
For an overview of the development of lexical field theory see Storjohann (2003: 25–40).
Formalist linguists sought to define the meaning of lexical items by decompositional approaches, which worked well for modelling structural aspects such as phonology or syntax. But classical decompositional theories suffered from a number of problems, above all the belief that vocabulary has a definitional structure with distinct boundaries that can be precisely delimited. The traditional conception of sense relations was that of semantic connections between words, and semantic interrelations among members of paradigmatic sets were viewed as stable and context-independent structures. Today, as a result, the phenomenon of sense relations is stigmatised as being too closely linked to traditional or old-fashioned models.

In German linguistics in particular, where research on lexical-semantic structures once flourished, the chapter on sense-related lexical terms was essentially closed by the works of Lutzeier (1981, 1985). His studies not only offered systematic examinations of lexical fields and their sense relations but also made use of a stringent terminology and introduced the notion of contextual restrictions by bringing in key elements such as verbal context, syntactic category and semantic aspects. His later work in particular pointed out the discrepancy between structuralist descriptions and textual structures in language use and prepared the ground for more empirical research on lexico-semantic relations in actual discourse. Nonetheless, the perception that the subject is obsolete persists to the present day, and German semanticists have not contributed further to the general discussion on sense-related items in more recent contexts.

In contrast, the situation with regard to semantics was never as bleak in the case of English linguistics, where scholars around the world were not so keen to avoid the subject of sense relations after the decline of the structuralist period.
Lexical semantics in general has thrived in the UK, and its tradition is best exemplified by names such as John Lyons and Alan Cruse, both of whom have developed exhaustive definitions and descriptions of semantic relations. For them, the study of sense relations was central to the study of meaning. At the same time, the Firthian tradition developed, which concentrated on syntagmatic relations. Collocations became the key notion and later the centre of attention for corpus linguists. Generally, more contextualised approaches to sense relations emerged at that time, with Cruse’s (1986) approach as a central piece of work in the tradition of British Contextualism. Consequently, it is studies of the English language which have succeeded in further advancing theories about lexico-semantic relations.
Cf. Lyons (1968, 1977) and Cruse (1986).
“You shall know the meaning of a word by the company it keeps” (Firth 1957: 179).
1.2 Cognitive approaches
As the notion of the lexicon started to be of interest to approaches to syntax which left behind the division between grammar and lexis, the nature of lexical semantics and the notion of the mental lexicon changed. New methodologies were introduced which looked at language from a usage-based perspective. However, corpus linguistics has largely focused its efforts on collocations and co-occurrences, and although linguistic theories have progressed, particularly in the area of cognitive linguistics, most semantic research has centred around issues such as polysemy and metaphor. And although the cognitive strand generally has had a major impact on lexical studies (cf. Geeraerts/Cuyckens 2007), the study of sense relations has not been a central component in the new semantic paradigm, and, as Cruse (2004: 141) concludes, “cognitive linguists, for the most part, have had very little to say on the topic”. Throughout his later work, Cruse himself has been concerned with bringing the cognitive aspect into his theory of meaning (Cruse 1992; Cruse/Togia 1995; Croft/Cruse 2004), unfortunately without incorporating new methodological approaches to substantiate his ideas.

New guiding principles, assumptions and foundational hypotheses have become points of departure for semantic research in general, and they have gradually been transferred to the understanding of how sense relations are established in text and discourse. These concern how meaning is constructed. According to the cognitive school, meaning construction is equated with knowledge representation, categorisation and conceptualisation. Meaning is a process, it is dynamic, and it draws upon encyclopaedic knowledge. The subject of sense relations has thus started to be re-examined from a cognitive point of view, and we now have a different understanding of how semantic relations are mentally represented and linguistically expressed, notions that are owed to the proliferation of research in the field of cognitive linguistics.
Today, a number of linguists, mostly outside German linguistics, with a particular interest in lexical-semantic relations, have reopened the chapter on sense relations, offering new perspectives, employing new methodologies and using empirical evidence for their work. In particular, the Group for Comparative Lexicology has sought to advance theories around English and Swedish lexical relations. They have succeeded in showing how sense relations materialise in text and discourse. The question of whether sense relations are lexical relations, or rather conceptual-semantic relations, or relations among contextual construals, has been addressed. As a result, classical notions of the paradigmatics of a lexical item have largely been abandoned. Recent semantic theories now account for lexical-semantic relations and are capable of accommodating all kinds of relations “ranging from highly conventionalized lexico-semantic couplings to strongly contextually motivated pairings” (Paradis forthcoming).

Steven Jones (e.g. 2002), Lynne Murphy (e.g. 2003, 2006), Carita Paradis (e.g. 2005) and Caroline Willners (e.g. 2001) are particularly concerned with the study of English and Swedish opposites.
1.3 Corpus material and language in use
As Paradis (forthcoming) notes, it is methodologies which have radically changed studies on meaning and semantic relations. The basis of investigations is now determined by corpus procedures, by different observational and experimental techniques and by computational facilities, and these contribute profitably to insights into the nature of the paradigmatics. A particularly promising trend within the new linguistic context is the fact that recent theories have also brought lexical semantics, and thus the subject of lexical-semantic relations, much closer to language in use and thought.

Through the use of corpora, for example, we gain a different notion of language as it emerges from language use. The central function of language as a means of natural communication and its role in social interaction are no longer ignored. Conclusions are drawn not on the basis of intuitive judgement, but from real data and on the basis of mass data which account for recurrence, variability and the distribution of patterns.

Generally, semanticists from various schools of thought have for a long time proved immune to corpus methods, and it is only recently that some researchers have made a compelling case for incorporating methods of corpus linguistics into semantics. This is all the more astonishing since both cognitive linguists and corpus linguists share an interest in contextualised, dynamically constructed meaning and in the grounding of language use in cognitive and social-interactional processes.

Language in natural communicative situations involving speakers and addressees has come to occupy the seat of honour in cognitive linguistic research and the combination of the theoretical and empirical developments has sparked new interest in research on lexico-semantic relations and their functions in language and thought. (Paradis forthcoming)
In terms of empirical corpus studies, it is however predominantly the subject of English opposites that has attracted interest from a corpus-based perspective (e.g. Jones 2002; Murphy 2006), demonstrating how corpus evidence can be used to derive semantic models. Until now, corpus-oriented studies of sense relations have been rather few and far between. However, systematic corpus-guided investigations have shown that corpus methodologies have contributed greatly to the study of lexical-semantic paradigms, and yielded new insights into issues such as how these relational patterns behave and function in discourse.
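Corpus-based studies of opposites of this kind typically retrieve co-occurring antonym candidates by searching large corpora for contrastive lexico-syntactic frames. The following is only a minimal sketch of that idea: the toy corpus and the small frame inventory are invented for illustration and do not reproduce the frame sets actually used in the studies cited above.

```python
import re

# Toy corpus; a real study would search millions of words.
corpus = (
    "The difference between hot and cold water matters. "
    "Choose quality rather than quantity. "
    "Both rich and poor benefit from the reform."
)

# A few contrastive frames of the kind used in corpus work on antonymy
# (illustrative only, not an attested frame inventory).
frames = [
    r"between (\w+) and (\w+)",
    r"(\w+) rather than (\w+)",
    r"both (\w+) and (\w+)",
]

# Collect every word pair occurring in one of the frames
pairs = []
for frame in frames:
    for m in re.finditer(frame, corpus, flags=re.IGNORECASE):
        pairs.append((m.group(1).lower(), m.group(2).lower()))

print(pairs)  # → [('hot', 'cold'), ('quality', 'quantity'), ('rich', 'poor')]
```

On real data, pairs recurring across many frames and many texts would then be taken as candidates for conventionalised, canonical opposites.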
2. Lexicography
The field where findings on semantic relations demand to be accounted for, and where they are of potential utility, is lexicography. Sense relations are documented in dictionaries of synonymy and antonymy or in onomasiological reference books such as a thesaurus. There is a striking clash between the findings of theoretical semantic research on the one hand, and the commercial and practical missions of dictionaries on the other. Dictionary entries provide lists of sense-related items which are treated as stable relations between words, often not even assigned to a specific sense. And however inappropriate and inconsistent the representations of the facts about a word and its relations might be, it seems impossible to make a reference book radically different. The pressure on a dictionary is to present definite answers and clear-cut definitions. Hence, sets of discrete synonyms or antonyms are often given for words without overlapping meanings. Although it is commonly agreed that the construction of lexico-semantic relations is flexible, lexicographers continue to offer only vague descriptions and struggle to present meaning, and hence sense relations, as context-dependent, variable and dynamic.

In addition, although corpora have been available for some time now, the exploration of mass data and the use of corpus tools for lexicographic analysis are restricted to corpus-based investigations, leaving a pool of linguistic evidence to be used for acts of verification only. Corpus-driven methodologies, however, where the corpus is approached without any prior assumptions and where collocation profiles reveal insights into the use of sense-related items, are largely ignored. As a result, as Alan Cruse comments:

No one is puzzled by the contents of a dictionary of synonymy, or by what lexicographers in standard dictionaries offer by way of synonyms, even though the great majority of these qualify neither as absolute nor as propositional synonyms. (Cruse 2004: 156)
An analysis of dictionary consultations by Harvey and Yuill in 1994 showed that in 10% of cases users were looking for meaning-equivalent terms. In over 36% of these situations, users were left without answers, or the information given was not satisfactory. Information on contextual conditions and situational usage was lacking. No other type of search showed the same degree of dissatisfaction. Users do have an intuition for the contextual restrictions of synonyms, and they need to know the precise circumstances in which a lexical item can be substituted by a similar item. Yet most dictionaries of synonymy do not go beyond providing information on style or regional restrictions, although there are good methods available for comparing collocational profiles. As Partington stresses:

Concordances of semantically similar lexical items can be studied, and students will inevitably discover differences in use which are not contained in grammars and dictionaries. (Partington 1998: 47)

For further differences between corpus-based and corpus-driven methodologies see Tognini-Bonelli (2001).
Unpublished research paper quoted in Partington (1998: 29).
In fact, lexicological insights about natural language, about synonyms and antonyms in use, should be reflected in lexicographic descriptions. And corpus methods may help to correct or supplement dictionary information to support users when deciding in what circumstances substitution is possible. Essentially, the new approaches and methodological opportunities to examine a word’s meaning and sense relations show that the lexicographer’s craft needs to be rethought.

Another problem we are still facing today is that of persistent inconsistencies in the documentation of synonyms and antonyms, even in computer- and corpus-assisted dictionaries. Although there are analyses of different dictionaries pointing out some of these inconsistencies (e.g. Storjohann 2006; Paradis/Willners 2007), they do not include ways of implementing methods and tools to equip a reference work with stable and consistent cross-referencing or, alternatively, with the bidirectional linking found in electronic resources. This is a field of research where computational linguistics needs to come up with answers.
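The comparison of collocational profiles mentioned above can be approximated very simply: count the words occurring near each member of a near-synonym pair and contrast the two distributions. The sketch below uses an invented toy corpus and a deliberately tiny context window; a real lexicographic analysis would draw on large corpora and statistical association measures rather than raw counts.

```python
from collections import Counter

# Toy sentences; invented purely for illustration.
sentences = [
    "a strong coffee and a strong argument",
    "a powerful engine and a powerful computer",
    "a strong argument needs strong evidence",
    "a powerful engine needs powerful brakes",
]

def collocates(word, window=1):
    """Count words occurring within `window` tokens of `word`."""
    counts = Counter()
    for s in sentences:
        tokens = s.split()
        for i, t in enumerate(tokens):
            if t == word:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                counts.update(w for w in tokens[lo:hi] if w != word)
    return counts

strong = collocates("strong")
powerful = collocates("powerful")

# Collocates that distinguish the two near-synonyms
print(sorted(set(strong) - set(powerful)))    # → ['argument', 'coffee', 'evidence']
print(sorted(set(powerful) - set(strong)))    # → ['brakes', 'computer', 'engine']
```

Even this crude contrast shows the kind of substitutability information (strong coffee, but powerful engine) that users look for and that synonym dictionaries rarely record.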
3. Computational linguistics – lexical databases and corpus tools
One area where the subject of sense relations has a more encouraging track record is the vast field of natural language processing, machine translation, ontologies, lexical database projects and the development of tools for computational and corpus linguistics. These are areas where sense relations have been topical and productive over the past two decades. Building a large computational lexicon has, for example, been the aim of Princeton WordNet and versions of this in other languages (e.g. GermaNet). It is not their objective to describe the nature of lexico-semantic relations in language use, nor is it their goal to create a reference system designed to be a psychologically realistic model of the mental lexicon, as the Princeton group initially intended. Such resources have been developed to support automatic text analysis, to serve as a thesaurus application and to demonstrate the organisational principles of the lexicon by listing relational structures. Paradigmatic relations in particular play a central role, as so-called synsets are grouped and interlinked by means of conceptual-semantic and lexical relations such as direct and indirect antonymy. These synsets and their relations constitute a supposedly stable language system. Semantically vague lexemes and lexical gaps within the lexicon pose problems for this model and generally, from a psychological and lexicological point of view, the plausibility of some explanations and descriptions needs to be critically questioned.

As is the case for lexicography, greater collaborative research is also required between semanticists and IT specialists in order to construct computational resources able, on the one hand, to provide more objective descriptions of the lexicon and, on the other, to provide adequate research tools for the linguistics community. It is also the work of computational and corpus linguists as well as IT experts which furnishes theoreticians and lexicographers with machine-readable corpora and with the procedures and research methods required to access mass data in order to extract synonyms or antonyms and to confirm or revise prior assumptions. As a matter of fact, as Church et al. (1991: 116) point out, the human mind is incapable of discovering statistically relevant, typical patterns or even ordering them according to significance scores.

For further problems of inconsistencies in dictionaries and possible answers to avoid them see Müller-Spitzer in this volume.
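The statistical ordering of patterns that Church et al. argue for can be illustrated with a simple association measure such as pointwise mutual information (PMI). PMI is used here only as one representative measure, and the toy corpus is invented; association scores of this kind only become meaningful on large corpora.

```python
import math
from collections import Counter

# Invented toy corpus; the numbers below are purely illustrative.
tokens = ("strong tea strong coffee weak tea powerful car "
          "strong argument powerful engine weak coffee").split()

bigrams = list(zip(tokens, tokens[1:]))
word_freq = Counter(tokens)
pair_freq = Counter(bigrams)
n = len(tokens)

def pmi(w1, w2):
    """Pointwise mutual information of an adjacent word pair:
    log2 of observed pair probability over chance co-occurrence."""
    p_pair = pair_freq[(w1, w2)] / (n - 1)
    p_w1 = word_freq[w1] / n
    p_w2 = word_freq[w2] / n
    return math.log2(p_pair / (p_w1 * p_w2))

# Rank bigrams by association strength, strongest first
ranking = sorted(set(bigrams), key=lambda p: pmi(*p), reverse=True)
for w1, w2 in ranking[:3]:
    print(w1, w2, round(pmi(w1, w2), 2))
```

Such scores allow a machine to surface and order recurrent pairings that no human reader could reliably rank by introspection, which is precisely the point Church et al. make.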
Recognising recurring structures is an essential goal of any linguistic interpretation. For retrieving, identifying and analysing paradigmatic relations, computational methods and tools have become indispensable to linguists and lexicographers with empirical pursuits. Irrespective of their view of semantic models, more and more linguists and lexicographers base their findings on corpus-analytical methods and hence on the employment of semantic and mathematical-statistical models. The work of those who develop and refine methods of analysis has therefore become increasingly important. On the other hand, researchers in linguistics should participate more fully in the development of computational tools so that these can also meet more theoretical research needs. Although corpus search and analysis tools are now widely used in linguistics, there are hardly any publications which examine the underlying ideas, methods and models behind most corpus applications. Essentially, for any empirical lexicological work, knowledge about the underlying models is crucial in order to be able to analyse, evaluate and interpret the retrieved corpus data. Again, only much closer collaboration between the different fields of linguistic research can provide greater theoretical as well as practical understanding.

See also Miller et al. (1990), Willners and Paradis in this volume, and Divjak and Gries (2008).
References

Church, Kenneth, Gale, William, Hanks, Patrick and Hindle, Donald. 1991. "Using statistics in lexical analysis." In Lexical Acquisition: Using On-Line Resources to Build a Lexicon, Uri Zernik (ed.), 114–164. Hillsdale, NJ: Lawrence Erlbaum.
Croft, William and Cruse, Alan. 2004. Cognitive Linguistics. Cambridge: Cambridge University Press.
Cruse, Alan. 1986. Lexical Semantics. Cambridge: Cambridge University Press.
Cruse, Alan. 1992. "Antonymy revisited: Some thoughts on the relationship between words and concepts." In Frames, Fields and Contrasts: New Essays in Semantic and Lexical Organization, Adrienne Lehrer and Eva Feder Kittay (eds), 289–306. Hillsdale, NJ: Lawrence Erlbaum.
Cruse, Alan. 2004. Meaning in Language (2nd ed.). Oxford: Oxford University Press.
Cruse, Alan and Togia, Pagona. 1995. "Towards a cognitive model of antonymy." Journal of Lexicology 1: 113–141.
Divjak, Dagmar and Gries, Stefan. 2008. "Clusters in the mind? Converging evidence from near synonymy in Russian." The Mental Lexicon 3(2): 188–213.
Firth, John R. 1957. Papers in Linguistics. London: Oxford University Press.
Geeraerts, Dirk and Cuyckens, Hubert (eds). 2007. The Oxford Handbook of Cognitive Linguistics. New York: Oxford University Press.
Harvey, Keith and Yuill, Deborah. 1994. "The COBUILD testing initiative: The introspective, encoding component." Unpublished research paper. Cobuild/University of Birmingham.
Jones, Steven. 2002. Antonymy: A Corpus-based Perspective. London: Routledge.
Lutzeier, Peter Rolf. 1981. Wort und Feld. Wortsemantische Fragestellungen mit besonderer Berücksichtigung des Wortfeldbegriffes. Tübingen: Niemeyer.
Lutzeier, Peter Rolf. 1985. "Die semantische Struktur des Lexikons." In Handbuch der Lexikologie, Christoph Schwarze and Dieter Wunderlich (eds), 103–133. Königstein/Ts.: Athenäum.
Lyons, John. 1968. Introduction to Theoretical Linguistics. Cambridge: Cambridge University Press.
Lyons, John. 1977. Semantics. 2 vols. Cambridge: Cambridge University Press.
Miller, George, Beckwith, Richard, Fellbaum, Christiane, Gross, Derek and Miller, Katherine. 1990. "Introduction to WordNet: An on-line lexical database." International Journal of Lexicography 3(4): 235–244.
Murphy, M. Lynne. 2003. Semantic Relations and the Lexicon. Cambridge: Cambridge University Press.
Murphy, M. Lynne. 2006. "Is 'paradigmatic construction' an oxymoron? Antonym pairs as lexical constructions." Constructions SV1. http://www.constructions-online.de/.
Paradis, Carita. 2005. "Ontologies and construals in lexical semantics." Axiomathes 15: 541–573.
Paradis, Carita. (forthcoming). "Good, better, superb antonyms: A dynamic construal approach to oppositeness." In Proceedings of the 19th International Symposium on Theoretical and Applied Linguistics (Aristotle University of Thessaloniki, April 2009).
Paradis, Carita and Willners, Caroline. 2007. "Antonyms in dictionary entries: Selectional principles and corpus methodology." Studia Linguistica 61(3): 261–277.
Partington, Alan. 1998. Patterns and Meanings: Using Corpora for English Language Research and Teaching. Amsterdam/Philadelphia: John Benjamins.
Storjohann, Petra. 2003. A Diachronic Contrastive Lexical Field Analysis of Verbs of Human Locomotion in German and English. Frankfurt: Peter Lang.
Storjohann, Petra. 2006. "Korpora als Schlüssel zur lexikografischen Überarbeitung: Die Neubearbeitung des Dornseiff." In Lexicographica. Internationales Jahrbuch für Lexikographie 21/2005, Fredric F. M. Dolezal, Alain Rey, Herbert Ernst Wiegand, Werner Wolski and Ladislav Zgusta (eds), 83–96. Tübingen: Niemeyer.
Tognini-Bonelli, Elena. 2001. Corpus Linguistics at Work. Amsterdam/Philadelphia: John Benjamins.
Swedish opposites

A multi-method approach to 'goodness of antonymy'*

Caroline Willners and Carita Paradis
This is an investigation of ‘goodness of antonym pairings’ in Swedish, which seeks answers to why speakers judge antonyms such as bra–dålig (good–bad) and lång–kort (long–short) to be better antonyms than, say, dunkel–tydlig (obscure–clear) and rask–långsam (speedy–slow). The investigation has two main aims. The first aim is to provide a description of goodness of Swedish antonym pairings based on three different observational techniques: a corpus-driven study, a judgement experiment and an elicitation experiment. The second aim is to evaluate both converging and diverging results on those three indicators and to discuss them in the light of what the results tell us about antonyms in Swedish, and perhaps more importantly, what they tell us about the nature of antonymy in language and thought more generally.
1. Introduction
In spite of the widespread consensus in the linguistic literature that contrast is fundamental to human thinking and that antonymy as a lexico-semantic relation plays an important role in organising and constraining the vocabularies of languages (Lyons 1977; Cruse 1986; Fellbaum 1998; Murphy 2003), relatively little empirical research has been conducted on antonymy, using either corpus methodologies or experimental techniques, and no studies have combined both methods. The general aim of this article is to describe a combination of methods useful in the study of antonym canonicity, to summarise the results and to assess their various advantages and disadvantages for a better understanding of goodness of antonymy as a lexico-semantic construal. By combining methods, we hope to contribute to the knowledge about the nature of antonymy as a relation of binary contrast. A mirror study has been performed for English and is reported in Paradis et al. (2009).

Antonyms are at the same time minimally and maximally different from one another. They activate the same conceptual domain, but they occupy opposite poles or parts of that domain. Because they are conceptually identical in all respects but one, we perceive them as maximally similar; at the same time, because they occupy radically different poles, we perceive them as maximally different (Cruse 1986; Willners 2001; Murphy 2003). The words that we intuitively associate with antonymy are adjectivals (Paradis and Willners 2007). Our approach assumes antonyms, both more strongly canonical and less canonical, to be conceptual in nature. Conceptual knowledge reflects what speakers of languages know about words, and such knowledge includes knowledge about their relations (Murphy 2003: 42–60; Paradis 2003, 2005; Paradis et al. 2009). Treating relations as relations between concepts, rather than between lexical items, is consistent with a number of facts about the behaviour of relations. Firstly, relations display prototypicality effects, in that there are better and less good relations. In other words, not only is torr (dry) the most salient and well-established antonym of våt (wet), but the relation itself may also be perceived as a better antonym relation than, say, seg–mör (tough–tender). When asked to give examples of opposites, people most often offer pairs like bra–dålig (good–bad), svag–stark (weak–strong), svart–vit (black–white) and liten–stor (small–large), i.e. common lexical items along salient (canonical) dimensions.

* Thanks to Joost van de Weijer for help with the statistics, to Anders Sjöström for help with producing the figures and to Simone Löhndorf for help with the data collection.
Secondly, just like non-linguistic concepts, relations in language are about construals of similarity, contrast and inclusion. For instance, antonyms may play a role in metonymisation and metaphorisation. At times, new metonymic or metaphorical coinages seem to be triggered by relations; one such example is slow food as the opposite of fast food. Thirdly, lexical pairs are learnt as pairs or construed as such in the same contexts, and canonicity plays a role when one member of a salient pair is put to new uses. For a longer introduction to this topic, see Paradis et al. (2009).

The central issue of this paper concerns 'goodness of antonymy' and methods for studying it. Like Gross and Miller (1990), we assume that there is a small group of strongly antonymic word pairs (canonical antonyms) that behave differently from other, less strong (non-canonical) antonyms. (Direct/indirect and lexical/conceptual are alternative terms for the same dichotomy.) For instance, it is likely that speakers of Swedish would regard långsam–snabb (slow–fast) as a good example of canonical antonymy, while långsam–kvick (slow–quick), långsam–rask (slow–rapid) and snabb–trög (fast–dull) are perceived as less good opposites.
All these antonymic pairs will in turn be different from unrelated pairs such as långsam–svart (slow–black) or synonyms such as långsam–trög (slow–dull). As for their behaviour in text, Justeson and Katz (1991, 1992) and Willners (2001) have shown that antonyms co-occur in the same sentence at higher than chance rates, and that canonical antonyms co-occur more often than non-canonical antonyms and other semantically possible pairings (Willners 2001). These data support the dichotomy view of the Princeton WordNet and of Gross and Miller (1990). The test set used in the present study consists of Swedish word pairs of four different types: Canonical antonyms, Antonyms, Synonyms and Unrelated word pairs (see Tables 4 and 5). The words in the Unrelated word pairs are always from the same semantic field, but the semantic relation between them is not clear even though they might share certain aspects of meaning, e.g. het–plötslig (hot–sudden). Synonyms and Unrelated word pairs were introduced as control groups. While it is not possible to distinguish the four types using corpus methodologies, we expect significant differences when the pairs are judged experimentally for 'goodness of oppositeness', and in the number of unique responses when the individual words are used as stimuli in an elicitation test. All of the word pairs included in the study co-occur in the same sentence significantly more often than chance predicts. An early study of 'goodness of antonymy' is to be found in Herrmann et al. (1979), who assume a scale of canonicity and use a judgement test to obtain a ranking of the word pairs in their test set. We include a translation of a subset of their test items in this study in an attempt to confirm or disconfirm their results. The procedure is as follows: Section 2 discusses some methodological considerations before the methods used are described in detail in the sections that follow.
Corpus-driven methods are used to produce the test set (Section 3), which is used in the elicitation experiment (Section 4) and the judgement experiment (Section 5). A general discussion of the results and an assessment of the methods follow in Section 6, and conclusions are drawn in Section 7. Before going into the details of our method and experiments, we give a short overview of previous work relevant to the present study.
2. Methodological considerations
In various previous studies, we explored antonymy using corpus-based as well as corpus-driven approaches (e.g. Willners 2001; Jones et al. 2007; Murphy et al. 2009; Paradis et al.). In current empirical research where corpora are used, a distinction is made between corpus-based and corpus-driven methodologies (Francis 1993; Tognini-Bonelli 2001: 65–100; Storjohann 2005; Paradis and Willners 2007): the corpus-based methodology makes use of the corpus to test hypotheses, expound theories or retrieve real examples, while in corpus-driven methodologies the corpus serves as the empirical basis from which researchers extract their data with a minimum of prior assumptions. In the latter approach, all claims are made on the basis of the corpus evidence, with the necessary proviso that the researcher determines the search items in the first place. Our method is of a two-step type, in that we mined the whole corpus for both individual occurrences and co-occurrence frequencies for all adjectives without any restrictions, and from those data we selected our seven dimensions and all their synonyms.

Corpus data are useful for descriptive studies since they reflect actual language use. They provide a basis for studying language variation, and they often also provide metadata about speakers, genres and settings. Another very important property of corpus data is that they are verifiable, which is an important requirement for a scientific approach to linguistics. Through corpus-driven methods, it is possible to extract word pairs that share a lexical relation of some sort. However, there is no method available for identifying the type of relation correctly. For instance, it is not possible to tell the difference between antonyms, synonyms and other semantically related word pairs (in this case, word pairs from the same dimension which co-occur significantly at sentence level but are neither antonyms nor synonyms, e.g. klen (weak) – kort (short)). The answers to the types of question we are asking are not to be found solely on the basis of corpus data. As Mönnink (2000: 36) puts it:

    The corpus study shows which of the theoretical possibilities actually occur in the corpus, and which do not.

The questions we are asking call for additional methods. A combination of corpus data, elicitation data and judgement data is valuable in order to determine if and how antonym word pairs vary in canonicity, and it also sheds light on different aspects of the issue. Like Mönnink (2000), we believe that a methodologically sound descriptive study in linguistics is cyclic and preferably includes both corpus evidence and intuitive data (psycholinguistic experimental data).

3. Data extraction

3.1 Method

Antonyms co-occur in sentences significantly more often than chance would predict, and canonical antonyms co-occur more often than contextually restricted antonyms (Justeson and Katz 1991; Willners 2001). This knowledge helps us to decide which antonyms to select for experiments investigating antonym canonicity. Willners and Holtsberg (2001) developed a computer program called Coco to calculate expected and observed sentential co-occurrences of words in a given set and their levels of probability. An advantage of Coco is that it takes variation in sentence length into account, unlike the program used by Justeson and Katz (1991). Coco produces a table which lists the individual words and the numbers of sentences in which they occur in the four left-most columns. Table 1 lists 12 Swedish word pairs, taken from Willners (2001), that were judged to be antonymous by Lundbladh (1988). N1 and N2 are the numbers of sentences in which Word1 and Word2 respectively occur in the corpus. Co is the number of times the two words are found in the same sentence, and Expected Co is the number of times they would be expected to co-occur in the same sentence by chance. Ratio is the ratio between observed and expected co-occurrences, and P-value is the probability of finding at least the observed number of co-occurrences under the null hypothesis that the co-occurrences are due to pure chance. All of Lundbladh's antonym pairs co-occurred in the same sentence significantly more often than predicted by chance.

Table 1. Observed and expected sentential co-occurrences of 12 different adjective pairs (from Willners 2001: 72)

Word1     Word2   N1     N2     Co    Expected Co   Ratio   P-value
bred      smal    113    55     2     0.12          17.39   0.0061
djup      grund   117    17     1     0.04          27.17   0.036
gammal    ung     1050   455    47    8.84          5.32    0
hög       låg     760    333    47    4.68          10.04   0
kall      varm    102    102    12    0.19          62.32   0
kort      lång    262    604    21    2.93          7.17    0
liten     stor    1344   2673   111   66.48         1.67    0
ljus      mörk    84     126    7     0.20          35.82   0
långsam   snabb   55     163    4     0.17          24.11   0
lätt      svår    225    365    5     1.52          3.29    0.020
lätt      tung    225    164    7     0.68          10.25   0
tjock     tunn    53     85     4     0.08          47.98   0

Willners (2001) reports that 17% of the 357 Swedish adjective pairs that co-occurred at a significance level of 10^-4 in the SUC (the Stockholm-Umeå Corpus, a one-million-word corpus compiled according to the same principles as the Brown Corpus; see http://www.ling.su.se/staff/sofia/suc/suc.html) were antonyms. The study included all adjectives in the corpus. When the same data were (quite unorthodoxly) sorted according to rising p-value, antonyms clustered at the top of the list, as in Table 2.

Table 2. The top 10 co-occurring adjective pairs in the SUC, sorted according to rising p-value

Swedish antonyms         Translation
höger–vänster            right–left
kvinnlig–manlig          female–male
svart–vit                black–white
hög–låg                  high–low
inre–yttre               inner–outer
svensk–utländsk          Swedish–foreign
central–regional         central–regional
fonologisk–morfologisk   phonological–morphological
horisontell–vertikal     horizontal–vertical
muntlig–skriftlig        oral–written

Most of the antonym word pairs were classifying adjectives with overlapping semantic range, e.g. fonologisk–morfologisk (phonological–morphological) and humanistisk–samhällsvetenskaplig (humanistic–of the Social Sciences). Among the 83% of the word pairs that were not antonyms were many other lexically related words. Furthermore, Willners (2001) compared the co-occurrence patterns of what Princeton calls direct antonyms and indirect antonyms. Both types co-occur significantly more often than chance predicts. However, there is a significant difference between the two groups: while the indirect antonyms co-occur overall 1.45 times more often than would be expected by chance, the direct antonyms co-occur 3.12 times more often than expected.

The hypothesis we are testing in this study is that there are good and bad antonyms (cf. canonical and non-canonical). Coco provides a data-driven method of identifying semantically related word pairs. We used Coco to suggest possible candidates for the test set. However, since we wanted a balance between Canonical antonyms, Antonyms, Synonyms and Unrelated word pairs in the test set, human intervention was necessary, and we picked the test items manually from the lists produced by Coco.
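The statistical reasoning behind significance filtering of co-occurring pairs can be illustrated with a small sketch. This is not the Coco program itself: Coco corrects for variation in sentence length, whereas the simplified model below treats every sentence as an equally likely container for a word and computes a hypergeometric tail probability. The function name and the toy numbers are ours, chosen purely for illustration.

```python
from math import comb

def cooccurrence_stats(n1, n2, co, total_sentences):
    """Expected sentential co-occurrence, observed/expected ratio and a
    tail p-value for a word pair.

    n1, n2: numbers of sentences containing word1 / word2
    co: number of sentences containing both words
    total_sentences: number of sentences in the corpus
    """
    expected = n1 * n2 / total_sentences
    ratio = co / expected if expected > 0 else float("inf")
    # P(X >= co), where X counts how many of the n2 sentences holding
    # word2 also hold word1, assuming word1's sentences are placed at
    # random (hypergeometric null model, no sentence-length correction).
    p_value = sum(
        comb(n1, k) * comb(total_sentences - n1, n2 - k)
        for k in range(co, min(n1, n2) + 1)
    ) / comb(total_sentences, n2)
    return expected, ratio, p_value
```

For instance, a pair whose members each occur in 10 sentences of a 100-sentence corpus has an expected co-occurrence of 1.0; observing 5 actual co-occurrences yields a p-value well below 0.01, so such a pair would pass a significance filter of the kind used for the test set.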
3.2 Results
Using the insights from previous work on antonym co-occurrence as our point of departure, we developed a methodology for selecting data for our experiments. To start with, we agreed on a set of seven dimensions from the output of the corpus searches of sententially co-occurring items that we perceived to be good candidates for a high degree of canonicity, and identified the pairs of antonyms that we thought were the best linguistic exponents of these dimensions (see Table 3). For cross-linguistic research, we made sure that the word pairs also had well-established correspondences in English. The selected antonym pairs are all scalar adjectives compatible with scalar degree modifiers such as very.

Table 3. Seven corresponding canonical antonym pairs in Swedish and English

Dimension    Swedish antonyms   Translation
speed        långsam–snabb      slow–fast
luminosity   mörk–ljus          dark–light
strength     svag–stark         weak–strong
size         liten–stor         small–large
width        smal–bred          narrow–wide
merit        dålig–bra          bad–good
thickness    tunn–tjock         thin–thick

Using Coco, we ran the words through the SUC. All of them co-occurred in significantly high numbers at sentence level, and these pairs were set up as Canonical antonyms. Next, all synonyms of the 14 adjectives were collected from a Swedish synonym dictionary. The synonyms of each of the words in each antonym pair were matched and run through the SUC in all possible constellations for sentential co-occurrence. This resulted in a higher than chance co-occurrence for quite a few words for each pair. We extracted the pairs that were significant at a level of p < 0.01 for further analysis. Using dictionaries and our own intuition, we then categorised the word pairs according to semantic relations. Finally, we picked two Antonyms, two Synonyms and one pair of Unrelated adjectives from the list of significantly co-occurring word pairs for each dimension. Table 4 shows the complete set of pairs retrieved from the SUC: 42 pairs in all. We also included eleven word pairs from Herrmann et al.'s (1979) study of 'goodness of antonymy' (see Table 5). From their ranking of 77 items, we picked every sixth word pair, translated the pairs into Swedish and classified them according to semantic relation: Canonical antonym (C), Antonym (A) and Unrelated (U). None of the pairs from Herrmann et al. (1979) were judged to be synonymous. The word pairs as well as the individual words in Table 4 and Table 5 were used as the test set in the psycholinguistic studies described below.
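The "all possible constellations" step amounts to generating every unordered pair over the pooled synonym lists of an antonym pair. The sketch below uses shortened, partly hypothetical synonym lists for the speed dimension; the variable names and list contents are our own illustration, not the dictionary entries actually used.

```python
from itertools import combinations

# Hypothetical, abbreviated synonym sets for the two poles of the
# speed dimension (each pole's list includes the canonical member)
synonyms = {
    "långsam": ["långsam", "släpig", "trög"],
    "snabb": ["snabb", "rask", "kvick"],
}

# Every unordered pair over the pooled vocabulary is a candidate pair
# to be scored for sentential co-occurrence in the corpus.
pool = synonyms["långsam"] + synonyms["snabb"]
candidate_pairs = list(combinations(pool, 2))
```

Each candidate pair would then be scored for sentential co-occurrence, and only pairs significant at p < 0.01 kept for manual classification into Antonyms, Synonyms and Unrelated pairs.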
Strömbergs synonymordbok. 1995. Alva Strömberg. Angered: Strömbergs bokförlag.
Table 4. The test set retrieved from the SUC. See Appendix A for translations

Canonical antonyms   Antonyms                          Synonyms                              Unrelated
långsam–snabb        långsam–flink, tråkig–het         långsam–släpig, snabb–rask            het–plötslig
ljus–mörk            vit–dunkel, melankolisk–munter    ljus–öppen, mörk–svart                dyster–präktig
svag–stark           lätt–muskulös, senig–kraftig      svag–matt, stark–frän                 flat–seg
liten–stor           obetydlig–kraftig, liten–väldig   stor–inflytelserik, liten–oansenlig   klen–kort
smal–bred            smal–öppen, trång–rymlig          smal–spinkig, bred–kraftig            liten–tjock
dålig–bra            dålig–god, ond–bra                dålig–låg, bra–god                    fin–tokig
tunn–tjock           genomskinlig–svullen, fin–grov    tunn–spinkig, tjock–kraftig           knubbig–tät
Table 5. Test items selected from Herrmann et al. (1979)

Word1         Word2       Translated from       Herrmann's score   Semantic relation
ful           vacker      beautiful–ugly        4.90               C
smutsig       fläckfri    immaculate–filthy     4.62               A
trött         pigg        tired–alert           4.14               C
lugn          upprörd     disturbed–calm        3.95               A
hård          böjlig      hard–yielding         3.28               A
irriterad     glad        glad–irritated        3.00               A
sparsmakad    spännande   sober–exciting        2.67               A
overksam      nervös      nervous–idle          2.24               U
förtjusande   förvirrad   delightful–confused   1.90               U
framfusig     hövlig      bold–civil            1.57               A
vågad         sjuk        daring–sick           1.14               A
Note on ond–bra: due to sparse data, this item was added although it did not meet the general significance criterion of p < 0.01. We chose ond–bra (evil–good) because we expected interesting results for its English counterpart in the mirror study. Ond–bra is included in the test set, but it is not included in the discussion of the results.
4. Elicitation experiment
This section describes the method and the results of the elicitation experiment.
Stimuli and procedure

The test set for the elicitation experiment involves the individual adjectives that were extracted as co-occurring pairs from the SUC, together with translations of selected word pairs from Herrmann et al.'s (1979) list of adjectives perceived by participants as better and less good examples of antonyms (see Table 4 and Table 5). Some of the individual adjectives occur in more than one pair, i.e. they might occur once, twice or three times. For instance, långsam (slow) occurs three times and snabb (fast) occurs twice. All second and third occurrences were removed from the elicitation test set, which means that långsam (slow) and snabb (fast) occur only once in the test set used in the elicitation experiment. Once this was done, the adjectives were automatically randomised and printed in the randomised order. All in all, the test contains 85 stimulus words. All participants received the adjectives in the same order. The participants were asked to write down the best opposite they could think of for each of the 85 stimulus words in the test set. For instance:
Motsatsen till LITEN är ('The opposite of SMALL is')
Motsatsen till PRÄKTIG är ('The opposite of DECENT is')
The experiment was performed using paper and pencil and the participants were instructed to do the test sequentially, that is, to start from word one and work forwards and not to go back to check or change anything. There was no time limit, but the participants were asked to write the first opposite word that came to mind. Each participant also filled in a cover page with information about name, sex, age, occupation, native language and parents’ native language. All the responses were then coded into a database using the stimulus words as anchor words.
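Coding the responses with the stimulus words as anchors makes the agreement measures straightforward to compute. The sketch below, with a hypothetical two-stimulus mini-database (the response counts mirror the pattern reported for bra and dålig), tallies the responses per stimulus and counts the number of distinct answers, a quantity that serves as an indicator of participant agreement.

```python
from collections import Counter

# Hypothetical mini-database: stimulus word -> elicited responses,
# one entry per participant
responses = {
    "bra": ["dålig"] * 50,
    "dålig": ["bra"] * 49 + ["frisk"],
}

def agreement_profile(responses):
    """Per stimulus: ranked response tally plus the number of distinct
    responses (fewer distinct responses = higher participant agreement)."""
    return {
        stimulus: {
            "tally": Counter(answers).most_common(),
            "distinct": len(set(answers)),
        }
        for stimulus, answers in responses.items()
    }
```

Run over the full database, a profile of this kind yields the continuum from full agreement (one distinct response) to very low agreement (dozens of distinct responses) discussed in the results section.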
Participants

Twenty-five female and twenty-five male native speakers of Swedish participated in the elicitation test. They were between 20 and 70 years of age and represented a wide range of occupations as well as levels of education. All of them had Swedish as their first language, as did their parents. The data were collected in and around Lund, Sweden.
Predictions

Our predictions are as follows:

– The test items that we deem to have canonical antonyms will elicit only one another.
– The test items that we do not deem to be canonical will elicit varying numbers of antonyms: the better the antonym pairing, the fewer the number of elicited antonyms.
– The elicitation experiment will produce a curve from high participant agreement (few suggested antonyms) to low participant agreement (many suggested antonyms).
4.1 Results
We will start by reporting the general results in Section 4.1.1 and then go on to discuss the results concerning bidirectionality in Section 4.1.2. We also performed a cluster analysis, the results of which are presented in Section 4.1.3.
4.1.1 General results

The main outcome of the elicitation experiment is that there is a continuum of lexical association of antonym pairs. In line with our predictions, there were a number of test words for which all the participants suggested the same antonym: bra (good) – dålig (bad), liten (small) – stor (large), ljus (light) – mörk (dark), låg (low) – hög (high), mörk (dark) – ljus (light), sjuk (ill) – frisk (healthy), smutsig (dirty) – ren (clean), stor (large) – liten (small), and vacker (beautiful) – ful (ugly). All the elicited antonyms across the test items are listed in Appendix A. Appendix A also shows that there is a gradual increase in the number of responses from the top of the list to the bottom. The very last item is sparsmakad (fastidious), for which 33 different antonyms were suggested by the participants (including a non-answer). The shape of the list of elicited antonyms across the test items in Appendix A strongly suggests a scale of canonicity from very good matches to test items with no clear partners.

While Appendix A gives all the elicited antonyms across the test items, it does not provide information about the scores for the various individual elicited responses. The three-dimensional diagram in Figure 1 is a visual representation of how some stimulus words elicited the same word from all participants. Those are the maximally high bars found to the very left of the diagram (e.g. bra (good), liten (small), ljus (light), etc.). Then four words follow for which 49 of the participants suggested the same antonym, while another opposite was suggested in the 50th case. These four stimulus words were dålig (bad), svag (weak), stark (strong), and ond (evil).

Figure 1. The distribution of Swedish antonyms in the elicitation experiment. The Y-axis gives the test items, with every tenth test item written in full. The X-axis gives the number of suggested antonyms across the participants given on the Z-axis.

Forty-nine of the participants suggested bra (good) as an antonym of dålig (bad), stark (strong) for svag (weak), svag (weak) for stark (strong) and god (good) for ond (evil). The 'odd' suggestions were frisk (healthy) for dålig (bad), klar (clear) for svag (weak), klen (feeble) for stark (strong) and snäll (kind) for ond (evil). Since there are two response words for each of these four stimuli, there are two bars: one 49 units high at the back, representing the most commonly suggested antonym, and one small bar, only one unit high, in front of the big one, representing the single suggestions frisk (healthy), klar (clear), klen (feeble) and snäll (kind). The further we move towards the right in Figure 1, the more diverse the responses. In fact, the single suggestions spread out like a rug covering the bottom of the diagram as we move towards the right. However, there is usually a preferred response word which most of the participants suggested. There are some stimuli for which two response words were equally popular choices, or which at least were both suggested by a considerable number of participants. For example, for lätt (light/easy), 29 participants suggested tung (heavy) and 20 svår (difficult); het (hot) elicited the responses kall (cold) (24) and sval (chilly) (20); and for god (good), participants suggested ond (evil) (20) and äcklig (disgusting) (19). A common feature of these stimulus words is that they are associated with different, strongly competing meaning dimensions or salient readings. Some other examples are framfusig (bold): tillbakadragen (unobtrusive) (20) and blyg (shy) (16); trång (narrow): rymlig (spacious) (17) and vid (wide) (15); fläckfri (spotless): fläckig (spotted) (17) and smutsig (dirty) (15); grov (coarse): fin (fine) (17) and tunn (thin) (14). Like Appendix A, Figure 1 indicates that there is a scale of canonicity, with a group of highly canonical antonyms to the left and a gradual decrease of canonicity as we move towards the right in the diagram. The stimulus words on the right-hand side of Figure 1 cannot be said to have any good antonyms at all.
4.1.2 Bidirectionality

In addition to the distribution of the responses for all the test items across all the participants, we also investigated to what extent the test items elicit one another in both directions. For instance, 50 participants gave dålig (bad) as an antonym of bra (good) and ful (ugly) for vacker (beautiful), but the pattern was not the same in the other direction. This is part of the information in Appendix A and Figure 1, but it is not obvious from the way the information is presented. For the test items that speakers of Swedish intuitively deem to be good pairs of antonyms, this strong agreement held true in both directions, although not at the level of a one-to-one match, but one-to-two or one-to-three. While 50 participants responded with dålig (bad) as the best opposite of bra (good), two antonyms were suggested for dålig (bad): bra (good) by 49 participants and frisk (healthy) by one participant. This points to the possibility that there is a stronger relationship between bra (good) and dålig (bad) than between frisk (healthy) and dålig (bad). In other words, Figure 1 shows that the more canonical pairs elicit only one or two antonyms, while there is a steady increase in the number of 'best' antonyms the further we move to the right-hand side of the figure.

4.1.3 Cluster analysis

In order to shed light on the strength of the lexicalised oppositeness, a cluster analysis of the strength of antonymic affinity between the lexical items that elicited each other in both directions was performed. It is important to note that only items that were also test items were eligible as candidates for participation in bidirectional relations. This means that some of the pairings suggested by the participants were not included in the cluster analysis.
For instance, tung (heavy) was considered the best antonym of lätt (light) by 29 of the participants (as compared to 20 for svår (difficult)), but since neither tung nor svår was included among the test items, these pairings were not measured in the cluster analysis. The results of
Swedish opposites
the cluster analysis are, however, comparable to the results of sentential co-occurrence of antonyms in the corpus data and the results of the judgement experiment, since the same word pairs are included. To this end, a hierarchical agglomerative cluster analysis using Ward's amalgamation strategy (Oakes 1998: 119) was performed on the subset of the data that were bidirectional. Agglomerative cluster analysis is a bottom-up method that takes each entity (i.e. antonym pairing) as a single cluster to start with and then builds larger and larger clusters by grouping entities together on the basis of similarity. It merges the closest clusters in an iterative fashion, satisfying a number of similarity criteria, until the whole dataset forms one cluster. The advantage of cluster analysis is that it highlights associations between features as well as the hierarchical relations between these associations (Glynn et al. 2007; Gries and Divjak 2009). Cluster analysis is not a confirmatory analysis but a useful tool for exploratory purposes.

Figure 2 shows the dendrogram produced on the basis of the cluster analysis. The number of clusters was set to four to match the four conditions on the basis of which we retrieved our data from the sententially co-occurring pairs in the first place (Canonical antonyms, Antonyms, Synonyms and Unrelated). Figure 2 shows the hierarchical structure of the clusters. There are two branches: the left-most branch hosts Cluster 1 and Cluster 2, and the right-most branch Cluster 3 and Cluster 4. The closeness of the fork to the clusters indicates a closer relationship. The tree structure reveals that there is a closer relation between Cluster 3 and Cluster 4 than between Cluster 1 and Cluster 2. Figure 2 gives the actual pairings in the boxes at the end of the branches. There are fewer pairs at the end of the left-most branches than at the end of the branches on the right-hand side.
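The procedure described above, agglomerative clustering with Ward's strategy cut at a preset number of clusters, can be sketched with standard tools. The `counts` matrix below is an invented stand-in for the bidirectional affinity data, not the study's measurements.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Invented stand-in for the bidirectional data: each row describes one
# antonym pairing by the number of participants eliciting it in each
# direction (four well-separated bands of strength, two pairings per band).
counts = np.array([
    [50, 50], [50, 49],   # near-perfect bidirectional agreement
    [30, 28], [29, 28],
    [10, 9],  [9, 9],
    [1, 0],   [0, 0],     # hardly elicited at all
], dtype=float)

# Bottom-up agglomerative clustering with Ward's amalgamation strategy:
# every pairing starts as its own cluster and the closest clusters are
# merged iteratively; the linkage matrix Z encodes the resulting tree
# (a dendrogram such as Figure 2 is a plot of such a matrix).
Z = linkage(counts, method="ward")

# Cut the tree into a preset number of clusters, as the study did (four).
labels = fcluster(Z, t=4, criterion="maxclust")
print(labels)
```

With the four bands above, each pair of rows ends up in its own cluster, mimicking the four-way cut of the dendrogram.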
Five of the word pairs in Cluster 1 were included in the test set as Canonical antonyms: långsam–snabb (slow–quick), ljus–mörk (light–dark), svag–stark (weak–strong), bra–dålig (good–bad) and liten–stor (small–large) (subscripted with c in Figure 2). The other two word pairs in Cluster 1 were vit–svart (white–black) from the luminosity dimension and tjock–smal (fat–thin) from thickness. The rest of the word pairs in Cluster 1 were not included as pairs in the experiment. In Cluster 2, there are four word pairs featured in the test set as Canonical: tunn–tjock (thin–thick), bred–smal (wide–narrow), vacker–ful (beautiful–ugly) and trött–pigg (tired–alert). The rest of the word pairs in Cluster 2 are intuitively good pairings. They were, however, not among the pairings that we deemed canonical in the design of the test set, e.g. upprörd–lugn (upset–calm), väldig–liten (enormous–small), fin–ful (pretty–ugly), nervös–lugn (nervous–calm), ond–god (evil–good) and rymlig–trång (spacious–narrow).
Caroline Willners and Carita Paradis
Figure 2. Dendrogram of the bidirectional data
It is not obvious what the systematic differences are between the degrees of oppositeness in Clusters 3 and 4. As the dendrogram above shows, they are in fact associated. However, they do not correspond to the Synonyms and Unrelated word pairs in the test set, since the cluster analysis is based on the results of the elicitation experiment where the participants were asked to provide the best antonym.
5. Judgement experiment
This section describes the methodology of the judgement experiment in which the participants were asked to evaluate word pairings in terms of how good they thought each pair was as a pair of antonyms. The experiment was carried out online. The design of the screen is shown in Figure 3.
Figure 3. An example of a judgement task in the online experiment (translated into English)
As Figure 3 shows, the participants were presented with questions of the form: Hur bra motsatser är X–Y? (How good is X–Y as a pair of opposites?) and Hur bra motsatser är Y–X? (How good is Y–X as a pair of opposites?) The question was formulated using bra (good) (not dålig (bad)) in order for the participants to understand the question as an impartial how-question, since Hur dåliga motsatser är fet–smal? (How bad is fat–lean as a pair of opposites?) presupposes ‘badness’. The end-points of the scale were designated with both icons and text: on the left-hand side there is a sad face (very bad antonyms), while there is a happy face on the right-hand side (excellent antonyms). The task of the participants was to tick a box on a scale consisting of eleven boxes. We were also interested in whether the ordering of the pairs had any effect. Our predictions were as follows.

– The nine test pairings that we deem to be canonical will receive 11 on the scale of ‘goodness’ of pairing of opposites.
– The order of presentation of the Canonical antonyms as well as the Antonyms will give rise to significantly different results: Word1–Word2 will be considered better pairings than Word2–Word1.
– There will be significant differences between the judgements about Canonical antonyms, Antonyms, Synonyms and Unrelated pairings.
Stimuli

The same test set as in the elicitation experiment was used (see Tables 4 and 5), but while the pairing of the antonyms was not an issue in the first experiment, it was essential to the judgement test. The stimuli were presented as pairs and the test items were automatically randomised for each participant. Half of the participants were given the test items in the order Word1–Word2, while the other half were presented with the words in reverse order, i.e. Word2–Word1.
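The presentation scheme, a fresh random item order per participant with the pair order reversed for every second participant, might be sketched like this. The mini test set and the function name are illustrative placeholders (ASCII spellings), not the actual stimuli.

```python
import random

# Illustrative mini test set (hypothetical; the real set had 53 pairs).
test_set = [("langsam", "snabb"), ("ljus", "mork"), ("bra", "dalig"), ("trott", "pigg")]

def stimuli_for(participant_id, seed=0):
    """Randomise the item order afresh for each participant; present the
    pairs reversed (Word2-Word1) for every second participant."""
    rng = random.Random(seed + participant_id)   # reproducible per participant
    items = list(test_set)
    rng.shuffle(items)                           # per-participant random order
    if participant_id % 2 == 1:                  # half get the reverse order
        items = [(w2, w1) for (w1, w2) in items]
    return items

print(stimuli_for(0))
print(stimuli_for(1))
```

Seeding the generator with the participant id keeps each participant's order stable across runs while still varying between participants.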
Procedure

The judgement experiment was performed online using E-prime as experimental software. E-prime is a commercially available Windows-based presentation program with a graphical interface, a scripting language similar to Visual Basic, and response collection. E-prime conveniently logged the ratings as well as the response times in separate files for each of the participants. The participants were presented with a new screen for each word pair (see Figure 3). The task of the participants was to tick a box on a scale consisting of eleven boxes. The screen disappeared immediately upon clicking, which prevented the participants from going back and changing their responses. Between each judgement task there was a blank screen with an asterisk, and when the participants were ready for the next task they signalled this with a mouse-click.

Before the actual test started, the participants were asked to give some personal data (name, age, sex, occupation, native language and parents’ native language). There then followed some instructions, such as how to do the mouse-clicks, and information about the fact that the test was self-paced. Each participant had two test trials before the actual judgement test of the 53 test items. The purpose of the study was revealed to the participants in the instructions. As has already been mentioned, the judgement experiment was divided into two parts: 25 participants were given the test set as Non-Reverse (Word1–Word2, e.g. långsam–snabb (slow–fast)) and 25 participants were given the test set in the reverse order: Reverse (Word2–Word1, e.g. snabb–långsam (fast–slow)). This was done to measure whether the order of the sequence influenced the results in any way.

Participants

Fifty native speakers of Swedish participated in the judgement test. None of them had previously participated in the elicitation test. Twenty-nine of the participants were women and 21 were men, between 20 and 62 years of age.
All of them had Swedish as their first language.

5.1 Results
This section reports on the results of the judgement experiment. We start by reporting the results concerning sequencing in Section 5.1.1, since they affect the treatment of the data reported in the section on strength of canonicity (Section 5.1.2).
5.1.1 Sequencing

As has already been pointed out, the test was performed in such a way that half of the participants were presented with the test items in the order Word1–Word2,
and the other half in reverse order, Word2–Word1. We assumed that the order would have an impact on the results, at least for the Canonical antonyms. A subject analysis and an item analysis were performed. The factors involved were directionality, category (Canonical antonyms, Antonyms, Synonyms and Unrelated) and the interaction between directionality and category. In the subject analysis, each participant was the basic element of analysis. All judgements for the individual participants were averaged within each of the four conditions, yielding four numbers per participant. Then a repeated-measures ANOVA (Woods et al. 1986: 194–223) was performed on both data sets. In the item analysis, each item (i.e. word pair) was the basic element of analysis. The judgements given by each participant on each condition were averaged, resulting in four numbers for each item, and a Univariate General Linear Model analysis was performed. Finally, Bonferroni’s post hoc test (Field 2005: 339) was used to compare the differences between the categories. The same procedure was used for the response times.

The statistical analysis shows that the order of sequence does not have any effect on the results: F1[1,48] = 1.056, p = 0.309; F2[1,98] = 0.206, p = 0.651. Neither does the interaction between sequence and category: F1[3,144] = 0.811, p > 0.05; F2[3,98] = 0.069, p = 0.976. Category, on the other hand, does have an effect: F1[3,144] = 1777.991, p < 0.001; F2[3,98] = 138.987, p < 0.001. Figure 4 shows that the two test batches (marked with REV = 0 and REV = 1) follow the same pattern. Since the order of the sequence did not have an impact on the results, the data for the two directions will be treated as one batch and will not be separated in the analyses that follow.

Figure 4. Sequential ordering: there is no significant difference between the mean answers of the two test batches

Table 6. Mean responses for each of the word pairs in the test set, both directions included

Word pair                Mean response   Semantic category
ljus–mörk                    10.92             C
långsam–snabb                10.88             C
liten–stor                   10.84             C
svag–stark                   10.80             C
trött–pigg                   10.76             C
dålig–bra                    10.68             C
ful–vacker                   10.64             C
smal–bred                    10.60             C
tunn–tjock                   10.40             C
fin–grov                     10.32             A
trång–rymlig                 10.20             A
dålig–god                     9.84             A
smutsig–fläckfri              9.36             A
lugn–upprörd                  9.28             A
melankolisk–munter            9.04             A
liten–väldig                  8.52             A
framfusig–hövlig              8.40             A
hård–böjlig                   7.84             A
långsam–flink                 7.80             A
ond–bra                       6.84             A
irriterad–glad                6.56             A
senig–kraftig                 5.88             A
obetydlig–kraftig             5.44             A
vit–dunkel                    5.40             A
lätt–muskulös                 4.44             A
liten–tjock                   4.08             U
tråkig–het                    3.68             A
smal–öppen                    3.20             A
sparsmakad–spännande          2.76             A
bra–god                       2.52             S
dyster–präktig                2.00             U
overksam–nervös               1.92             U
fin–tokig                     1.88             U
förtjusande–förvirrad         1.84             U
långsam–släpig                1.80             S
svag–matt                     1.76             S
stark–frän                    1.76             S
stor–inflytelserik            1.68             S
knubbig–tät                   1.68             U
genomskinlig–svullen          1.68             A
snabb–rask                    1.60             S
bred–kraftig                  1.60             S
flat–seg                      1.56             U
dålig–låg                     1.56             S
ljus–öppen                    1.48             S
klen–kort                     1.48             U
liten–oansenlig               1.44             S
het–plötslig                  1.44             U
mörk–svart                    1.40             S
tjock–kraftig                 1.40             S
vågad–sjuk                    1.32             A
smal–spinkig                  1.28             S
tunn–spinkig                  1.24             S
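The subject analysis described above (per-participant condition means fed into a repeated-measures F-test) can be sketched as follows. The formula is the textbook one-way repeated-measures ANOVA (condition sum of squares tested against the subject-by-condition residual), and the two-by-two data matrix is invented purely for illustration.

```python
import numpy as np

def repeated_measures_anova(data):
    """One-way repeated-measures ANOVA on a subjects-by-conditions matrix
    of averaged judgements (one mean per participant per condition, as in
    the subject analysis above). Returns the F statistic and its df."""
    n_subj, n_cond = data.shape
    grand = data.mean()
    ss_cond = n_subj * ((data.mean(axis=0) - grand) ** 2).sum()
    ss_subj = n_cond * ((data.mean(axis=1) - grand) ** 2).sum()
    ss_total = ((data - grand) ** 2).sum()
    ss_err = ss_total - ss_cond - ss_subj          # subject-by-condition residual
    df_cond, df_err = n_cond - 1, (n_cond - 1) * (n_subj - 1)
    f = (ss_cond / df_cond) / (ss_err / df_err)
    return f, df_cond, df_err

# Toy matrix: 2 participants (rows) x 2 conditions (columns)
f, df_cond, df_err = repeated_measures_anova(np.array([[1.0, 3.0], [2.0, 6.0]]))
print(f, df_cond, df_err)   # F = 9.0 with df (1, 1)
```

A p-value could then be read off the F distribution, e.g. with `scipy.stats.f.sf(f, df_cond, df_err)`.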
5.1.2 Strength of canonicity

The mean response for each word pair in the test set is presented in Table 6. The mean responses for the Canonical antonyms vary between 10.40 and 10.92. None of the word pairs has a response mean of 11, which we expected for the Canonical antonyms. They do, however, top the list. The means for the Antonyms vary greatly, from 10.32 for fin–grov (fine–coarse) to 1.68 for genomskinlig–svullen (transparent–swollen). Below 2.52, a mix of unrelated and synonymous word pairs is found, and the word pair that was judged to be the ‘worst’ antonym pair was tunn–spinkig (thin–skinny) (1.24).

The overall mean responses for the four categories are presented in Table 7. The Canonical antonyms have a mean response of 10.72, close to the maximum of 11. The standard deviation is also small for this category, 0.6, which reflects high consensus among the participants. The Antonyms have a significantly lower mean of 6.82, but with a large standard deviation, 3.37. This indicates a lower degree of consensus among the participants. The mean response for the Synonyms is 1.61, with a standard deviation of 1.33, and for the Unrelated it is 1.92, with a standard deviation of 1.55. There is no significant difference between the last two categories.

Table 7. Mean responses for Canonical antonyms, Antonyms, Synonyms and Unrelated word pairs

Category             Mean     Std. deviation
Canonical antonyms   10.724   0.6084
Antonyms              6.824   3.3728
Synonyms              1.609   1.3279
Unrelated             1.920   1.5502

Figure 5. Mean responses for Canonical antonyms, Antonyms, Synonyms and Unrelated word pairs

The results in Table 7 are also illustrated in Figure 5. We performed a repeated-measures ANOVA, and the differences between the Canonical antonyms and the Antonyms, as well as between the Antonyms and the two other categories (Synonyms and Unrelated), were significant both in the subject analysis (F1[3,147] = 1784.874, p < 0.001) and in the item analysis (F2[3,49] = 70.361, p < 0.001). Post hoc comparisons using Tukey’s HSD procedure (Field 2005: 340) suggested that the four conditions form three subgroups: (1) Canonical antonyms, (2) Antonyms and (3) Synonyms and Unrelated.
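Post hoc pairwise comparison of the four categories can be sketched as below. Note that this sketch uses Bonferroni-corrected independent t-tests as a simple stand-in rather than Tukey's HSD, and the per-category judgement lists are invented numbers shaped like Table 7, not the actual data.

```python
from itertools import combinations
from scipy.stats import ttest_ind

# Invented per-pair mean judgements, shaped like Table 7 (only five pairs
# per category here, for brevity); not the actual data.
groups = {
    "Canonical": [10.9, 10.8, 10.8, 10.7, 10.4],
    "Antonyms":  [10.3, 9.4, 8.4, 6.6, 2.8],
    "Synonyms":  [2.5, 1.8, 1.6, 1.4, 1.2],
    "Unrelated": [2.0, 1.9, 1.6, 1.4, 1.3],
}

# All pairwise t-tests with a Bonferroni correction: multiply each raw
# p-value by the number of comparisons (capped at 1).
pairs = list(combinations(groups, 2))
for a, b in pairs:
    t_stat, p_raw = ttest_ind(groups[a], groups[b])
    p_corrected = min(1.0, p_raw * len(pairs))
    print(f"{a} vs {b}: corrected p = {p_corrected:.4f}")
```

With numbers of this shape, clearly separated categories (e.g. Canonical vs. Synonyms) come out significant after correction, while near-identical ones (Synonyms vs. Unrelated) do not.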
6. Discussion
The main goal of this paper was to investigate and report on three different methods of studying antonym canonicity and to increase our knowledge about Swedish antonyms. We used a corpus-driven method to suggest possible candidates, categorise the semantic relations between the suggested word pairs and pick six items from each semantic dimension manually. We then used two different psycholinguistic techniques to investigate the strength of oppositeness between the
word pairs in the test set. Summaries of the results of the three parts of the study will be given in Sections 6.1, 6.2 and 6.3. Then a discussion of the advantages and disadvantages of using various types of research technique for the same topic will follow in Section 6.4.
6.1 Data extraction
Under the assumption that semantically related words co-occur significantly more often than chance predicts, we used a corpus-driven method to suggest possible candidates for the test set. We collected Synonyms of the Canonical antonyms from seven predefined semantic dimensions (speed, luminosity, strength, size, width, merit and thickness), and the figures for expected and observed sentential co-occurrence, as well as p-values, were calculated for all possible permutations of word pairs within each dimension. The word pairs that co-occurred significantly at a p-level of 0.05 qualified as candidates for the test set. From these pairs we selected one antonymous pair, two pairs of Synonyms and two pairs of Unrelated words for each dimension. Together with the Canonical antonyms of each dimension, as well as 11 word pairs that were previously studied for antonym canonicity by Herrmann et al. (1979), they made up the test set used in the psycholinguistic experiments (see Tables 4 and 5).

Due to a shortage of publicly available large corpora for Swedish, the present corpus study was performed on a fairly small corpus, the SUC, which comprises one million words. It would be a significant improvement to do all the calculations for the word pairs on a larger corpus, as we have done for English data in Paradis et al. (2009), where we used the 100-million-word corpus BNC.
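The expected-versus-observed calculation can be sketched as follows. The study used the Coco tool, so the Poisson approximation and the example frequencies here are our assumptions, not the original implementation.

```python
from scipy.stats import poisson

def cooccurrence_significance(n_sentences, f1, f2, observed):
    """Expected sentential co-occurrence of two words under independence,
    plus a one-tailed p-value for seeing at least `observed` co-occurrences
    (Poisson approximation; a simplification of the calculation above)."""
    expected = f1 * f2 / n_sentences               # N * (f1/N) * (f2/N)
    p_value = poisson.sf(observed - 1, expected)   # P(X >= observed)
    return expected, p_value

# e.g. two adjectives, each in 100 of 10,000 sentences, co-occurring 10 times
expected, p_value = cooccurrence_significance(10_000, 100, 100, 10)
print(expected, p_value)   # expected co-occurrence 1.0; p well below 0.05
```

Pairs whose p-value falls below the 0.05 threshold would qualify as candidates under the selection criterion described above.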
6.2 Elicitation experiment
Fifty participants, evenly distributed over gender, were asked to provide the best opposite they could think of for 85 stimulus words. In accordance with what we predicted, the participants’ responses consisted of a varying number of unique response words for the different test items, as shown in Figure 1. There were nine stimulus words for which all participants gave the same answer, eight for which all participants but one gave the same answer, and from there the number of participants giving the same answer decreases as the number of unique answers increases. The results generally confirm our predictions: (1) the test items which were suggested by the co-occurrence data and which we deemed to have canonical status elicited one another strongly; (2) the test items that we did not deem to be canonical elicited varying numbers of antonyms – the better the antonym pairing, the fewer antonyms were elicited; and (3) the elicitation experiment produced a curve from high participant agreement (few suggested antonyms) to low participant agreement (many suggested antonyms).

The predictions imply that both words in a canonically antonymous pair would elicit only each other, but this was not the case. Only for the semantic dimensions luminosity (mörk–ljus (dark–light)) and size (stor–liten (large–small)) did the participants’ responses agree 100% in both directions. This might be interpreted as canonicity somehow being linked to directionality, or it may be the case that direction is a result of polysemy rather than inherent to canonical antonyms.

Figure 6. Relations between bra (good), dålig (bad), fin (pretty), ful (ugly), vacker (beautiful), god (good) and ond (evil), based on the elicitation experiment. The number of responses is marked by each arrow. (To keep the figure simple, we only included words from the test set that were also found among the responses; that is why the numbers by the arrows do not always add up to 50.)

Figure 6 illustrates elicitations from the field of merit, in which we find one word pair with strong bidirectional evidence and three word pairs with strong unidirectional evidence. Bra–dålig (good–bad) is one of the word pairs in the study with 100% agreement in both directions, i.e. all 50 participants offered dålig (bad) to the stimulus bra (good) and vice versa. The three word pairs with strong unidirectional evidence are fin–ful (pretty–ugly), ful–vacker (ugly–beautiful) and ond–god (evil–good). Dålig (bad) was also suggested as the best opposite of god (good) by six participants, and fin (pretty), or perhaps (fine) in this context, by one participant, as the fields of beauty and goodness can become entangled in the field of merit. It is always possible to construe opposition with the help of context, even for words that do not seem to be in semantic opposition at all, and some
word pairs are good antonyms in certain contexts, but not in all, e.g. fin (pretty) and dålig (bad) are very good antonyms in the context of fruit and vegetables, whereas dålig (bad) and god (good) are often used about books. It is not possible to develop this further, since we did not control for context in this study.

In the cluster analysis, four clusters were predefined: all Canonical antonyms from the test set appear in Clusters 1 and 2, which are also closely related (see the dendrogram in Figure 2). We also find in these clusters some other canonical word pairs that were not part of the test set, such as vit–svart (white–black) and ond–god (evil–good), since we included in the cluster analysis all data for which we had bidirectional results, not just the word pairs in the test set. Clusters 3 and 4 were less closely related than the two previous clusters, and the two pairs of clusters were in turn related to each other.

A drawback of the elicitation method is that even though all words in the test set were included as stimuli, most of the suggested pairs were not part of the test set. Since we asked the participants for the best opposite, we do not find test items from the Synonym and Unrelated categories in the results. The experiment was self-paced and the test items were presented out of context. The possible effect of this is that the participants may have had time to construct their own scenarios for each word and may not always have written down the first opposite that came to mind.

The lack of control for context is also an issue for the polysemous items in the test set, such as lätt (light/easy), which is a member of two meaning dimensions and consequently forms two pairs: lätt–tung (light–heavy) and lätt–svår (easy–difficult). This also applies to god (good/tasty), which forms the pairs god–ond (good–evil) and god–äcklig (tasty–disgusting). There was a more or less equal number of responses connected to each meaning.
This experiment was not designed to determine whether the participants made conscious choices, or whether half of them had shorter access time to one meaning or the other, which would have helped in the analysis of the polysemous items.
6.3 Judgement experiment
The judgement experiment was performed online and involved 50 participants, who were asked to judge how good they thought each of the pairs in the test set was as a pair of antonyms. They made their judgements on an 11-unit scale, and since we expected the ordering of the pairs to have an impact, half of the participants were given the stimulus word pairs in one order and the other half in reverse order, i.e. Word1–Word2 and Word2–Word1.

Our expectations concerning order of sequence were built on markedness theory (e.g. Lehrer 1985 and Haspelmath 2006), according to which one member of an antonym pair is more natural, i.e. unmarked, than the other. Unexpectedly, the order of the words did not have a significant impact on the
result. Even though this result has interesting implications for markedness theory, that track is beyond the scope of this study, and we put all the data together in one batch, disregarding direction of presentation.

The general result of the judgement study, using all the data in one batch, was that the responses to the four predefined categories formed three significantly different groups: Canonical antonyms (M = 10.72), Antonyms (M = 6.82), and Synonyms (M = 1.61) and Unrelated (M = 1.92), which formed one group. We predicted significant differences in the judgements of all four groups, and this was confirmed for the Canonical antonyms and Antonyms, which are significantly different both from each other and from the Synonyms and Unrelated. Our prediction concerning the difference between Synonyms and Unrelated word pairs was disconfirmed: they were not judged to be significantly different with respect to degree of oppositeness. The results support the general hypothesis of this paper in that there is a group of canonical antonyms significantly different from non-canonical antonyms.

Herrmann et al. (1979) performed a judgement test using pen and paper and found that the word pairs in their test set could be ranked on a scale of ‘goodness of antonymy’. The results for the 11 word pairs translated from Herrmann et al.’s (1979) study and included in our study are consistent with their ranking (see Table 5). As in their study, ful–vacker (ugly–beautiful) tops the list as the best opposite word pair. Smutsig–fläckfri (filthy–immaculate) and trött–pigg (tired–alert) have traded places in the ranking. The main diverging result is framfusig–hövlig (bold–civil), which has a ranking of 1.74 on Herrmann et al.’s 5-unit scale but 8.4 on our 11-unit scale. Our intuitions agree with the participants’ judgement that the Swedish pair framfusig–hövlig (bold–civil) is actually a good pair of opposites. The reason for this discrepancy may be that the translation into Swedish does not match the English original in terms of the semantic dimension of the antonymy relation.

Table 8. Test items selected from Herrmann et al. (1979)

Word1         Word2       Translated from        Herrmann's   Score in        Semantic
                                                 score        present study   relation
ful           vacker      beautiful–ugly         4.90         10.64           C
smutsig       fläckfri    immaculate–filthy      4.62          9.36           A
trött         pigg        tired–alert            4.14         10.76           C
lugn          upprörd     disturbed–calm         3.95          9.28           A
hård          böjlig      hard–yielding          3.28          7.84           A
irriterad     glad        glad–irritated         3.00          6.56           A
sparsmakad    spännande   sober–exciting         2.67          2.76           A
overksam      nervös      nervous–idle           2.24          1.92           U
förtjusande   förvirrad   delightful–confused    1.90          1.84           U
framfusig     hövlig      bold–civil             1.57          8.40           A
vågad         sjuk        daring–sick            1.14          1.32           A

While our data seem to converge with Herrmann et al.’s (1979) results (see Table 5), they used theirs to support a non-dichotomous view of canonicity. In contrast, our results, obtained using similar methods, support a dichotomous view, since we do find a significant difference between Canonical antonyms and Antonyms.
6.4 Dichotomy vs. continuum
Both psycholinguistic experiments point in the same direction. In the elicitation experiment, the Canonical antonyms elicit one another to a larger extent than the Antonyms and are all found in Clusters 1 and 2. The cluster analysis is not confirmatory, but the result favours the dichotomy approach to ‘goodness of antonymy’. In the judgement experiment, the Canonical antonyms were judged significantly different from the Antonyms as a group. This confirms that there seems to exist a small group of opposite word pairs that are ‘better’ antonyms than others.

Focusing on the results for the Antonyms, we find clear indications of a continuum, namely the varying number of unique responses in the elicitation, which is reflected both in the ‘staircase’ form of Appendix B and in the slope of the bars at the back and the gradually growing ‘carpet’ in Figure 1, and the large dispersion of means for the Antonyms in the judgement test, varying between 1.68 and 10.32, also reflected in the large standard deviation (3.37). Our results for the Antonyms also validate Herrmann et al.’s (1979) study.

To conclude, there seems to be both a dichotomy and a continuum involved in the categorisation of ‘goodness of antonymy’. The Antonyms vary greatly in the degree of oppositeness they exhibit, while there is a small group of extremely good antonyms that are not dispersed on a continuum of oppositeness.
6.5 Methodological remarks
Three different methods have been used in the studies reported in this paper. The research process can be described as following cycles involving the researcher’s intuitions, knowledge from the literature, corpus-based research and intuitive data (Mönnink 2000), as discussed in Section 3. To this, we can add lexicographical data, since dictionaries and encyclopaedias were important sources when we constructed the test set, although this can also be viewed as a special case of knowledge from the literature.
Table 9. Research cycles of the reported studies in relation to Mönnink (2000)

1. Researchers' intuitions & knowledge from the literature: the research idea itself.
2. Corpus-driven methods: running Coco on all permutations of adjectives in the SUC shows that the canonical antonyms co-occur more often than non-canonical antonyms and other semantically related word pairs.
3. Researchers' intuitions & lexicographical data: selection of dimensions and canonical antonyms from the results of the previous step.
4. Lexicographical data: collecting all synonyms of each of the words among the canonical antonyms.
5. Corpus-driven methods: Coco suggests other significantly co-occurring word pairs as candidates for the test set.
6. Researchers' intuitions & lexicographical data & knowledge from the literature: manual categorisation of semantic categories.
7. Researchers' intuitions & lexicographical data & knowledge from the literature: selecting six word pairs from each semantic dimension.
8. Intuitive data from participants: elicitation experiment.
9. Researchers' intuitions & knowledge from the literature: analysis and interpretation of the results.
10. Intuitive data from participants: judgement experiment.
11. Researchers' intuitions & knowledge from the literature: analysis and interpretation of the results.
12. Researchers' intuitions & knowledge from the literature: bringing the results of the different studies together.
The choice of the test set starts out with a corpus study of the co-occurrence patterns of all possible combinations of adjective pairs in the SUC. In combination with our intuition, we picked out seven well-established semantic dimensions designated at the end poles by adjective pairs that co-occurred significantly more often than chance predicts. We used lexicographical data to pick out the Synonyms of each of the words from the previous step. Corpus-driven methods were used to suggest possible word pairs for the test set, and these were categorised manually by the researchers who then picked two Antonyms, two Synonyms and one Unrelated word pair for each dimension. The test set was randomised and used as a stimulus in the elicitation experiment, in which we collected intuitive
metalexical data from the participants. The data were analysed using statistical methods and the results were interpreted in relation to the literature and to the researchers’ intuitions. The cycle takes yet another turn in the judgement experiment, for which the test set was randomised for each participant, who judged the stimulus word pairs on ‘goodness of oppositeness’ according to their intuitions. The researchers’ intuitions, grounded in the literature, were used in the analysis of the statistical results of the judgement experiment as well as in bringing the different studies together. The main reason for using cycles in the research process in this way is that it gives us a diversified picture of the issue. In this study, we have made several turns which have provided us with a number of perspectives on the issue of ‘goodness of antonymy’.
7. Conclusion
The main goal of this paper was to combine three methods for the study of antonym canonicity and to report the results of experiments using Swedish data. We used corpus-driven methods to extract possible candidates for the test set from seven predefined semantic dimensions (speed, luminosity, strength, size, width, merit and thickness) and then picked one pair of Canonical antonyms, two pairs of Antonyms, two pairs of Synonyms and one Unrelated word pair from each dimension. We also included 11 word pairs from Herrmann et al. (1979) translated into Swedish. The test set was used as individual stimulus words in the elicitation experiment and as word pairs in the judgement experiment.

The elicitation experiment produced a curve from high participant agreement, i.e. all participants suggested the same opposite to the stimulus, to low participant agreement, i.e. participants suggested many different antonyms. The cluster analysis shows that there is a group of Canonical antonyms in the test set, while the Antonyms vary greatly in ‘goodness of antonymy’, which is reflected in the variation of unique response words. There were many polysemous items and we did not control for context, which is why it is not possible to draw any conclusions about directionality, i.e. whether it matters for ‘goodness of antonymy’ that the two words within a pair elicit one another as best opposites.

The judgement experiment also points to a group of Canonical antonyms significantly different from the Antonyms. Both were significantly different from the Synonyms and Unrelated word pairs, while, unexpectedly, the two latter categories were not significantly different from each other. Also unexpectedly, we found that the order of sequence (Word1–Word2 vs. Word2–Word1) did not have any significant effect on the results. While the result for the Canonical antonyms
is clear, the means for the Antonyms are spread out over most of the 11-unit scale used in the experiment. We interpret this as evidence that non-canonical antonyms are sensitive to ‘goodness of antonymy’ in a scalar fashion.

We have used and reported on a variety of methods to study ‘goodness of antonymy’: data-driven suggestions for the test set, manual semantic categorisation and final choice of test-set items, an elicitation experiment performed with paper and pencil, and a judgement experiment performed online. The study as a whole goes through several cycles of researchers’ intuitions, lexicographical data, participants’ intuitions and knowledge from the literature, which should vouch for scientific soundness. The variation of method gives a more complex and more complete picture of the issue. Both psycholinguistic experiments show that there is a small group of exceptionally good antonyms (Canonical antonyms) which are significantly different from the non-canonical antonyms, of which there is in principle an indefinite number. While we see a clear dichotomy between the canonical and non-canonical antonyms, the non-canonical antonyms vary greatly in degree of ‘goodness of antonymy’: there is a continuum as well as a dichotomy.

It has previously been postulated that the canonical antonyms are of great importance to the organisation of the vocabularies of languages (e.g. Fellbaum 1998), but it is equally interesting that virtually any two words can be construed as antonyms with contextual means (Paradis et al. 2009). Our next step is to move on to neurolinguistic methods to see if we can validate the existence of a group of canonical antonyms as well as a continuum of non-canonical antonyms using event-related potentials.
References
Cruse, Alan D. 1986. Lexical semantics. Cambridge: Cambridge University Press.
Fellbaum, Christiane (ed.). 1998. WordNet: An electronic lexical database. Cambridge, MA: MIT Press.
Field, Andy. 2005. Discovering statistics using SPSS. London: Sage Publications.
Francis, Gill. 1993. “A corpus-driven approach to grammar: Principles, methods and examples.” In Text and technology: In honour of John Sinclair, Mona Baker, Gill Francis and Elena Tognini-Bonelli (eds), 137–156. Amsterdam: John Benjamins.
Glynn, Dylan, Geeraerts, Dirk and Speelman, Dirk. 2007. “Entrenchment and social cognition. A corpus-driven study in the methodology of language description.” Paper presented at the first SALC (Swedish Association for Language and Cognition) conference at Lund University, 29 November – 1 December.
Gries, Stefan Th. and Divjak, Dagmar S. 2009. “Behavioral profiles: A corpus-based approach towards cognitive semantic analysis.” In New directions in cognitive linguistics, Vyvyan Evans and Stéphanie Pourcel (eds), 57–75. Amsterdam: John Benjamins.
Swedish opposites
Gross, Derek and Miller, Katherine J. 1990. “Adjectives in WordNet.” International Journal of Lexicography 3/4: 265–277.
Haspelmath, Martin. 2006. “Against markedness (and what to replace it with).” Journal of Linguistics 42/1: 25–70.
Herrmann, Douglas J., Chaffin, Roger J. S., Conti, Gina, Peters, Donald and Robbins, Peter H. 1979. “Comprehension of antonymy and the generality of categorization models.” Journal of Experimental Psychology: Human Learning and Memory 5: 585–597.
Jones, Steven. 2002. Antonymy: A corpus-based perspective. London: Routledge.
Jones, Steven, Paradis, Carita, Murphy, M. Lynne and Willners, Caroline. 2007. “Googling for opposites: A web-based study of antonym canonicity.” Corpora 2/2: 129–154.
Justeson, John S. and Katz, Slava M. 1991. “Co-occurrences of antonymous adjectives and their contexts.” Computational Linguistics 17: 1–19.
Justeson, John S. and Katz, Slava M. 1992. “Redefining antonymy: The textual structure of a semantic relation.” Literary and Linguistic Computing 7: 176–184.
Lehrer, Adrienne. 1985. “Markedness and antonymy.” Journal of Linguistics 21: 397–429.
Lehrer, Adrienne and Lehrer, Keith. 1982. “Antonymy.” Linguistics and Philosophy 5: 483–501.
Lundbladh, Carl-Erik. 1988. Adjektivets komparation i svenskan. En semantisk beskrivning. Lundastudier i nordisk språkvetenskap, A: 40. Lund University Press.
Lyons, John. 1977. Semantics. Cambridge: Cambridge University Press.
Muehleisen, Victoria. 1997. Antonymy and Semantic Range in English. Unpublished PhD diss. Evanston, IL: Northwestern University.
Murphy, M. Lynne. 2003. Semantic relations and the lexicon: Antonyms, synonyms and other semantic paradigms. Cambridge: Cambridge University Press.
Murphy, M. Lynne, Paradis, Carita, Willners, Caroline and Jones, Steven. 2009. “Discourse functions of antonymy: A cross-linguistic investigation.” Journal of Pragmatics 41/11: 2159–2184.
de Mönnink, Inge. 2000. On the move.
The mobility of constituents in the English noun phrase: A multi-method approach. Amsterdam/Atlanta, GA: Rodopi.
Oakes, Michael P. 1998. Statistics for corpus linguistics. Edinburgh: Edinburgh University Press.
Paradis, Carita. 2003. “Is the notion of linguistic competence relevant in Cognitive Linguistics?” Annual Review of Cognitive Linguistics 1: 247–271.
Paradis, Carita. 2005. “Ontologies and construal in lexical semantics.” Axiomathes 15: 541–573.
Paradis, Carita and Willners, Caroline. 2007. “Antonyms in dictionary entries: Methodological aspects.” Studia Linguistica 61/3: 261–277.
Paradis, Carita, Willners, Caroline and Jones, Steven. 2009. “Good and bad opposites: Using textual and experimental techniques to measure antonym canonicity.” The Mental Lexicon 4/3: 380–429.
Storjohann, Petra. 2005. “Corpus-driven vs. corpus-based approach to the study of relational patterns.” In Proceedings of the Corpus Linguistics Conference 2005 in Birmingham. Vol. 1, no. 1. Birmingham: University of Birmingham. http://www.corpus.bham.ac.uk/conference2005/index.htm
Tognini-Bonelli, Elena. 2001. Corpus Linguistics at Work. [Studies in Corpus Linguistics 6]. Amsterdam: John Benjamins.
Willners, Caroline. 2001. Antonyms in Context. A Corpus-based Semantic Analysis of Swedish Descriptive Adjectives. Travaux de l’Institut de Linguistique de Lund 40. Lund, Sweden: Dept. of Linguistics, Lund University.
Willners, Caroline and Holtsberg, Anders. 2001. “Statistics for sentential co-occurrence.” Working Papers 48: 135–148. Lund, Sweden: Dept. of Linguistics, Lund University.
Woods, Anthony, Fletcher, Paul and Hughes, Arthur. 1986. Statistics in language studies. Cambridge: Cambridge University Press.
Appendix A
Translations of the Swedish words included in the test set.

Canonical antonyms:
långsam–snabb (slow–fast); ljus–mörk (light–dark); svag–stark (weak–strong); liten–stor (small–large); smal–bred (narrow–wide); dålig–bra (bad–good); tunn–tjock (thin–thick)

Antonyms:
långsam–flink (slow–swift); tråkig–het (dull–hot); vit–dunkel (white–obscure); melankolisk–munter (melancholic–merry); lätt–muskulös (light–muscular); senig–kraftig (sinewy–sturdy); obetydlig–kraftig (insignificant–sturdy); liten–väldig (small–enormous); smal–öppen (narrow–open); trång–rymlig (tight–spacious); dålig–god (bad–fair); ond–bra (evil–good); genomskinlig–svullen (transparent–swollen); fin–grov (fine–rough)

Synonyms:
långsam–släpig (slow–draggy); snabb–rask (fast–rapid); ljus–öppen (light–open); mörk–svart (dark–black); svag–matt (weak–flat); stark–frän (strong–sharp); stor–inflytelserik (big–influential); liten–oansenlig (small–insignificant); smal–spinkig (narrow–skinny); bred–kraftig (wide–sturdy); dålig–låg (bad–low); bra–god (good–fair); tunn–spinkig (thin–skinny); tjock–kraftig (thick–sturdy)

Unrelated:
het–plötslig (hot–sudden); dyster–präktig (sad–decent); flat–seg (abashed–tough); klen–kort (delicate–short); liten–tjock (small–thick); fin–tokig (fine–crazy); knubbig–tät (chubby–dense)
Appendix B Results from the elicitation experiment. The stimulus words are in bold and the responses are listed to the right of each stimulus. Non-answers are marked with “0”. bra dålig liten stor ljus mörk låg hög mörk ljus sjuk frisk smutsig ren stor liten vacker ful dålig bra frisk kort lång långsam ond god snäll smal tjock bred stark svag klen svag stark klar tunn tjock bred öppen stängd sluten bred smal tunn snäv långsam snabb fort seg lätt tung svår svårt svart vit gul vital tjock smal mager tunn vit svart gul mörk trött pigg utvilad vaken energisk fin ful grov ofin trasig dålig ful snygg vacker fin söt regelmässig glad ledsen sur arg trumpen sorgsen hård mjuk svag god slapp lös klen stark kraftig muskulös stabil grov rejäl rymlig trång liten orymlig snäv tråkig kompakt snabb långsam slö sen sakta sölig senfärdig dunkel ljus klar upplyst tydlig genomskinlig stark ljusaktig god ond äcklig dålig elak oaptitlig dum illa pigg trött slö sömnig långsam lat sjuk seg böjlig stel oböjlig styv rak fast hård utjämnad stum het kall sval mesig kylig mjuk mild svag ljum hövlig ohövlig oartig otrevlig oförskämd läskig klumpig ohyfsad ofin nervös lugn säker självsäker stabil cool rofylld sansad modig trång rymlig vid bred lös smal stor öppen volumiös väldig liten obetydlig minimal oansenlig pyttig jätteliten pytteliten oväldig knubbig smal spinkig tunn slank mager långsmal benig trind gänglig overksam verksam aktiv driftig flitig företagsam rastlös rask igång hyperaktiv
rask långsam slö sölig trög sen sjuk släpig saktfärdig senfärdig spännande tråkig ospännande ointressant trist enahanda menlös obetydlig likgiltigt långtråkig odramatisk tråkig rolig kul utåtriktad trivsam spännande skojig underhållande gladlynt intressant munter upprörd lugn samlad oberörd sansad glad tillfreds cool vänlig balanserad behärskad fläckfri fläckig smutsig ren befläckad obefläckad besmutsad besudlad sjavig dunkel fläckad kriminell framfusig tillbakadragen blyg försynt avvaktande återhållsam reserverad nervös feg blygsam diskret seg muskulös spinkig klen tanig svag senig tunn otränad smal omuskulös musklig mager dyster glad munter uppåt rolig sprallig färgglad upprymd lycklig gladlynt uppsluppen pigg uppspelt genomskinlig ogenomskinlig tät synlig färgad matt täckande solid tjock dunkel mörk opak 0 grov fin tunn klen smal len späd finlemmad slät gles mjuk tanig finkornig matt blank pigg glansig klar stark kraftfull spänstig svag vital rapp energisk livlig obetydlig betydlig betydelsefull viktig inflytelserik betydande ansenlig avsevärd synlig anmärkningsvärd framträdande värdefull stor släpig snabb rask kvick rapp rak företagsam osläpig upprätt smidig rinnande stillastående pigg spinkig tjock kraftig knubbig fet rund muskulös grov mullig korpulent fyllig smal välbyggd kraftig smal svag tunn klen lätt liten spinkig mager mesig kraftlös trind slank tanig flink långsam klumpig trög oflink slö fumlig lat ofärdig klumsig ohändig sävlig tafatt senfärdig osäker irriterad lugn glad nöjd oirriterad balanserad samlad tålamod snäll oberörd tålig tillfreds vänlig tålmodig välvillig behärskad munter dyster butter sur nedstämd ledsen deppig trumpen uppåt arg glad missmodig melankolisk sorgsen trist munter pessimistisk tät gles otät genomskinlig ihålig tunn fattig brett dragig hålig utspridd luftig grov spretig lös lucker öppen inflytelserik obetydlig betydelselös maktlös inflytelselös oansenlig obetydelsefull opåverkbar oansvarig försumbar ickeinflytelserik opåverkad 
menlös utanför neutral obetydande mjäkig tokig klok normal sansad frisk lugn smart redig tråkig vanlig glad mentalt frisk förståndig beräknelig vettig balanserad rolig resonlig rätt förtjusande hemsk motbjudande otrevlig ful avskyvärd förskräcklig äcklig förfärlig osympatisk odräglig vämjelig fördjävlig skurkaktig intetsägande gräslig vidrig frånstötande ocharmig trist förvirrad klar samlad koncentrerad sansad redig säker klarsynt medveten lugn organiserad närvarande självsäker fokuserad orienterad ha koll kontrollerad vettig insiktsfull harmonisk senig muskulös mörkraftig knubbig tjock biffig rultig slapp korpulent tunn flexibel mjuk svag plöfsig rulltig stark osenig grov mullig fläskig frän god mild mjuklen söt ofrän behaglig smaklös svag ljuv frisk töntig blygsam luktfri hövlig normal lågmäld gammalmodig neutral smakfull oansenlig ansenlig iögonfallande betydande viktig prålig uppseendeväckande berömd attraktiv färgstark märkbar synlig dominant anständig extraordinär mycket betydelsefull praktfull vacker formidabel trist frappant plötslig långsam väntad långdragen förutsägbar planerad förberedd förutbestämd förväntad besinnad sävlig oplötslig successiv utdragen lugnt konstant varsel ihållande väntat långvarig efteråt senare seg
lugn orolig stirrig nervös upprörd stressad hispig aktiv stökig sprallig upphetsad uppjagad stimmig störig stressig intensiv labil fartfylld temperamentsfull uppspelt energisk rörig olugn upptrissad melankolisk glad munter gladlynt lycklig uppspelt omelankolisk närvarande uppåt gladsint upphetsad framåt euforisk nöjd lättsam glättig dramatisk vaken skärpt förnöjsam vigorös uppsluppen livfull 0 svullen normal slät osvullen smal tunn jämn platt utmärglad hopdragen ihopkrympt dämpad spänstig avtärd ihopsjunken klen spinkig slank tanig insjunken liten flat mager 0 vågad feg ovågad försiktig blyg tillbakadragen städad säker rädd riskfri konservativ rak oansenlig präktig snäll lagom uniform återhållsam trygg sober diskret reserverad pryd flat rund bucklig djup ojämn bubblig kuperad vågig tjock gropig kurvig bred böjd betydlig omedgörlig framhållande hög yppig oflat rant buktig bullig bergig spetig knagglig 0 seg mör pigg mjukoseg fast porös flytande snabb spröd fartfull tunnflytande lättflytande oböjlig hård rask lätt skör svag aktiv alert stel brytbar vek hyperaktiv rolig slapp lös böjlig präktig slarvig opräktig ödmjuk dålig lössläppt spännande tunn oförskämd osedlig lättsam vågad anspråkslös vulgär busig okonventionell opålitlig opretentiös bräcklig ohederlig vild slafsig slapp bohemisk angenäm lös bondig upprorisk frivol osäker 0 sparsmakad prålig överdådig generös överdriven extravagant vräkig plåttrig svulstig plottrig nöjd vidlyftig slösig rik expansiv kravlös osparsmakad slösaktig vågad glupsk tilltagen flådig vulgär utförlig stark vrusig mycket påkostad provsmakad explosiv överväldigande allätare enkel 0
Using web data to explore lexico-semantic relations
Steven Jones
This paper reports on web-as-corpus research that seeks to explain why some semantically opposed word pairs have special status as canonical antonyms (for example: cold-hot), while other pairs do not (icy-scorching, cold-fiery, freezing-hot, etc.). In particular, it reports on the findings of Jones, Paradis, Murphy and Willners (2007), and extends their retrieval procedure to include the previously overlooked ‘ancillary’ function of antonymy (Jones 2002). The primary assumptions are that a language’s most canonical ‘opposites’ can be reasonably expected to co-occur with highest fidelity in those constructions associated most closely with the key discourse functions of antonymy, and that, given their low frequency in language, an extremely large corpus is needed in order to identify such patterns of co-occurrence.
1. Introduction
As Storjohann notes, “paradigmatic relations have often been strictly differentiated from syntagmatic relations” (2007: 11). However, among the characteristics that give antonyms their “unique fascination” (Cruse 1986: 197) is a tendency to flout this distinction (Murphy 2006: 3). This tendency arises because antonym pairs, though clearly interchangeable for one another in text, are also excellent collocates, co-occurring intra-sententially 8.6 times more often than chance would allow (Justeson and Katz 1991: 142). As a result, the antonym relation offers a unique opportunity to assess the paradigmatic status of a lexico-semantic relation by investigating its syntagmatic realisation. In particular, this paper explores the
1. Justeson and Katz (1991) base their antonym co-occurrence rate on the one-million-word Brown corpus; replicating their calculations on a 280-million-word corpus, Jones estimates the figure to be 6.6 (see Jones 2002: 115–116 for a more detailed comparison).
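The chance-baseline arithmetic behind such co-occurrence rates can be made concrete with a small sketch (illustrative only; the toy sentences and the simple independence baseline are assumptions of this example, not the actual method of Justeson and Katz):

```python
# Sketch of an observed/expected sentential co-occurrence ratio of the
# kind reported by Justeson and Katz (1991). The six toy "sentences"
# (modelled as word sets) are invented; real studies use full corpora.

def cooccurrence_ratio(sentences, word_a, word_b):
    n = len(sentences)
    in_a = sum(1 for s in sentences if word_a in s)
    in_b = sum(1 for s in sentences if word_b in s)
    both = sum(1 for s in sentences if word_a in s and word_b in s)
    expected = in_a * in_b / n  # co-occurrence expected under independence
    return both / expected

sentences = [
    {"the", "news", "was", "good", "and", "bad"},
    {"good", "points", "and", "bad", "points"},
    {"a", "good", "day"},
    {"a", "bad", "day"},
    {"nothing", "to", "see"},
    {"still", "nothing"},
]
# good and bad share 2 sentences against 1.5 expected by chance:
print(cooccurrence_ratio(sentences, "good", "bad"))
```

A ratio above 1 indicates that the pair shares sentences more often than its individual frequencies would predict.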
concept of antonym canonicity (drawing on Jones et al. 2007), by examining and exploiting the frames most commonly occupied by co-occurring pairs. Using a 280-million-word corpus of written English, Jones (2002) proposed a classification system for antonym pairs based not on their innate paradigmatic properties, but on their semantic and pragmatic functions in discourse. This taxonomy was subsequently developed by Jones (2006), using spoken English; Murphy and Jones (2008), looking at children’s language; Murphy, Paradis, Willners and Jones (2009), drawing on comparisons with Swedish; and Muehleisen and Isono (2009), looking at Japanese. Collectively, these studies demonstrate that the ways in which antonyms co-occur are regular, and that pairs distribute predictably among a limited number of functional categories. This paper will begin by outlining these discourse functions; it will then report on the finding of Jones et al. (2007) that some of the frames most closely associated with these functions can be fruitfully searched in large corpora to quantify the relative canonicity of various pairs; finally, it will demonstrate that the methodological principles upon which that study is based can be extended to incorporate frames typically associated with the commonest discourse function of all, Ancillary Antonymy (Jones 2002: 45–60).
2. Discourse functions of antonymy
This section will outline the nine discourse functions of antonymy, beginning with the two major categories – coordinated and ancillary – and moving on to the seven minor categories. What follows is an exemplification of these categories and a summary of their distribution within five differently composed corpora. All examples used in this section are taken from a sixth corpus – the Time Magazine corpus, a 100-million-word database of American English (1923–2006) – and each function is illustrated using the high-frequency, well-established pair bad/good.
2.1 Ancillary Antonymy
The most common use of antonyms in language is to draw attention to a nearby contrast (Jones 2002: 45–60). In such contexts, the ‘A-pair’ (i.e. the antonym pair) demands that we interpret the ‘B-pair’ (a second, parallel pair of words or phrases) in a contrastive fashion, regardless of whether or not it holds any element of innate opposition. So, in the first example below, the B-pair contrast between competition and restraint of trade is enforced by its proximity to the more established, A-pair opposition holding between good and bad. When antonyms co-occur, the chances of another pair benefiting from their opposition in this way are about 40%.

(1) European nations had legally accepted the principle that competition is good, restraint of trade bad.
(2) When it was good it was very good, when it was bad it was awful.
(3) The secret of all good dressing is to highlight your good points and play down the bad points.

2. Canonical antonym pairs are here defined as those “associated by convention as well as by semantic relatedness” (Murphy 2006: 7).
3. http://corpus.byu.edu/time/.
Ancillary Antonymy plays an important role in language, from allowing young children to sort their experience in terms of binary oppositions (see Murphy and Jones 2008 for evidence that the discourse function is used widely during the language acquisition phase and found in the speech of children as young as two years old) to allowing journalists to impose in-groups and out-groups on society (see Davies 2007 for examples of ancillary opposition being used to encode ideologies in political writing).
2.2 Coordinated Antonymy
In approximately one third of all contexts in which antonyms co-occur, their function is to create a sense of inclusivity or to exhaust a particular semantic scale. This can be achieved using a variety of lexico-semantic frames, as the following examples show:

(4) As a housewarming for its new museum, the National Academy ransacked the 2,000 art works, good and bad, it had accumulated during the past 116 years,
(5) Novels about theatre people, good or bad, have one thing in common: they delight those who are fascinated by the theatre; they bore those who are not.
(6) Pools, specialists, and short selling all have their good points as well as their bad points and rather than abolish them by law the stock exchange authority should try to prevent their misuse.
Note that in each of the contexts above, the antonyms represent something more than two discrete points on a semantic scale. Rather, it is all theatre people that have one thing in common, all 2,000 art works that were ransacked, etc. In such contexts, bad and good symbolise the entirety of their scale.
Table 1. Minor classes of antonymy (adapted from Jones 2002 and Jones/Murphy 2005)

Comparative Antonymy (in which a concept is measured in terms of its relative position on a semantic scale): (7) he finally signed the 110-page document with the resigned comment: “This measure contains more good than bad.”

Distinguished Antonymy (in which attention is paid, metalinguistically, to the difference holding between the antonym pair): (8) He urged his colleagues to rely more on their audience’s ability “to distinguish between good and bad.”

Transitional Antonymy (in which a change from one antonym state to another is highlighted): (9) “I set out to find where these kids were getting the guns,” Larson recalls. “So I looked for a story that showed how weapons were diverted from good guys to bad guys.”

Negated Antonymy (in which one antonym is negated in order to bolster the other): (10) This was good, not bad, news to 250 delegates from 51 nations assembled in Rome.

Interrogative Antonymy (in which a choice between two antonyms is offered): (11) Growing dependence on that morning caffeine jolt has made the U.S. one of the biggest coffee consumers in the world, swallowing about one-third of the world’s coffee production. Is that good or bad? Hard to tell.

Idiomatic Antonymy (in which antonyms form part of a familiar proverb, cliché or idiom): (12) Paying for this claptrap is like paying for the Viet Nam War, which is one long, idiotic history of throwing good money after bad.

Extreme Antonymy (in which the two end-points of an antonym scale are united and set up in contrast against the centre ground of the scale): (13) Many, the very good and the very bad, force or insinuate themselves into the imagination.
2.3 Minor categories
The seven minor discourse functions of antonymy are recorded in Table 1, each accompanied by a brief gloss and an illustrative context from the Time Magazine corpus. Collectively, these minor classes comprise approximately 20–25% of all antonym co-occurrence. Over 90% of all contexts in which antonyms co-occur can be assigned to one or other of the nine categories outlined above. The remaining contexts defy classification because the antonym pair therein functions in a novel or inventive way and cannot therefore be generalised about.

4. The category of Interrogative Antonymy was added to the taxonomy in Jones and Murphy (2005), having previously not occurred in sufficient quantities to qualify as anything other than a residual category. It is clearly a feature of spoken language more than written and it should be noted that “while these questions have a superficial resemblance to Coordinated Antonymy, in that the antonyms are joined by a conjunction … the Interrogative framework is truly disjunctive, in that the answer to the question posed must be one or the other of the antonyms” (Jones/Murphy 2005: 405–406).
2.4 Distribution of discourse functions
Figure 1 shows the distribution of the nine discourse functions across a range of corpora. The purpose of this figure is to demonstrate that the roles served by antonym pairs in different modes (e.g. speech and writing), different registers (children’s discourse and adult discourse) and even different languages (English and Swedish) are broadly similar. This is because the two dominant classes are Ancillary Antonymy and Coordinated Antonymy in all corpora. Amongst the minor classes, frequency rates are more variable. For example, the interrogative function is popular with adults when addressing children (“Is that a big table or a little table?”) while the comparative function is more common in written
Figure 1. Distributions of antonym discourse functions across five domains (drawing on Jones 2002; Jones 2006; Murphy and Jones 2008; and Murphy, Paradis, Willners, Jones 2009)
5. See Jones (2002: 95–102) for a discussion of the ‘residual’ functions of antonymy.
English than spoken (“one has to be more pessimistic than optimistic”). In all five corpora, some contexts were found to resist categorisation in terms of any established category (between 3.4% and 11%) and were therefore assigned to a residual class. They are reflected in the right-most columns of Figure 1. Given the similarity with which antonym discourse functions are distributed across very different corpora, it is reasonable to expect any well-established, lexically enshrined pair to serve most or all of these functions at relatively high rates. This raises the possibility that canonicity may therefore be measurable according to the proportion of times that pairs co-occur in those lexico-syntactic frameworks most closely associated with these functions. The next stage of this paper reports on the methods and findings of Jones et al. (2007) in their investigation of this hypothesis.
3. Antonym Canonicity: A web-as-corpus approach
Two views about the issue of antonym canonicity emerge in the literature. The first treats canonical and non-canonical pairs as independent groupings, arguing that a clear dividing line separates those pairs that have recognised status as ‘opposites’ within a language from those items that simply fall on opposing halves of a given semantic scale (see, for example, Gross, Fischer and Miller 1989). The second view sees canonicity itself as a scale, with some pairs being strongly antonymic and others representing weaker forms of the phenomenon. Inherent to the latter position is an acceptance that not all semantic scales will have one pair of words that can neatly be labelled as ‘canonical’; some scales may have more than one high-canonicity pair (e.g. temperature: cold/hot and cool/warm) while other scales may only have lower-canonicity pairs (e.g. intelligence: clever/stupid, brainy/foolish, etc.). This viewpoint is favoured by Herrmann, Chaffin, Conti, Peters and Robbins (1979) and Murphy (2003), amongst others. In order to bring an empirical dimension to the debate, Jones et al. (2007) presented corpus evidence against which the findings of psycholinguistic research could be compared, and suggested new, usage-based criteria for measuring canonicity. Initially, seven frames were identified that, based on co-occurrence evidence from a number of studies (including Justeson and Katz 1992; Mettinger 1994; and Jones 2002), could reasonably be expected to house antonym pairs at very high rates. Once again, examples are taken from the Time Magazine corpus.
6. See Jones/Murphy (2005) and Jones (2006) respectively.
X and Y alike: (14) We were free to see good and bad alike
both X and Y: (15) The news was both good and bad for Chevy
either X or Y: (16) He tends to think things are either good or bad
whether X or Y: (17) Everyone … is entitled to the news, whether good or bad
from X to Y: (18) journey leads to Paris, where Actress Lansing goes from good to bad
X versus Y: (19) Its pages fairly bulge with pictures of good v. bad taste
between X and Y: (20) the difference between good and bad jazz was worth … consideration
The first four frames associate most closely with Coordinated Antonymy; the fifth usually serves a transitional function and the seventh a distinguished function; the sixth frame represents the residual group of sentences, particularly those that express some form of conflict (see Jones 2002: 95). Having settled on these frames, the next step was to begin ‘priming’ them (i.e. inserting a series of lexical items into the X- and Y-positions) with adjectives taken from various semantic scales. Ten adjectives were selected at random from a set of fifty used in an antonym elicitation test designed by Paradis, Willners, Löhndorf and Murphy (2006). The advantage of using these ‘stimulus’ adjectives is that something is already known of their antonym status, namely with which ‘opposites’ they were intuitively paired by informants in the experiment. As well as the ten stimulus words, the most common response for each (see Table 2) was also used. Note that the choice of antonym pair, though intuitive, does not introduce human subjectivity into the process. The claim is not that the pairs listed in Table 2 are canonical; rather, it is that they are appropriate starting points to identify canonical relations on each scale. With ten ‘seed pairs’ selected, the next stage was to conduct a series of Google searches, placing each of the twenty adjectives in each of the fourteen frames.

“X and * alike”  “* and X alike”
“both X and *”  “both * and X”
“either X or *”  “either * or X”
“whether X or *”  “whether * or X”
“from X to *”  “from * to X”
“X versus *”  “* versus X”
“between X and *”  “between * and X”
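The priming step can be sketched programmatically (a minimal illustration, not the authors' code; the `primed_queries` helper and the brace-template format are invented here, but the seven frames are those listed in the text):

```python
# Sketch of frame priming: each seed adjective is slotted into both the
# X- and Y-position of the seven frames, yielding the fourteen wildcard
# query strings of the kind submitted to a web search engine.

FRAMES = [
    "{X} and {Y} alike",
    "both {X} and {Y}",
    "either {X} or {Y}",
    "whether {X} or {Y}",
    "from {X} to {Y}",
    "{X} versus {Y}",
    "between {X} and {Y}",
]

def primed_queries(seed):
    queries = []
    for frame in FRAMES:
        queries.append('"%s"' % frame.format(X=seed, Y="*"))  # seed in X-position
        queries.append('"%s"' % frame.format(X="*", Y=seed))  # seed in Y-position
    return queries

qs = primed_queries("soft")
print(len(qs))   # fourteen queries per seed word
print(qs[6])     # '"whether soft or *"'
```

Running the full study then amounts to iterating this over all twenty seed adjectives and collecting whatever fills the wildcard slot.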
Table 2. Stimulus and response words from Paradis, Willners, Löhndorf and Murphy (2006); each stimulus word and its most common response served as initial seed adjectives in the Jones et al. (2007) study. Response counts are given in parentheses.

beautiful: ugly (50)
poor: rich (50)
open: closed (40), shut (10)
large: small (48), little (1), slight (1)
rapid: slow (47), sluggish (2), fast (1)
exciting: boring (36), dull (13), unexciting (1)
strong: weak (47), feeble (1), mild (1), slim (1)
wide: narrow (45), thin (3), skinny (1), slim (1)
thin: fat (35), thick (13), overweight (1), wide (1)
dull: bright (28), exciting (10), interesting (8), shiny (2), lively (1), sharp (1)
Using web text as a corpus and an internet search engine as a concordancer is problematic in many ways, some of which will be discussed in the final section of this paper. However, as an illustration of the richness of the data generated, the following concordance shows a sample of the output for a web search on “whether X or *” using soft in X-position. Soft was not one of the words used in the Jones et al. (2007) study; the concordance below is simply indicative of the kind of results found. whether soft or loud, the guitar on this record is clear and crisp and really “strikes a chord” the type of dish one is preparing determines whether soft or firm tofu should be used. The ensuring that transactions will meet their set deadlines (whether soft or hard) could be as to develop a FEPs database related to argillaceous formations, whether soft or indurated various forms of pornographic media that render them all, whether soft or hard, pornography. [DISCOUNTED] would tint your words always with its power, whether soft or with the edge of If you post explicit materials (porn) whether soft or hard in this group, you WILL be banned! drugs whether soft or hard are dangerous to society. We have campaigns about speeding, whether soft or hard copy, most companies find that once they begin flow-charting their what I like: light, texture, angles, point of view, composition, focus whether soft or sharp, But how does the water (whether soft or hard, rich or poor in minerals) impact the taste? Her voice, whether soft or soaring, retains a warmth and a soulfulness capable of entrancing I love Cornell’s vocal range – I think he’s great whether soft or loud. 
I’m glad Audioslave gave [DISCOUNTED] interested in seeing whether soft OR could support rapid assessments, It is important to bear in mind that whether soft or hard packing is utilised, the size of the Nepalese clinical psychologists have stated that depression, whether soft or severe, has an the Italian Riviera has its own special focaccia. Whether soft or crisp, thick or thin, the dough Whether soft or gritty the passion she sings with comes through in each song loud and clear. at either a 33% or 37% water content depending on whether soft or stiff soil conditions were [DISCOUNTED] As with any new skill, whether soft or not, there is always knowledge that
This is true for old tissue (whether soft or hard) and also for blood residues found on items of the crust – whether soft or crunchy – was the best part. Just as she began to tell the fellow When cervical collars are used, whether soft or hard collars, they may set up the scene for whether soft or forceful, whether fast or slow, he interprets these songs as well as anyone whether soft or hard, you’ll hear and, if you’ve got the force feedback feature turned on, feel
Some individual contexts were discounted by Jones et al. (2007) on the grounds that the word retrieved in Y-position (symbolised by an asterisk in Google notation) was not adjectival. However, most of the Y-position output in the concordance is clearly contrastive with soft, including a number of lower-frequency, context-specific terms (indurated, soaring, gritty, etc.) which, though having few canonical properties, confirm that the frames used are indeed housing oppositional pairs and that the retrieval procedure can therefore be considered effective. The data above relate to one search on one frame. Once thirteen further searches had been undertaken for soft (one in both the X- and the Y-position for each of the seven frames), the results were aggregated by Jones et al. (2007) to give an overall picture of which words are the most common textual antonyms of soft. This information is replicated in Table 3.

Table 3. Aggregated distribution of antonyms retrieved by soft in at least 0.1% of contexts (from Jones et al. 2007)

SOFT (3,887 hits)
hard      1,864   47.95%
loud        228    5.87%
firm         95    2.44%
rigid        86    2.21%
stiff        54    1.39%
tough        16    0.41%
medium       15    0.39%
strong       14    0.36%
crisp        13    0.33%
crunchy      12    0.31%
heavy        11    0.28%
harsh         8    0.21%
sharp         5    0.13%
bony          4    0.10%

The first conclusion that can be drawn from examining the adjectives generated by web searches on soft is that the frames used are, collectively, a very reliable indicator of opposition. Of the top 14 items occupying Y-position most frequently, only medium does not have direct semantic incompatibility with soft. The results also confirm that the primary textual antonym of soft is hard. In
Steven Jones
In almost half of all searches undertaken, soft and hard were paired together. No other adjective occupies Y-position at a rate of more than 6%, indicating that hard is a relatively faithful antonym of soft. The second-placed term, loud, is found more in contexts relating to volume of noise than to sturdiness (therefore reflecting the polysemy of soft), while the lower-ranking adjectives tend to distribute evenly between these two senses (with rigid, stiff, etc. relating to sturdiness and crisp, crunchy, etc. relating more to noise and texture).
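The aggregation step just described (collapsing the per-frame hit counts into a single ranked distribution with percentages, as in Table 3) can be sketched in a few lines of Python. This is an illustrative reconstruction rather than the Jones et al. (2007) code; the function name, the input format and the counts below are my own toy assumptions.

```python
from collections import Counter

def aggregate_antonyms(frame_hits):
    """Merge per-frame retrieval counts into one ranked distribution.

    frame_hits: one dict per frame search, mapping each word retrieved
    in Y-position to its hit count (hypothetical input format).
    Returns (word, count, percentage) tuples, most frequent first.
    """
    totals = Counter()
    for hits in frame_hits:
        totals.update(hits)
    grand_total = sum(totals.values())
    return [(word, n, round(100 * n / grand_total, 2))
            for word, n in totals.most_common()]

# Toy counts standing in for two of the fourteen searches:
searches = [{"hard": 900, "loud": 100}, {"hard": 964, "firm": 95}]
ranked = aggregate_antonyms(searches)
```

The same routine, fed all fourteen searches, would yield a ranking of the kind shown in Table 3.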
Table 4. Ranked list of adjectives retrieved by search word in ten frames or more (from Jones et al. 2007)

     Seed word      Retrieved adjective      %   Frames  Contexts  Reciprocal?
 1.  large      →   small                78.76   14      4361      Y
 2.  rich       →   poor                 67.94   13      4209      Y
 3.  closed     →   open                 57.13   12      2271      Y
 4.  small      →   large                53.55   14      3001      Y
 5.  weak       →   strong               48.41   13      2019      Y
 6.  poor       →   rich                 44.02   14      2193      Y
 7.  slow       →   fast                 43.65   13      1625      Y
 8.  open       →   closed               37.45   10      2240      Y
 9.  strong     →   weak                 36.06   12      1504      Y
10.  narrow     →   wide                 34.76   13       918      Y
11.  thin       →   thick                33.60   14       994      Y
12.  bright     →   dark                 27.02   12       861      Y
13.  wide       →   narrow               26.04   13       887      Y
14.  narrow     →   broad                17.42   11       460      Y
15.  rapid      →   slow                 12.99   10       346      N
16.  ugly       →   beautiful            10.95   14       323      Y
17.  beautiful  →   ugly                 10.87   14       374      Y
18.  thin       →   fat                   9.13   11       270      Y
19.  small      →   big                   8.87   12       497      Y
20.  bright     →   dim                   8.25   11       263      Y
21.  open       →   laparoscopic          7.56   10       452      Y
22.  fat        →   thin                  5.65   11       246      Y
23.  fat        →   lean                  3.79   10       165      Y
24.  dull       →   bright                3.73   11       103      Y
25.  poor       →   wealthy               3.27   10       163      Y
26.  bright     →   dull                  3.11   11        99      Y
27.  exciting   →   boring                2.29   10        54      N
28.  fat        →   skinny                1.63   11        71      Y
29.  boring     →   interesting           1.53   12        63      N
(Reciprocal-direction figures for certain pairs, given as %; frames; contexts: (50.00; 13; 1781), (69.72; 13; 2229), (4.24; 10; 186), (21.71; 12; 705), (5.24; 7; 195), (53.64; 13; 2856), (27.73; 13; 475), (59.98; 13; 1175), (8.61; 12; 210), (37.88; 11; 899), (1.53; 9; 63), (13.15; 11; 88), (1.69; 7; 53).)
Using web data to explore lexico-semantic relations
The procedure used to generate the above results for soft was replicated by Jones et al. (2007) for all twenty seed words. In each case, the proportion in which every possible antonym was retrieved by the seed word was calculated. Table 4 records these percentages in descending order, showing that large’s retrieval of small (78.76%) was the most robust in the study. In order for a pairing to be listed in Table 4, the Y-position adjective needed to be retrieved by the seed word at least twice in at least ten of the fourteen frames. This threshold was set arbitrarily by Jones et al. (2007), but they argued that breadth of co-occurrence was a higher priority than depth in terms of assessing canonicity. In other words, though high frequency may be a consequence of canonicity, it is not a cause; rather, the canonicity of individual pairs derives from their extensive coverage across a range of appropriate contexts. In some instances, the seed word retrieved antonyms other than its intuitively-paired ‘opposite’ (slow retrieved fast, thin retrieved thick, etc.). Where this happened, the new adjective (italicised in Table 4) was put through exactly the same research procedure as the seed word in order to ascertain whether any further potential pairings on the scale could be identified. In this way, Jones et al. (2007) were able to be reasonably confident that no potential canonical pairs had passed unnoticed. Finally, note that the right-most column of Table 4 indicates whether each pairing is reciprocal. Jones et al. (2007) took the view that in order for a pair to be considered canonical, the search criteria should apply in both directions. This resulted in a total of nineteen pairs being eligible, as recorded in Table 5. Some evidence in support of the view that antonym canonicity is a phenomenon which itself operates along a scale is provided in Table 5.
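The eligibility test described above (retrieved at least twice in at least ten of the fourteen frames, in both directions) is straightforward to state as code. The sketch below is my own rendering of those criteria; the data structure, the function names and the per-frame counts are invented for illustration.

```python
def meets_criteria(per_frame_counts, min_frames=10, min_per_frame=2):
    """True if the antonym was retrieved at least `min_per_frame` times
    in at least `min_frames` of the frames searched."""
    qualifying = sum(1 for n in per_frame_counts if n >= min_per_frame)
    return qualifying >= min_frames

def is_canonical(pair, retrievals, **kw):
    """Canonical only if the criteria hold in both directions:
    seed -> antonym and antonym -> seed."""
    x, y = pair
    return (meets_criteria(retrievals.get((x, y), []), **kw)
            and meets_criteria(retrievals.get((y, x), []), **kw))

# Invented per-frame counts for two pairs from Table 4:
retrievals = {
    ("large", "small"): [300] * 14,
    ("small", "large"): [200] * 14,
    ("rapid", "slow"): [30] * 10,  # rapid retrieves slow well enough...
    ("slow", "rapid"): [1] * 14,   # ...but slow rarely retrieves rapid
}
```

On these toy figures large/small passes while rapid/slow fails the reciprocity test, matching the N in the right-most column of Table 4.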
While one semantic dimension was found to include as many as four canonical pairs, another was found to include none at all. Of the pairs identified, laparoscopic/open is the most unexpected. This pair, whose members refer to alternative forms of surgery, is not one that would be volunteered by participants in psycholinguistic experiments, nor could it be considered a high-frequency pair (laparoscopic occurs sixty times in the 100-million-word British National Corpus, only three of which have open within ±9 words). Despite this, web data show that the two words co-occur across a broad spectrum of antonymic frames and therefore meet the set criteria for canonicity.

Table 5. Pairs meeting canonicity criteria in Jones et al. (2007) study

Scale               Canonical pair(s)
beauty              beautiful/ugly
wealth              poor/rich, poor/wealthy
openness            closed/open, laparoscopic/open
size                large/small, big/small, big/little
speed               fast/slow
interestingness     – no canonical pairs identified –
strength            strong/weak
width               narrow/wide, broad/narrow
thickness/fatness   thick/thin, fat/thin, fat/skinny, fat/lean
luminosity          bright/dull, bright/dim, bright/dark
4. Ancillary Antonymy: A complementary approach
The success of the research reported above in identifying canonical antonym pairs and measuring their relative strength of association opens up further opportunities for web-as-corpus approaches to the lexicon. So far, frames associated with several antonym categories (primarily the coordinated function, but also negated, distinguished, transitional, etc.) have been used to identify pairs that occupy the X- and Y-positions with greatest fidelity. However, one discourse function not yet used for this purpose is Ancillary Antonymy. The main reason that Ancillary Antonymy has been thus far overlooked is that prototypical ancillary frames are much thinner on the ground. Whereas, say, about 6.7% of all Coordinated Antonymy contexts make use of a both X and Y frame (Jones 2002: 73), no single pattern of co-occurrence accounts for anything like such a large proportion of all Ancillary Antonymy contexts. However, to ignore the category entirely in terms of antonym retrieval would be an oversight. Ancillary Antonymy accounts for the largest proportion of antonym usage, and all contexts have a ready-made slot into which contrast items may be inserted: the B-pair slot. The suggested methodology for using ancillary contexts to investigate the antonym relation differs from that used by Jones et al. (2007) because the only way to generalise about ancillary usage is to examine the distribution of individual antonym pairs that favour particular frames and quantify the B-pairs most commonly associated with them. To give an example of this, when long/short is used as an ancillary pair, it frequently makes use of the frame long on X, short on Y. Indeed, within the one-million-word Time Magazine corpus, this construction appears over 150 times, a sample of which is provided below.

which was short on strategic reconnaissance, long on guns. But it remained for the late,
a war diet long on vegetables, short on meat. Snorted Vegetarian George Bernard Shaw, 84:
diminutive Bonci was long on technique, short on volume, made up in lyrical effect what he
But many a buyer found it short on fun, however long on function. Trouble was-and still is-that
County, in the high hills of southern Tennessee, is short on doctors but long on religion. At
Marmon found itself long on tone but short on business, and after an abortive attempt at
if the performers are short on years, they are pleasantly long on talent, and under the shrewd
struts along Hollywood Boulevard in his bare feet, is short on cash but long on izzat. The word
respected Mrs. Earlie. A first-rate story, it is short on blood, long on plot and psychology.
Gallery is long on portraiture, short on Spanish art, entirely lacking in important French
sold fewer than 300 airplanes. He was long on reputation, short on orders and cash. But he
the companies were short on working capital and long on excess-profits taxes. But in the
fight”). While most of them, short on schooling but long on practical experience, could not
this week for a session that promised to be short on legislation and long on oratory
was short on denominationalism, long on the love of God. He walked up & down the hills,
whether Britain, long on plans but short on Dominion support, would buy that compromise-or
buy that compromise-or the U.S., long on plans and short on policy, suggest it-remained
know the value of Team Work” and Mae, “short on Intelligence but long on Shape.” Louella
them up in the chatty, offhand Collier’s style, long on anecdote and short on big, dull facts.’
Saskatchewan, short on tourist attractions and long on bad roads, was the exception.
Lysenko is short on controlled experimentation and long on thundering Marxist phrases
Garfield) of New York’s Lower East Side, short on money and long on push, graduates from
novel about an Indiana backwoods boyhood, is short on realism but long on entertainment.
U.S. gallerygoers would find the paintings short on skill, long on human interest. Many of
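A B-pair extractor for this frame can be approximated with a regular expression run over concordance lines like those above. This is a rough sketch of my own (the pattern is not part of the original study): it stops each slot at punctuation or at the connectives seen in the sample, so coordinated multi-word fillers containing "and" will be truncated.

```python
import re

# One slot of the ancillary frame: "long on X" or "short on Y".
SLOT = re.compile(
    r"\b(long|short) on ([\w'-]+(?: [\w'-]+)*?)"
    r"(?=[,.;:]|\s+(?:but|and|however)\b|$)",
    re.IGNORECASE)

def b_pair(line):
    """Return the (long-slot, short-slot) filler pair coerced into
    opposition, or None if the line lacks both halves of the frame."""
    slots = {}
    for pole, filler in SLOT.findall(line):
        slots.setdefault(pole.lower(), filler)
    if {"long", "short"} <= slots.keys():
        return slots["long"], slots["short"]
    return None
```

Run on the second sample line above, for instance, `b_pair` yields the coerced pair (vegetables, meat).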
When arguing that Construction Grammar “provides the potential to bridge the gap between the syntagmatic and the paradigmatic” (2006: 12), Murphy turns to web-as-corpus methods and cites examples of Coordinated Antonymy such as “… the deprived Irish working class, orange and green alike”. Her argument is that the frame (X and Y alike) is itself a construction because “a kind of relation coercion” (2006: 5) is exerted on the word-pair therein (in this case, orange and green). The concordance above also demonstrates ‘relational coercion’ in action. Here, the construction (long on X, short on Y) has greater lexical content than X and Y alike, but the principle is the same. The B-pairs, comprised of words and phrases that would not usually be considered contrastive, are ‘coerced’ into opposition by their context. Though a small number of the B-pairs recorded above do hold some degree of innate opposition (e.g. vegetables/meat), most are relying entirely on their neighbouring A-pair, in conjunction with a powerful lexico-syntactic construction, to activate their opposition (doctors/religion, tone/business, cash/izzat, etc.). Note that the frame is equally productive in either sequence (with long on preceding short on or vice versa) and the distribution is almost identical. This confirms that something more than idiomaticity is at work here (because an idiomatic phrase is usually regarded as fixed in its word order) and suggests that Construction Grammar may indeed offer the most suitable explanation. Ultimately, it is lexical and syntactic parallelism that facilitates the B-pair opposition in ancillary constructions (see Jones 2002: 56–57), and parallelism itself can be accommodated within a Construction Grammar approach. “Since parallelism
involves a certain kind of form associated with a certain kind of meaning,” Murphy argues, “it should also be treated as a construction” (2006: 18). To return to the issue of canonicity and the process of antonym retrieval, it is now possible to prime the long on X, short on Y frame with one B-pair member in order to discover which words are presented in opposition against it most extensively. For example, one of the words to appear regularly in this construction is facts; if facts is inserted into X- and Y-position in the long on X, short on Y construction, enough results are generated by web searches for its antonym profile to begin emerging. Table 6 shows the aggregated scores for such searches, together with an internet example for each of the top ten items retrieved.

Table 6. Antonyms retrieved on ten occasions or more in searches on facts

Noun retrieved (frequency)   Web example

opinion (49)
(21) It seems most lefties whether its Kyoto, terrorism, taxes or childcare … are usually short on facts, long on opinion and late to the debate!!! (http://johnlennard.blogspot.com/2006/08/my-comments-sectionmakes-me-laugh_21.html)
emotion (33)
(22) Trial lawyers and big companies apparently learned one lesson from the tort reform battles in Washington: To sell your message, go short on facts, long on emotion. (http://www.highbeam.com/doc/1P2-802201.html)
rhetoric (31)
(23) The Tory orchestrated “meet and greet” was designed to woo the local business community with promises of opportunity. The meeting was short on facts, long on rhetoric and lacking any substance whatsoever. (http://www.opseu.org/ops/ministry/locktalk/locktalkmay18.htm)
speculation (19)
(24) This alternate history is short on facts, long on speculation, and like all conspiracy theories, fits the chaotic events of many decades and various leaders into a cohesive whole, as if life were merely a script written by ‘them’, whoever they are. (http://informedspeculation. com/2005/01/31/uncle-noams-story-time-hour/)
conjecture (15)
(25) The explanation that followed was short on facts and long on conjecture – a state the county contributed to because of our reluctance to discuss active personnel matters. (http://wweek.com/html/letters121599.html)
Note: All Google searches and counts were conducted on 4th January 2009.
innuendo (15)
(26) This “commentary” is nothing more than piss poor muckraking short on facts; long on innuendo and does nothing more than impugn the reputations of some of Key West’s most competent and ethical lawyers. (http://www.kwtn-blue.com/2008/04/page-one-comm-2.html)
anecdotes (12)
(27) We have some elegant theories, some simple theories and some simplistic theories. We’re short on facts and long on anecdotes. (http://openparachute.wordpress.com/2008/03/28/the-real-climate change-swindle/)
hyperbole (11)
(28) You said THIS study was crooked. And the question was about evidence about THIS study; you gave a general opinion piece short on facts, long on hyperbole. Nothing in the link says anything about THIS study. (http://www.curezone.com/forums/am.asp?i=1236493)
spin (10)
(29) We have a nifty 15minute program here called media watch. It lampoons and exposes the Medias often flagrantly lazy, ‘Google’ cut and paste style of Journalism short on facts, long on spin. (http://mayfairplace.blogspot.com/2007/01/what-to-believe.html)
fiction (10)
(30) No good bunch of hollywood liberals who are using moores short on facts long on fiction BOWLING FOR COLUMBINE is true that the accademy awards is given for political reasons and not artistic merrit. (http://rightvoices.com/2003/06/04/exercising-the-second amendment/)
Note that although the frame was primed with facts in all four possible permutations (“short on facts long on *”; “short on * long on facts”; “long on facts short on *”; “long on * short on facts”), the most fertile of these searches was, by some margin, the first: “short on facts long on *”. When the results are totalled, it becomes clear that facts contrasts with a wide variety of nouns in this context. However, some patterns are still apparent and some semantic observations therefore possible. For example, almost all of the retrieved items tend towards the subjective rather than the objective, with those retrieved most frequently relating to individual viewpoint (opinion) or guesswork (speculation, conjecture), and others suggestive of exaggeration or bias (emotion, hyperbole, innuendo). This experiment can, of course, be replicated with other words inserted in X- and Y-position. Pilot studies suggest that fertile seed words may include brains (retrieving guts, muscle, balls, charm and ambition at relatively high rates) and cash (retrieving promise, ideas, class and potential). In unprimed searches for long on X, short on Y, the word pairs most commonly identified include talent and experience, diagnosis and cure, (good) intentions and delivery, and talk and action.
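Generating the four primed permutations is mechanical, and the same template works for any two-slot frame and seed word. A small sketch with a hypothetical helper (not taken from the study):

```python
def primed_queries(frame, word):
    """Build the four quoted wildcard queries that prime a two-slot
    frame with one word, as described above for `facts`.

    frame: a pair of slot templates, e.g. ("long on {}", "short on {}").
    """
    a, b = frame
    return ['"%s %s"' % (a.format(word), b.format("*")),
            '"%s %s"' % (a.format("*"), b.format(word)),
            '"%s %s"' % (b.format(word), a.format("*")),
            '"%s %s"' % (b.format("*"), a.format(word))]

queries = primed_queries(("long on {}", "short on {}"), "facts")
```

The same helper covers other ancillary frames, e.g. `primed_queries(("good for {}", "bad for {}"), "business")`.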
These findings are based on one ancillary construction only. However, the potential for other frames and pairs to be exploited is great. Among the Ancillary Antonymy constructions that might be appropriate for web searches are X is temporary, Y is permanent and good for X, bad for Y. Initial explorations suggest that the first construction contrasts form with class most frequently, but also couples pain with both victory and death at high rates. The second construction yields a broader range of results: for example, what is good for the economy is often bad for the environment; what is good for consumers is bad for business.
5. Research limitations and conclusions
Investigations of this nature are fraught with procedural hazards, primarily because the internet is not a corpus (i.e. a structured collection of text specifically compiled for linguistic analysis) and cannot therefore claim to be representative of language in general. Because the web is not built on a fixed sociolinguistic foundation, consistency is a problem (e.g. American vs. British English), as is text duplication (song lyrics, political speeches, etc.). Furthermore, Google is very limited as a corpus-searching and concordancing tool: web-pages are not selected randomly (see Ciaramita and Baroni 2006: 145); frequency counts are inaccurate; pages from the same site are often retrieved by a single search; wildcard searches find examples of multi-word phrases in the * position as well as single-word items; and, regardless of how many results search engines claim to find, fewer than 1,000 are ever actually retrieved. In some ways, however, Google’s crude searching and concordancing style is less of an inconvenience for studies of this nature than for those focussed on more traditional content analysis (see also Robb 2003). The major advantage of using the web as a corpus is, of course, its size. To give one example of this, recall that none of the searches in which the long on X, short on Y frame was primed with facts retrieved more than 50 contexts in total. Using conventional corpora to investigate such low-frequency word-strings is pointless; even the internet only just provides enough data to make distributions meaningful. Admittedly, as some of the examples in Table 6 show, web data can be very mixed in terms of its spelling, grammar and punctuation conventions.

Note: Criteria that Kennedy (1998: 3), among others, applies in his definition of corpora.
Note: “Google is a poor concordancer. It provides only limited context for results of queries, cannot be used for linguistically complex queries, such as searching for lemmas (as opposed to word forms), restricting the POS or specifying the distance between components in the query in less than crude ways” (Sharoff 2006: 64).

However, it also
provides a ‘democratic’ representation of contemporary language (Santini 2005), is comprised of the most current data available, and allows very low-frequency linguistic phenomena to be investigated for the first time.10

In terms of the methodology used, the research presented in this paper barely scrapes the surface of textual opposition because so few frames are investigated. Nevertheless, little room is left to doubt that web-based antonym retrieval methods are sound. If we return to the original ten stimulus words and compare the textual (Googled) antonyms retrieved in the Jones et al. (2007) study with the intuitive (elicited) antonyms retrieved in the Paradis et al. (2006) study, we find that all but one of them generate an identical ‘best’ antonym.11 For example, if one asks informants to supply the ‘opposite’ of open, more say closed than any other word (80%); if one runs open through a range of lexico-syntactic frames associated with antonymy, closed is retrieved more than any other word (38%). Likewise, for large, elicitation methods identify small at a rate of 96% and web-as-corpus methods identify small at 79%. The only exception to the pattern is thin, which is intuitively paired with fat by informants, but more likely to be paired with thick in text. This anomaly may expose a disconnect between informants’ meta-linguistic self-awareness, on the one hand, and their usage patterns, on the other, or it may reflect an exceptional feature of that particular scale.

Theoretical advances are also made possible using web-as-corpus methods, including some in relation to the paradigmatic/syntagmatic divide. As Storjohann notes in relation to the co-occurrence of co-hyponyms in German, this “conventional distinction is not justified because paradigmatic relations are contextually realised in systematic patterns or constructions” (2007: 11). In this paper, the syntagmatic realisation of antonyms has been found to supply further evidence (following Jones et al. 2007) that canonicity operates along a continuum, thus supporting the claims of Herrmann et al. (1979) and Murphy (2003) more than those of Gross et al. (1989) or Charles, Reed and Derryberry (1994). Such methods continue to allow a more empirical definition of antonym canonicity to emerge; one that moves beyond collocational and experimental criteria and, as Murphy (2006) suggests, is more compatible with a Construction Grammar approach: a canonical pair can be described as one that serves several recognised discourse functions of antonymy and, in doing so, occupies a wide range of recognised antonym frames; it is not necessarily one comprised of high-frequency, high-collocating lexical items.

10. Note that Kilgarriff and Grefenstette (2003), Fletcher (2004) and Sharoff (2006) all note elements of comparability between web data and traditional, purpose-built corpora.
11. The textual approach tends to record a lower proportion for its top antonym because a wider range of potential contrast items are retrieved; however, the ranking is generally comparable with the elicitation method.
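The elicited-versus-textual comparison reported above reduces to checking whether both methods rank the same word first. In the toy sketch below, the open and large percentages come from the studies cited in the text, while the thin figures are invented purely to reproduce the one reported mismatch.

```python
def top_antonym(distribution):
    """Best antonym = the highest-proportion candidate."""
    return max(distribution, key=distribution.get)

# open and large percentages are from the text; thin's are invented.
elicited = {"open": {"closed": 80.0},
            "large": {"small": 96.0},
            "thin": {"fat": 60.0, "thick": 25.0}}
textual = {"open": {"closed": 38.0},
           "large": {"small": 79.0},
           "thin": {"thick": 34.0, "fat": 9.0}}

mismatches = [w for w in elicited
              if top_antonym(elicited[w]) != top_antonym(textual[w])]
```

Note that the textual percentages run lower across the board, as footnote 11 explains; only the ranking is compared.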
Acknowledgements

Parts of this research were supported by the British Academy (Project Title – ‘Antonymy: A cross-linguistic study of canonicity, co-occurrence and context’; Principal Investigator – M. Lynne Murphy). We are grateful to Johan Dahl (Lund) for development of Python search software, and to Lisa Persson (Göteborg/Sussex) for assistance with the searches and creation of the search database.
References

Charles, Walter G., Reed, Marjorie A. and Derryberry, Douglas. 1994. “Conceptual and associative processing in antonymy and synonymy.” Applied Psycholinguistics 15: 329–354.
Ciaramita, Massimiliano and Baroni, Marco. 2006. “Measuring Web Corpus Randomness: A progress report.” In Wacky! Working Papers on the Web as Corpus, Marco Baroni and Silvia Bernardini (eds), 127–158. Bologna: GEDIT.
Clark, Herbert H. 1970. “Word Associations and Linguistic Theories.” In New Horizons in Linguistics, John Lyons (ed.), 271–286. Baltimore: Penguin.
Cruse, D. Alan. 1986. Lexical Semantics. Cambridge: Cambridge University Press.
Davies, Matthew. 2007. “The Attraction of Opposites: The ideological function of conventional and created oppositions in the construction of in-groups and out-groups in news texts.” In Stylistics and Social Cognition, Lesley Jeffries, Dan McIntyre and Derek Bousfield (eds), 79–100. Amsterdam/New York: Rodopi.
Deese, James. 1965. The Structure of Associations in Language and Thought. Baltimore: Johns Hopkins University Press.
Fellbaum, Christiane. 1995. “Co-occurrence and antonymy.” International Journal of Lexicography 8: 281–303.
Fletcher, William. 2004. “Making the Web more Useful as a Source for Linguistic Corpora.” In Corpus Linguistics in North America, Thomas Upton (ed.), 191–205. Amsterdam/New York: Rodopi.
Gross, Derek, Fischer, Ute and Miller, George A. 1989. “The organization of adjectival meanings.” Journal of Memory and Language 28: 92–106.
Herrmann, Douglas J., Chaffin, Roger J. S., Conti, Gina, Peters, Donald and Robbins, Peter H. 1979. “Comprehension of antonymy and the generality of categorization models.” Journal of Experimental Psychology: Human Learning and Memory 5: 585–597.
Herrmann, Douglas J., Chaffin, Roger J. S., Daniel, M. P. and Wool, R. S. 1986. “The role of elements of relation definition in antonymy and synonym comprehension.” Zeitschrift für Psychologie 194: 133–153.
Jones, Steven. 2002. Antonymy: A corpus-based approach. London: Routledge.
Jones, Steven and Murphy, M. Lynne. 2005. “Using Corpora to Investigate Antonym Acquisition.” International Journal of Corpus Linguistics 10/3: 401–422.
Jones, Steven. 2006. “A lexico-syntactic analysis of antonym co-occurrence in spoken English.” Text and Talk 26/2: 191–216.
Jones, Steven, Paradis, Carita, Murphy, M. Lynne and Willners, Caroline. 2007. “Googling for Opposites: A web-based study of antonym canonicity.” Corpora 2/2: 129–155.
Justeson, John S. and Katz, Slava M. 1991. “Co-occurrences of antonymous adjectives and their contexts.” Computational Linguistics 17: 1–19.
Justeson, John S. and Katz, Slava M. 1992. “Redefining antonymy.” Literary and Linguistic Computing 7: 176–184.
Kennedy, Graeme. 1998. An Introduction to Corpus Linguistics. London: Longman.
Kilgarriff, Adam and Grefenstette, Gregory. 2003. “Introduction to the special issue on the Web as corpus.” Computational Linguistics 29/3: 333–347.
Mettinger, Arthur. 1994. Aspects of Semantic Opposition in English. Oxford: Clarendon.
Muehleisen, Victoria and Isono, Maho. 2009. “Antonymous adjectives in Japanese discourse.” Journal of Pragmatics 41/11: 2185–2203.
Murphy, M. Lynne. 2003. Semantic Relations and the Lexicon. Cambridge: Cambridge University Press.
Murphy, M. Lynne. 2006. “Antonyms as lexical constructions: Or, why paradigmatic construction isn’t an oxymoron.” In Constructions All Over: Case studies and theoretical implications, Doris Schönefeld (ed.), Special volume of Constructions, SV1-8/2006. http://www.constructions-online.de/
Murphy, M. Lynne and Jones, Steven. 2008. “Antonyms in Children’s and Child-directed Speech.” First Language 28/4: 403–430.
Murphy, M. Lynne, Paradis, Carita, Willners, Caroline and Jones, Steven. 2009. “Discourse functions of antonymy: A cross-linguistic investigation of Swedish and English.” Journal of Pragmatics 41/11: 2159–2184.
Paradis, Carita, Willners, Caroline, Löhndorf, Simone and Murphy, M. Lynne. 2006. “Quantifying aspects of antonym canonicity in English and Swedish: Textual and experimental.” Presented at Quantitative Investigations in Theoretical Linguistics 2, Osnabrück, Germany, 1–2 June.
Robb, Thomas. 2003. “Google as a quick ’n’ dirty corpus tool.” Teaching English as a Second or Foreign Language 7/2. http://www-writing.berkeley.edu/TESL-EJ/ej26/int.html
Santini, Marina. 2005. “An exploratory study of Web pages using cluster analysis.” In Proceedings of the 8th Annual Colloquium for the UK Special Interest Group for Computational Linguistics (CLUK 2005). http://www.nltg.brighton.ac.uk/home/Marina.Santini/ITRI-05-01.pdf
Sharoff, Serge. 2006. “Creating General-Purpose Corpora Using Automatic Search Engine Queries.” In Wacky! Working Papers on the Web as Corpus, Marco Baroni and Silvia Bernardini (eds). Bologna: GEDIT. http://wackybook.sslmit.unibo.it/pdfs/sharoff.pdf
Storjohann, Petra. 2007. “Incompatibility: A no-sense relation?” In Proceedings of the 4th Corpus Linguistics Conference CL2007, Matthew Davies, Paul Rayson, Susan Hunston and Pernilla Danielsson (eds). Birmingham: University of Birmingham. http://corpus.bham.ac.uk/corplingproceedings07/paper/36Paper.pdf
Synonyms in corpus texts
Conceptualisation and construction

Petra Storjohann
Conventional descriptions of synonymous items often concentrate on common semantic traits and the degree of semantic overlap they exhibit. Their aim is to offer classifications of synonymy rather than to elucidate ways of establishing contextual meaning equivalence and the cognitive prerequisites for doing so. Generally, they lack explanations as to how synonymy is construed in actual language use. This paper investigates principles and cognitive devices of synonymy construction as they appear in corpus data, and focuses on questions of how meaning equivalence might be conceptualised by speakers.
1. Introduction
The subject of antonymy, particularly English antonymy, has received renewed attention from cognitive (e.g. Cruse/Togia 1995; Croft/Cruse 2004) and psycholinguistic approaches (e.g. Murphy 2003), and from a methodological point of view, within corpus-oriented frameworks (e.g. Jones 2002; Murphy 2006). The field of synonymy, however, has somehow lost its attraction for researchers in both English and German linguistics. Synonymy has predominantly been interpreted simply as ‘sameness of meaning’, which has made this relation seem rather uninteresting. As a consequence, within more recent semantic frameworks, research on synonymy lacks foundational hypotheses, with the result that, as Cruse points out, “much research remains to be done in the field of synonymy” (Cruse 2004: 157). Psycholinguistic approaches (cf. Murphy 2003) and corpus studies of concrete examples of English synonymy (e.g. Partington 1998) have led to a slight shift of focus and provided valuable insights into the actual use of meaning equivalents, by concentrating on differences instead of semantic similarities. However, no comprehensive corpus studies have so far been conducted which aim to address the question of how synonyms are conceptualised and constructed in actual discourse. This paper examines German synonyms in language use
by investigating some of the specific cognitive principles and devices behind the construction of this meaning relation. Starting from the textual level, it attempts to look at this lexical-semantic relation by examining its constructional basis in corpus material.
1.1 Preliminaries
First, certain preliminaries, mainly concerning definitions of synonymy, need to be outlined. Exploring the phenomenon of synonymy has a long tradition in linguistics. As a result, there are a number of definitions and classifications of synonymy, most of them with a structuralist imprint, explaining the different types or degrees of synonymy. Many of these definitions refer to the degree of semantic or pragmatic similarity and overlap or to the differences between lexemes. Quite often, synonyms are still considered to be words with sets of features that are formalised and attributed to logical relations, and most of these definitions were established at a time when linguists strove for a description of language as a system and attempted to show how vocabulary might be structured. Classes such as complete, absolute or total synonymy, propositional, cognitive, descriptive, partial and near-synonymy, and so on, have for example been provided by Lyons (1968, 1977), Lutzeier (1981) and Cruse (1986), who have categorised synonyms according to their theoretical differences. Even though it is now commonly agreed that the scale of synonymity is continuous (Cruse 2002: 488) and that a strict categorisation of types of synonymy is problematic because in many contexts it is difficult to determine the exact degree of dissimilarity, it is nonetheless mostly connotative features expressing distinctive stylistic, regional or other pragmatic traits which are accepted as discernible differences between lexical items signifying similar concepts. Traditional formal-semantic approaches which analyse synonyms in terms of the degree of overlap in their semantic features disregard the question of whether two words which are theoretically interchangeable by virtue of their meaning are in fact ever used as synonyms in discourse. As Murphy (2003: 168) remarks, while identity of denotative meaning is a logically possible relation, it is anathema to natural language.
Logical-formal, denotative and connotative features were long considered tractable in terms of set-theoretical models. But such models do not accommodate sense-relational variability and flexibility, and they lack explanations and principles concerning the construction of synonymous contexts. Conventionally, lexical-semantic relations have not been examined on the basis of patterns that emerge
Synonyms in corpus texts
from language use, and traditional classifications have not attempted to provide plausible explanations as to how synonymy is established in communicational situations.
1.2 Synonymy: A usage-based approach
The development of corpus-guided and psycholinguistic studies and of recent cognitive perspectives in semantics has led to a focus on real situations, on language in use, on empirical data and on models which entail issues of mental representation, conceptualisation, embodied experience and perception. In this paper, the focus is on some specific, albeit frequent, cases of synonymy where certain cognitive aspects become apparent in the contextual environments of corpus texts. The underlying corpus comprises mainly journalistic texts from the archive of corpora of contemporary written German at the Institut für Deutsche Sprache, Mannheim (see http://www.ids-mannheim.de/kl/projekte/korpora/), with a size of about 3.4 million tokens. Essentially, this comprehensive database contains newspapers exemplifying German in general written public discourse. In the following discussion of meaning equivalence, existing definitions of synonymy will not serve as a research tool for this investigation. Definitions such as:

X is a cognitive synonym of Y if X and Y are syntactically identical, and any declarative grammatical sentence S containing X has equivalent truth-conditions to another sentence S1, which is identical to S except that X is replaced by Y. (Cruse 1986: 88)

are not useful guides, either for the detection of synonymous contexts or for a possible explanation of how synonymy is established by speakers. This paper is not concerned with denotational or connotational features, propositional or expressive properties, logical-formal criteria, specific truth-conditional requirements, analytical implication, or even syntactic identity. This investigation of synonymy does not concentrate on aspects of semantic similarity and finer shades of feature differences between two lexical items or two lexical units and their senses or, alternatively, between construals. German synonyms as listed in the traditional literature, such as Stockwerk – Etage (floor), or Geld – Moos – Kies – Moneten (money) (cf. Bußmann 2002), where typically stylistic, regional or other pragmatic characteristics are distinguished, are not the focus of this analysis. Neither does this article look at discriminating collocational differences between items such as bekommen and erhalten (receive – get) or anfangen and beginnen (start – begin),
an area where corpus studies have provided useful insights into differences in the usage of near-synonyms, particularly for English (cf. for English examples Kennedy 1991, Partington 1998 and Taylor 2003). All of these aspects have played an essential role in categorising synonyms according to denotation, expressive traits or their collocational profiles. On the basis of the examples mentioned above, synonymy can be grouped into different types and generally be attributed to lexical relations. But some synonymous relations cannot be assigned to one specific class, because existing classifications do not consider that, in addition to lexical aspects, there are also conceptual processes and operations contributing to the discursive construction or comprehension of meaning equivalence. These are the issues which will be illuminated in this paper, namely types of meaning equivalence where a specific entailment is contextually created, where synonymy is realised as a conceptual relation, where language reflects conceptual organisation, patterns of thought, experience and a projected reality, and where speakers draw upon a repository of knowledge. In the following, synonymy is taken to be a conceptual relation that represents experience, in which similarity of concepts is conventionally encoded and externalised by linguistic patterns. In this paper, a rather broad, pragmatic but context-sensitive stance on synonymy is taken, in common with Murphy (2003). Two lexical items are taken to be synonyms “as long as their differences are slight enough that, in context, the two words’ meanings contribute the same context-relevant information” (Murphy 2003: 150). She continues as follows:

what actually counts as synonymous is constrained by the demands of communicative language use and the context in which this language use occurs. (Murphy 2003: 168)

In the following, a pragmatic perspective refers to synonymy as a relation that is established by contextual usage following certain communicative and situational rules and constraints, and it comprises both conceptual and lexical aspects. As a result, a relaxed view of the requirement of matching grammatical categories is adopted, provided that the two synonymous items in question perform the same semantic function. As Murphy (2003) demonstrates through experiments, syntactic criteria are largely ignored in judgements of synonymy. Here, the view is shared that lexicalisations of equivalent or identical concepts are not synonyms by virtue of their semantics, but that they might have the potential to be used identically in certain contexts. The following sections address the questions of how speakers actually use two items synonymously, what type of mental processes are involved, and what kind of shared knowledge is necessary for this.
2. Synonymy and conceptualisation
Speakers have an intuition for the use of synonyms in context. This intuition is based on specific linguistic and non-linguistic knowledge, experience and perception, which are activated in discourse as appropriate. In situations of language use, communicative, discursive, linguistic and metalinguistic knowledge is used in order to create relations between entities, and thus between their lexicalisations, according to the speaker’s intention and the communicative situation. Just like any other semantic relation, synonymy is interpretable on the basis of beliefs, experience, traditions and everyday conventions. As a result, flexible and dynamic contextual construction of synonymy can be observed in actual language use. With respect to opposition, this is best illustrated by Cruse’s (1986: 198) example of coffee – tea, which are perceived as opposites since they are often offered as two options and alternatives to each other. Such opposition is based on prototypical situations, not on binary contrast, meaning that there is no inherent opposition; it is, as Cruse points out, a “contingent fact about the world” (Cruse in Croft/Cruse 2004: 165). (In contrast to relations of opposition, the relation of synonymy receives only little attention and embedding in his theoretical cognitive model of construals.) Such facts about the world are necessary knowledge in order to place any two concepts into any kind of relation. The same holds for synonymy: a concept must be used in such a way that it is similar enough to a semantically close concept expressed by another term for that term to be substituted for it, which is after all one of the intuitive prerequisites of a synonymous relation. Through the analysis of discourse- and situation-sensitive use of synonyms in corpus data, we can see what kind of knowledge is actually involved in the construction of equivalent contexts and how they are linguistically structured and realised. As an example, corpus texts (1) and (2) demonstrate how, with respect to the domain ‘food’, speakers know that a person’s diet needs to be varied in order to be characterised as healthy.

(1) Viele Tumorarten scheinen ebenfalls sogenannte Zivilisationskrankheiten zu sein, aus einer Mehrzahl von Ursachen erwachsen, wobei man freilich mit etwas kargerer Lebensweise, mit gesunder, sprich abwechslungsreicher Ernährung mit wenig Fetten, viel Ballaststoffen, als Empfehlung höchstens jene erreicht, deren Bikinifigur in Frage gestellt oder das Gürtelschloß nicht mehr weitergestellt werden kann. (Salzburger Nachrichten, 14.06.1991.)

(2) Durch abwechslungsreiches Essen werde das Immunsystem gestärkt. Eine gesunde Ernährung sei außerdem die beste Krebsvorsorge. (die tageszeitung, 19.01.1996.)
In a pattern such as gesund, sprich abwechslungsreich (healthy, which means varied), an implication based on some “contingent fact about the world” is made explicit. It is not claimed here that this is an example of synonymy. In citation (2), however, this type of information is left implicit by speakers, and a meaning-equivalent context is created on the basis of the assumption that this implication is shared knowledge. Similar evidence is found in cases (3)–(6), where similar constructions and semantic relations are created for neutral – unabhängig (neutral – independent) and dauerhaft – nachhaltig (lasting – sustaining/sustainable).

(3) “Warum nicht ein Mitglied mehr in die Hafenkommission wählen?” Diese Frage, von Claudius Graf an der SP-Versammlung vom Donnerstagabend gestellt, soll an der kommenden Gemeindeversammlung vom Stadtrat beantwortet werden. […] Die SP Arbon stellt sich als weiteres Mitglied eine neutrale, also von Behörde oder Interessengruppen unabhängige Person vor. (St. Galler Tagblatt, 04.09.1999; Mehr Leute für Hafen.)

(4) “Was fehlt, ist ein einheitliches, allgemein anerkanntes und unabhängiges Qualitätssiegel”, urteilt Ulrike Pilz-Kusch, die im Auftrag der Verbraucherzentrale Nordrhein-Westfalen den Ratgeber “Gesucht: Wellness” geschrieben hat. Jedes Gütesiegel lege derzeit auf unterschiedliche Dinge wert. Auch seien neutrale Prüfer die Ausnahme. Was fehlt, ist eine klare Definition von Wellness und Überprüfungen durch unabhängige Sachverständige. (die tageszeitung, 27.11.2004, S. 17, Qualitätssiegel.)

(5) Eine Gesellschaft kann sich nur dann dauerhaft und damit nachhaltig entwickeln, wenn sie ihre natürlichen Lebensgrundlagen erhält. (Frankfurter Rundschau, 22.10.1997, S. 6, Wie das Zukunftsmodell “nachhaltige Entwicklung” endlich in die Köpfe kommt.)

(6) Mit einer unbewaffneten “Armee” aus Freiwilligen wollen Friedensaktivisten den Friedensprozess in Sri Lanka unterstützen.
Von August an sollen speziell ausgebildete Helfer der Nonviolent Peaceforce den Frieden in der Krisenregion dauerhaft und mit gewaltlosen Mitteln sichern. “Ein nachhaltiger Frieden ist nur durch gewaltfreies Handeln zu schaffen”, sagte David Grant, […]. (Berliner Zeitung, 26.05.2003, Sri Lanka.)
In (3) and (5), structures such as neutral, also unabhängig (neutral, meaning an independent person) and dauerhaft und damit nachhaltig (lasting and therefore sustainable) suggest an implication or some sort of consequence, and the implication is made explicit by additional lexical emphasis. (The term implication does not refer here to its strict logical definition but to a general notion of semantic entailment where, conceptually, certain processes depend on one another or are entailed in one another in different ways, so that they are conceptually closely linked. It is not understood, for example, in the way in which it is defined by Lyons 1977.) This conceptual implication must be perceived as a conceptual equivalence or semantic closeness interpretable as a synonymous reading, as featured in (4) and (6), where the underlying relation between the two concepts is implied instead of being expressed overtly. Examples like these are not random or occasional findings in a corpus. They are regular attestations of situations which Cruse (2002: 486) might have had in mind when he noted that, at an intuitive level, we operate with a complex notion of synonymy, with different conceptions for different purposes. Corpus texts reveal how implicit background knowledge of certain processes and states is involved when relating two concepts to each other. Often, only a unilateral and not, as one might assume, a mutual conceptual implication is a sufficient criterion or prerequisite for the creation of contextual identity. Hence, the focus is on implication understood as semantic inclusion and associative conceptual derivation. For two lexical units X and Y, this means that the concept expressed by term Y is semantically included in, connected with, implicit in, associated with or concluded from, as well as conceptually derived from, the concept represented by X. Speakers perceive two concepts as mutually dependent in different ways. On first inspection, it appears that there is a range of conceptualisation processes speakers employ to construe conceptual entailment. These are:

– conceptualisation of cause-effect
– conceptualisation of conditionality
– conceptualisation of purpose-orientation
– conceptualisation of part-of-whole
– conceptualisation of superordination.
At this point, however, it is not claimed that this is a complete set of options, nor is it implied that these are strict categories with clear boundaries. On the contrary, it is often the case, as will be shown later, that synonymy between two concepts and their respective lexicalisations can be based on alternating conceptualisations. These conceptual operations are sometimes used to construe semantic inclusion and thus synonymy in a wider sense, and they will each now be explored in more detail.
2.1 Cause-effect conceptualisation
In examples (7) to (9), it is stated that something exists or is characterised in a specific way as a result of something else which has caused it.
(7) Nissan muss als erster großer Autohersteller seine Produktion wegen Stahlmangels zeitweise stoppen. […] Der Bedarf an Stahl sei wegen großer Nachfrage höher als vom Konzern zuvor erwartet. (Berliner Zeitung, 26.11.2004, Nissan stoppt Produktion wegen Stahlmangels, S. 15.) (8) Im Mikrokosmos der Mozartstadt ist es Verkehrsressortchef Johann Padutsch (BL), an dessen “Bosheit” die Einsatzfähigkeit des Autos leidet, des “vielleicht großartigsten Spielzeuges, das der Mensch je erfunden hat”. “Künstliche Staus, Verkehrsinseln, Ampeln, Radwege, Busspuren” – in der Welt des Erich Hüffel ein Sammelsurium von Unverschämtheiten und bodenlosen Frechheiten. Zumal völlig sinnlos, weil überflüssig. (Salzburger Nachrichten, 20.05.1995, Durch nichts zu erschüttern.) (9) Nach dem Abschluß eines Gerichtsverfahrens ist jede Kritik und das Geltendmachen von Beweisen nutzlos und deshalb auch sinnlos. (Neue Kronen-Zeitung, 24.01.1996, S. 52.)
In all of these texts, causality is explicitly marked. They state that the need for steel (Bedarf an Stahl) is higher due to a greater demand (große Nachfrage) for it, that artificial traffic jams, traffic islands, traffic lights, cycle paths and bus lanes are pointless (sinnlos) because they are unnecessary (überflüssig), and that looking for further evidence in closed lawsuits is useless (nutzlos) and therefore pointless (sinnlos). Causation is understood as a conceptual relation where two lexical items refer to concepts of cause and effect between specific events in an utterance. Cause and effect are linguistically stressed by corresponding markers such as wegen (due to), weil (because) and deshalb (therefore). Such relational terms specify, for example, that überflüssig expresses some sort of cause or reason which eventually leads to a state which is judged to be sinnlos, a term which signifies the resulting characteristics. Other patterns are X, daher Y (X and therefore Y) or X, und somit Y (X and hence Y), and these also function to direct attention to a causal-consecutive relation between the items in question. Turning to citations (10)–(12), we can see that the same lexical pairs exhibit equivalent reference and evoke the construction of a relation of meaning equivalence.

(10) [...] Konkret hat die Umfrage ergeben, dass in den Kantonen St. Gallen und Appenzell sowie dem Fürstentum Liechtenstein zurzeit 250 Informatiker fehlen. Eher klein ist die Nachfrage bei den Fachrichtungen Mediamatik und Projektleitung. Der grösste Bedarf besteht in den Fachrichtungen Applikationsentwicklung, Support und Systemtechnik. (St. Galler Tagblatt, 21.03.2001, Ansturm auf Informatikschulen.)
(11) Betrachtet man die Vergangenheit mit ihren rasanten Entwicklungen in der Gesellschaft und hauptsächlich in der Wirtschaft, ist wohl klar, dass Prognosen auf über 50 Jahre hinaus sehr gewagte, ja überflüssige Spekulationen sind. Solche sinnlosen Prognosen dienen weder der Wirtschaft, noch beeinflussen sie die gesellschaftlichen Entwicklungen. (Züricher Tagesanzeiger, 09.11.1996, S. 25, Empört über Simplifizierungen.) (12) Auch energiesparende Bluffs erfüllen also ihren Zweck, im Zweifel sogar mehrmals, wie die Wissenschaftler beobachten konnten. Und sie zeigen, wie es im Laufe der Zeit überhaupt zu nutzlosen Brautgeschenken kommen kann: Lassen sich die Weibchen auch von sinnlosen Gaben überzeugen, wird sich diese Methode nach und nach ausbreiten – nicht nur, weil sie für die Männchen bequemer, sondern vor allem, weil sie billiger ist. (spektrumdirekt, 10.01.2005, Anspruchslose Frauen.)
Semantic closeness between the items in question is not based on a large semantic overlap but rather on logical sequences of two events or states which are closely interrelated. Generally, these examples indicate that knowledge about cause and effect between two events is implicit knowledge that is shared, or assumed to be shared, by speakers. This knowledge is assumed to be interpretable and is hence used for producing synonymity. As Lyons (1968, 1977) and Cruse (1986) have noted for the case of synonymy, there is usually a two-way semantic inclusion. However, in a strict interpretation of cause-and-effect conceptualisation, it is believed that a meaning-equivalent context does not necessarily show this behaviour but rather depicts a unilateral entailment relation. Besides, in a number of cases, speakers possess knowledge of a close interconnection between cause and effect, but they cannot strictly assign cause and effect. This change of perspective is best illustrated by corpus contexts where the terms are interchanged in the following way:

sinnlos, weil nutzlos vs. nutzlos, weil sinnlos
überflüssig, weil sinnlos vs. sinnlos, weil überflüssig
The same interchangeability or indeterminacy of cause and effect is found between the lexemes Sicherheit and Geborgenheit, both of which are best translated into English as safety. In the corpus, there is sufficient evidence of their use as synonyms. But at the same time, Geborgenheit is the term which is used to express a state resulting from the state denoted by Sicherheit. However, speakers also express causality between the two notions the other way around. This is best illustrated by example (13), where we find Geborgenheit gibt Sicherheit, and example (14), where we have Sicherheit gibt Geborgenheit.
(13) Auch amerikanische Studenten, die in Heidelberg gerade ihr Austauschjahr beginnen, werden vom Deutsch-Amerikanischen Institut betreut. “Wir suchen für diese Studenten Heidelberger Familien, die die jungen Leute bei sich aufnehmen”. Viele könnten es nicht ertragen, jetzt alleine im Wohnheim zu leben. “Geborgenheit gibt ihnen ein Stück Sicherheit. (Mannheimer Morgen, 14.09.2001, Nachbarschaft.) (14) Man hat sich in letzter Zeit angewöhnt, Demokratie positiv zu bewerten und Diktatur negativ. Angesichts der von Diktatoren begangenen Greuel nur zu verständlich. Allerdings übersieht man dabei, dass eine Diktatur von der betroffenen Bevölkerung nicht nur negativ erlebt wird. Sie vermittelt den Eindruck der Sicherheit, der Verläßlichkeit und gibt so ein Gefühl der Geborgenheit, das viele in der auf der Selbständigkeit des Einzelnen beruhenden Demokratie vermissen. (Salzburger Nachrichten, 29.12.1999, Vernunft gegen Gefühl.)
The notions of ‘Sicherheit’ and ‘Geborgenheit’ are so closely intertwined that speakers are not always aware of the exact causal sequence. It is assumed that the more contexts there are with both options, the higher the likelihood that the two items will be semantically substitutable and hence used as synonyms. This means that where the details of causation become blurred, there is a higher chance that the terms will constitute a relation of synonymy. Presumably, in such cases there is quite systematic variation between contexts where the cause-effect concept is expressed explicitly and contexts where synonymous uses are attested. Generally, it is observed that the more precise the level of awareness and the greater the extent of knowledge about the logical relations, the less likely it is that the words involved will be used as synonyms. After all, there are contexts with a clear assignment of cause and effect where the terms are never used as meaning equivalents. However, these assumptions are based upon initial observations, and the claims need further investigation to be fully justified.
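The assumed correlation between blurred causal directionality and synonym use could in principle be checked quantitatively by counting how often each order of an explicit causal pattern is attested for a given pair. The following sketch is illustrative only and not part of the study reported here; the weil-pattern, the word pair and the sample text are assumptions made for the sake of the example:

```python
import re

def directional_counts(text, a, b, marker="weil"):
    """Count attestations of 'A, <marker> B' versus 'B, <marker> A'.

    A roughly balanced count would point to the indeterminacy of cause
    and effect discussed above; a one-sided count would point to a
    clearly assigned causal direction.
    """
    a_first = len(re.findall(rf"\b{a}, {marker} {b}\b", text))
    b_first = len(re.findall(rf"\b{b}, {marker} {a}\b", text))
    return a_first, b_first

# Hypothetical mini-corpus standing in for real KWIC lines
sample = ("Das Verfahren ist sinnlos, weil nutzlos. "
          "Weitere Beweise sind nutzlos, weil sinnlos. "
          "Die Schilder sind überflüssig, weil sinnlos.")

print(directional_counts(sample, "sinnlos", "nutzlos"))  # balanced: (1, 1)
```

On real corpus data, such counts would of course have to be normalised for frequency and inspected manually, since the surface pattern says nothing about whether the causal reading is actually intended.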
2.2 Conceptualising conditionality
Conditionality is another way of conceptualising semantic implication. One lexical item is used as a sort of antecedent in an utterance and another designates a dependent result. Again, knowledge of conditions and their results can be used to produce a synonymous reading. (15) Auch Günter Janz von der Liste Zivilcourage kann sich eine finanzielle Unterstützung der Gemeinde an einen Privaten vorstellen, aber “falls kein Wirt Interesse zeigt, besteht auch kein Bedarf”. (Kleine Zeitung, 12.03.1998, Kegelbahn als Politikum.)
In context (15), it is explicitly indicated that Interesse (interest) signifies something that is required before something designated by Bedarf (need) can exist. The difference between a causative and a conditional relation is usually indicated syntagmatically by the use of corresponding relational terms such as falls or wenn (if). When this relation turns into a synonymous context, the notion of the result is entailed in the concept of the condition. Contextual meaning equivalence between Bedarf and Interesse is illustrated by corpus sample (16).
As was pointed out for the case of causality, corpus texts exhibit a different degree of awareness of logical relations. In some cases, the notion of condition and its consequence is clearly lexicalised and linguistically expressed, and the items involved can never be used to establish a synonymous reading. In other cases, a regular variation between lexical representations of condition and result can be observed, indicating that speakers have some knowledge of conceptual closeness. However, what the necessary requirement is and what the consequence is cannot be established. These findings confirm that the implicational directionality is judged differently by speakers. (17) Wie angetönt werden gesetzliche Bestimmungen nur dann mitgetragen und befolgt, wenn deren Sinn erkannt und die Notwendigkeit akzeptiert wird. (St. Galler Tagblatt, 09.02.2001.)
Recognising the Sinn (sense) is understood here as something that is required in order to be able subsequently to accept the Notwendigkeit (necessity). The temporally successive relationship between the two states is stressed by their corresponding verbs and also by the order in which they appear in the coordinating structure. Alternatively, the conditionality can be turned around, in which case Notwendigkeit is a requirement for a state which is denoted by Sinn, as shown in context (18).

(18) Zu seinem Vorschlag befragten wir die Theaterleiter in Heidelberg, Ludwigshafen und Mannheim sowie die künftige Generalintendantin des Nationaltheaters, Regula Gerber, die Voscheraus Vorschlag “verführerisch” nennt, weil Mannheim ja von einer solchen Zentralisierung profitieren würde. Doch die Theater in Ludwigshafen und Heidelberg dürften dafür nicht geopfert werden. Für Gerber hat das Projekt höchstens dann Sinn, wenn sich daraus eine künstlerische Notwendigkeit ergibt, verbunden mit einer zentralen Intendanz. (Mannheimer Morgen, 06.05.2004, Warnung vor kultureller Verarmung in der Region.)
As was claimed earlier, terms exhibiting a close causative relation with systematic variation between cause and effect are more likely to turn into synonyms. The same holds for conditional relations. Again, the tentative conclusion is that the less straightforward the actual relation is, and the less aware speakers are of the exact circumstances of condition and result, the higher the probability that the items in question will be used equivalently. Sinn and Notwendigkeit are an example found in the corpus where there is regular variation between lexicalisations of condition and consequence on the basis of such indeterminacy. This demonstrates that speakers are quite aware of a close interrelation between the two notions, knowledge which is used to place the terms in a synonymous relation, as for instance in corpus citation (19), where identical reference (see the underlined terms) further substantiates the relation:

(19) So erfuhren die Grünen von der Verschiebung der Rentenbeitragssenkung um 0,8 Prozent vom 1. Januar auf den 1. April aus des Kanzlers Mund in einer Aktuellen Stunde des Bundestags am vergangenen Donnerstag. Eilends stellten sie per telefonischer Intervention sicher, daß dann auch die zur Gegenfinanzierung gedachte Ökosteuer erst am 1. April 1999 kommt. Diesen Termin hatten die Grünen für “ihr Baby”, die schrittweise Anhebung der Energiepreise zur Senkung der Lohnnebenkosten, bereits in den Koalitionsverhandlungen bevorzugt. Sie wollten durch gründliche Beratung der Ökosteuer peinliche Rechtspannen vermeiden. Zu ihrer gelinden Überraschung mußten die Grünen dann vor drei Wochen anläßlich einer Recherche zur Steuerreform feststellen, daß Oskar Lafontaines Finanzministerium die Ökosteuer für den 1. Januar vorbereitete. Wenn es noch eines Belegs für den Sinn besserer Absprachen zwischen SPD und Grünen bedurft hätte – das Info-Chaos am gestrigen Tag um die Beendigung des rot-grünen Durcheinanders wäre mehr als ausreichend dafür.
Während Grünen-Vorstandssprecherin Gunda Röstel bereits regelmäßige Treffen des Koalitionsausschusses, beginnend am kommenden Montag, ankündigte, bestritten Regierungssprecher Uwe-Karsten Heye und SPD-Fraktionschef Peter Struck noch die Notwendigkeit eines institutionalisierten Informationsaustausches. (Berliner Morgenpost, 24.11.1998, S. 3, Schluß mit den Chaos-Tagen.)
On the other hand, clearly indisputable conditional relations have a lower chance of turning into a meaning-equivalent relation in other contexts. Again, this correlation is an assumption based on the investigation of a few words, and it does need further consolidation.
2.3 Conceptualisation of purpose/goal-orientation
The third type of synonym relation which is frequently attested is based on a conceptualisation where a lexical item X maps onto a process with an immanent goal or purpose that is explicitly expressed by another term Y. Context (20) features one such purpose-orientation.

(20) Der US-Senat verabschiedet eine von Republikanern und Demokraten unterstützte Kürzung der Sozialhilfe um 70 Milliarden Dollar. […] Die Garantie, wonach jeder Amerikaner ein Anrecht auf staatliche Hilfe zum Schutz vor Armut und Hunger hat, soll nicht mehr gelten. (die tageszeitung, 21.09.1995, S. 8, Wer nicht hat, dem wird genommen.)
Hilfe (help) designates something that is needed in order to reach the state that is specified by the term Schutz (protection). The conceptualised purpose-orientation is stressed by the syntagmatic pattern X zur/zum Y. If the actual goal or purpose is closely linked, as a natural phenomenon, to the necessary preceding process, speakers must have this association stored mentally as an implication. As a result, the terms Hilfe and Schutz are used synonymously, as exemplified in (21).

(21) Seit dem Einmarsch der internationalen Friedenstruppen im vorigen Sommer hat das Militär im Kosovo zahlreiche zivile Aufgaben bewältigt. […] Die zivile UN-Administration hat ebenso wie die internationale Polizei erst in den letzten Monaten Fuß gefasst. Beide sind auf die Hilfe des Militärs angewiesen. Fest steht, dass der zivile Wiederaufbau ohne den Schutz des Militärs nicht zu gewährleisten ist. (die tageszeitung, 19.04.2000, S. 11, Fauler Kompromiss.)
Another case where synonymy is construed on the basis of conceptualising purpose-orientation can be found between the terms Ermittlung and Aufklärung (investigating and solving a crime), as illustrated in examples (22) and (23). (22) Norbert Rüther, Exfraktionschef der Kölner SPD, hat laut Staatsanwaltschaft “umfangreiche Angaben” zur Spendenpraxis seiner Partei gemacht. 830.000 Mark soll Rüther von verschiedenen Spendern angenommen haben. Rüthers Aussage dürfte unangenehme Folgen für die Spender haben. Staatsanwältin Regine Appenrodt kann nun mit ihren Ermittlungen beginnen, die vielleicht Aufklärung über Schmiergeldzahlungen für die Kölner
Müllverbrennungsanlage bringen können. (die tageszeitung, 14.03.2002, S. 8, Kölner SPD-Mann packt aus.) (23) Unbegreiflich bleibt das mäßige Interesse an der Aufklärung der Todesursache. Warum hat die Justiz die Ermittlung eines so brisanten Falles einer unerfahrenen Schweizerin und einer Handvoll norddeutscher Staatsanwälte überlassen? (Die Zeit, 26.09.1997, Nr. 40, Die offene Akte, S. 11.)
As mentioned earlier, in a number of corpus examples the phrasal sequence in which the terms are embedded has a structure of the type X zur/zum Y (Ermittlung zur Aufklärung). A crime is investigated in order that it can be solved. Generally, Ermittlung is a term denoting a process, the aim of which is to determine a fact or the identity of the perpetrator of a crime, as expressed by Aufklärung. Solving a crime is thus conceptualised as a desired result or, alternatively, as an implicit purpose or endpoint of the process of a criminal investigation. Hence, speakers might consider the two consecutive concepts as one larger common concept and therefore construe an implication. It needs to be pointed out, however, that of course not just any two consecutive processes which stand in a relationship based on purpose/goal-orientation evoke synonymity. There are a number of such relations which can simply be retrieved from a corpus by searching for syntagms such as X zu Y, and most of the results refer to processes, states or actions which are completely independent of each other and form no necessary prerequisites or associated results. Their sequence might also reflect a larger temporal shift from one to the other than in the example provided here. What they usually lack is an implication which is associative due to the conceptual closeness between them. It might be argued that the stronger the association of one concept with the other, the more likely it is that a synonymous relation can hold between their lexicalisations. But generally, the view is taken here, in common with Cruse (in Croft/Cruse 2004: 144), that “construability is not infinitely flexible”. Unlike the aforementioned types of conceptualisation, for purpose-orientation a systematic variation between the two terms of each pair has not been observed in the corpus.
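Such syntagm searches lend themselves to simple pattern matching. The following sketch is purely illustrative and not part of the study described here; the regular expression and the sample text are assumptions, and capitalised tokens serve only as a rough stand-in for German nouns:

```python
import re

# Illustrative: extract candidate pairs linked by the syntagm
# "X zur/zum Y" (e.g. "Ermittlung zur Aufklärung") from raw German text.
# Capitalised tokens approximate German nouns; this overgenerates.
PATTERN = re.compile(r"\b([A-ZÄÖÜ][a-zäöüß]+)\s+(?:zur|zum)\s+([A-ZÄÖÜ][a-zäöüß]+)\b")

def find_syntagms(text):
    """Return (X, Y) candidate pairs matching the pattern 'X zur/zum Y'."""
    return PATTERN.findall(text)

# Hypothetical mini-corpus standing in for real corpus lines
sample = ("Die Staatsanwältin kann mit ihren Ermittlungen beginnen. "
          "Die Ermittlung zur Aufklärung der Zahlungen dauert an. "
          "Hilfe zum Schutz vor Armut soll nicht mehr garantiert sein.")

print(find_syntagms(sample))
# → [('Ermittlung', 'Aufklärung'), ('Hilfe', 'Schutz')]
```

Pairs retrieved in this way would only be candidates: as noted above, most X zu Y sequences link conceptually independent processes and would have to be filtered manually for the associative implication at issue here.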
2.4 Conceptualising implication through superordination
Many synonymous relations are formed on the basis of a unidirectional conceptual entailment of the type hyponym-hyperonym. Partington (1998: 32) notes that it is always possible to “replace a hyponymous term in a phrase with its superordinate without altering the truth value of the phrase”. This is particularly true for
Synonyms in corpus texts
lexical items which denote a generic group, an ontological category, a species or a specific type of thing. Semantically, the hyperonym is always included in the hyponym. A change between a synonymous and a hyperonymous relation is based on how speakers stress certain semantic properties contextually. Depending on communicative needs, both terms can be used to indicate identical reference to the same reference-object often in the subsequent proposition. If a superordinate item precedes the hyponym contextually, then it is the speaker’s intention to provide precision or a correction, rather than produce a meaning-equivalent utterance. This suggests that speakers do have a notion of how one concept is included in another concept. It is, however, not as straightforward in the case of terms which refer to more abstract concepts. In fact, in a large number of cases, it is difficult to distinguish between a truly hyperonymous relation and a meaning-equivalent relation. How difficult such a decision sometimes is can be illustrated with reference to the sets Antwort – Äußerung (answer – utterance) and Motto – Spruch (motto – slogan) in the following examples: (24) Es sind nicht nur die von russischer Seite vorgetragenen sozialen Probleme, die die Verhandlungen über den Zeitplan für den endgültigen Abzug der ehemals sowjetischen Truppen aus dem Baltikum bisher haben scheitern lassen. Dies geht aus jüngsten Äußerungen des lettischen Delegationschefs für die Gespräche mit Moskau, Janis Dinevics, hervor. Anläßlich der letzten Zusammenkunft habe er nämlich seinem Gegenpart, Sonderbotschafter Sergej Sotow, die direkte Frage gestellt, ob der Kreml die nunmehr dem russischen Kommando unterstellten Verbände schneller abziehen würde, wenn es der Baltenrepublik mit internationaler Unterstützung gelänge, Mittel für den Bau von Offizierswohnungen in Rußland bereitzustellen. Die unmißverständliche Antwort Sotows habe “Njet” gelautet. 
Dinevics kommentierte diese Äußerung mit den Worten, daß auf russischer Seite immer deutlicher “strategische Interessen zutage treten”, einen endgültigen Abzug der Truppen so weit wie irgend möglich hinauszuzögern. (Die Presse, 21.07.1992, Truppenabzug aus dem Baltikum verzögert sich.)

(25) “Willkommen im Leben”. Der Werbeslogan ist bekannt. Das Kreditkartenunternehmen, das sich dieses Motto zu eigen gemacht hat, bemüht sich jetzt aber auch tatsächlich, diesen Spruch mit Sinn zu füllen. (Frankfurter Rundschau, 07.10.1998, S. 37, Wettbewerb “Team aktiv!”)
Is an answer a kind of utterance and a motto a kind of slogan? In more abstract contexts, a justifiable distinction between synonymy and hyperonymy cannot always be made. To language users, however, this is irrelevant. It is, as Murphy
Petra Storjohann
(2003: 139) remarks, the level of specificity of relevant properties which affects how similar the meanings of two words seem. And knowledge of such semantic closeness and semantic specificity is available to speakers in situations of use.
2.5
Part-of-whole synonymy
A less dominant case of meaning equivalence is found in instances where partonymy is the basis of contextual synonymy. Conceptual implication can be construed when speakers refer to a specific part of something larger and this part is conceptually implicit and taken for granted as common knowledge. One such pair is Vertrag – Vereinbarung (contract/treaty – agreement). The following citation exhibits such a partonymous relation.

(26) Die DAG beruft sich dabei auf die Aussagen des Bremer Arbeitsrechtlers Wolfgang Däubler, der nach Einsicht in die Verträge “eine große Zahl von Vereinbarungen” festgestellt hatte, “die so überhaupt nicht gehen”. (die tageszeitung, 27.03.1990, S. 21.)
In this context, conceptually, agreements form and establish a contract. The conceptual element is therefore present. Individual agreements are considered to be part of a contract, treaty or convention: they are the very essence of these. Once again, speakers must perceive this connection as a form of semantic closeness which is used for producing a relation of semantic equivalence. Example (27) demonstrates how the lexicalisations Vertrag and Vereinbarung and their corresponding concepts enter a synonymous relation: (27) Bremens Bürgermeister Klaus Wedemeier (SPD) und sein Amtskollege aus der türkischen Stadt Izmir, Burhan Özfatura, haben am Mittwoch in der Hansestadt eine Rahmenvereinbarung zur Zusammenarbeit zwischen beiden Städten paraphiert. Die Bremische Bürgerschaft muß dem Vertrag über die Städtepartnerschaft noch zustimmen. Die Vereinbarung sieht unter anderem die Zusammenarbeit in Wissenschaft, Bildung und Kultur sowie im Gesundheits- und Sozialwesen, im Sport und im Tourismus vor. (die tageszeitung, 09.03.1995, S. 21, Städtepartnerschaft Izmir-Bremen.)
Beyond the examples supplied here, attestations of a partonymy relation as a basis of synonymy are not frequently found in the corpus. Presumably, the relations of closeness and the implications formed by such patterns are not as regular as those formed in cases where there is an underlying concept of causality or conditionality.
3.
Contextual construction
So far, we have attempted to ascertain what knowledge representation might be relevant in the manifestation of synonymous readings. In this section, we turn our attention to the question of how these are realised linguistically in discourse. Synonymy has generally been attributed to the class of paradigmatic sense relations, and for a long time it was neatly separated from syntagmatic structures. A number of studies of contrastive relations in English, for example by Justeson/Katz (1991), Fellbaum (1995), Jones (2002) and Murphy (2006), have revealed how English opposites often co-occur within the same sentence and are embedded in specific syntactic sequences. They are contextually realised in specific patterns. Some of these might be interpretable as constructions in terms of a construction-based approach, as suggested by Murphy (2006: 5), according to which antonym pairs “constitute constructions, in that they are form-meaning pairings”. This argument is based on several observations about opposites and their function in actual discourse. Murphy herself questions whether synonyms are also inserted into specific syntagmatic frames, and she suggests that there might not be enough evidence to support the idea that synonyms too have an affinity for particular constructions.

It would be possible to treat other paradigmatic lexical relations, such as hyponymy and synonymy, in this way, although there is much less evidence that such relations also display the syntagmatic properties that have been found for antonymy. (Murphy 2006: 17)
It is typically assumed that synonyms do not combine in close proximity, but rather that they are positioned further apart, with one or two sentences separating them, where a specific synonymous term refers to a previously mentioned notion or concept. This argument supports the idea that synonymy is a typical paradigmatic relation. However, in a small study, Storjohann (2006) shows that a number of German synonyms often co-occur in close proximity and recur in combinational patterns and in typical phrasal structures. Therefore, they can also partly be classified according to their syntagmatic behaviour, reflecting discourse-functional categories. Generally, three typical construction patterns can be observed in corpus data, namely Coordinated Synonymy, Synonym Clusters and Subordinated Synonymy. These are typical templates for synonyms in general, irrespective of whether they are lexical or semantic-conceptual meaning equivalents. However, as will be illustrated, not all of them are representative patterns for the types of synonymy
Murphy refers to the term antonymy in its broad sense, including different types of opposites, gradables and non-gradables alike.
which have been demonstrated before, where an underlying conceptual principle is applied to the construction of a meaning-equivalent context.
3.1
Coordinated synonymy
Most of the synonyms we have mentioned so far make extensive use of a coordinated framework where the synonymous pair is conjoined by und (and), oder (or) or beziehungsweise (as well as) in templates such as X und Y, X oder Y, X beziehungsweise Y. Examples (28)–(43) feature some of these structures and coordination options:

(28) Der Streit im Abgeordnetenhaus um die Auflösung der Westberliner Akademie der Wissenschaften […] müßten “von unabhängigen und neutralen Gerichten” entschieden werden, erklärte die ASJ gestern. (die tageszeitung, 04.07.1990, S. 28.)

(29) Die Obegg AG hat sich als neutrales und unabhängiges Versicherungsbroker-Unternehmen in der Ostschweiz etabliert. (St. Galler Tagblatt, 23.10.1997; Neue Zusammenarbeit Verkehrsbetriebe.)

(30) Durch eine gesunde und abwechslungsreiche Ernährung können mehr als 30% der jährlich 340.000 Krebs-Neuerkrankungen in Deutschland vermieden werden […]. (Neue Kronen-Zeitung, 17.02.2000, S. 7.)

(31) Die abwechslungsreiche und gesunde Morgenmahlzeit sowie das Pausenbrot sorgen dafür, dass die Kinder fit und leistungsstark bleiben und gleichzeitig die notwendigen Nährstoffe bekommen. (Mannheimer Morgen, 03.07.2003, Wundertüte versüßt Schulstart.)

(32) Dauerhaft und nachhaltig, weiß Hans Tietmeyer, müssen die öffentlichen Finanzen konsolidiert sein; strikt und eng sind die Teilnahmekriterien auszulegen. (Frankfurter Rundschau, 31.05.1997, S. 3.)

(33) Im Bereich des bestehenden HL-Marktes an der Hauptstraße sollen nachhaltige und dauerhafte Strukturen entwickelt werden, […] (Mannheimer Morgen, 05.01.2005, Entwicklungsimpulse in den Ortskern lenken.)

(34) Per E-Mail würden oft überflüssige oder sinnlose Informationen ausgetauscht, beklagen 59,3 Prozent der Befragten, die das Meinungsforschungsinstitut TNS Emnid im Auftrag von “Süddeutsche Zeitung Wissen” interviewte. (Mannheimer Morgen, 27.06.2006, E-Mails können Zeit rauben.)

(35) Auch ich halte die Herausgabe eines Amtsblattes für völlig sinnlos und überflüssig.
(Mannheimer Morgen, 24.10.2002, In den Papierkorb.)
(36) Doch auch wenn die Bösewichte noch so bedrohlich, die Herausforderungen noch so übergroß erscheinen: immer gibt es Figuren, die Sicherheit und Geborgenheit vermitteln, und sei es ein Hund. (Frankfurter Rundschau, 09.08.1997, S. 6.)

(37) Die kleine familienähnliche Gemeinschaft, in der sie hier leben, begünstigt eine Atmosphäre der Intimität, die einen hohen Grad an Geborgenheit und Sicherheit vermittelt. (Mannheimer Morgen, 25.07.1998, 25 Jahre Dienst am Menschen.)

(38) Ausdrücklich betont Schlögl jedoch, daß Menschen, die politisch verfolgt werden, nach wie vor in Österreich Schutz und Hilfe bekommen können. (Tiroler Tageszeitung, 09.04.1998, Nicht jeder ist willkommen.)

(39) Nach der brutalen Vertreibung der Kosovo-Albaner kann man es sich nicht mehr so einfach machen. Jetzt stehen Hilfe und Schutz für die Vertriebenen im Zentrum. (Züricher Tagesanzeiger, 28.05.1999, S. 10, Den Flüchtlingen eine Chance geben.)

(40) Das den Sicherheitsbehörden zur Verfügung stehende Instrumentarium zur Aufklärung und Ermittlung von Straftaten sei zu verbessern und rechtlich neu zu gestalten. (Salzburger Nachrichten, 21.02.1995; Seminar in Ottenstein.)

(41) Es geht nicht darum, die “Praxis” bei der Arbeit der Kripo zu beschneiden, sondern zu “verrechtlichen”, indem den Sicherheitsbehörden bei Ermittlung und Aufklärung strafbarer Handlungen über den “ersten Zugriff” hinaus selbständige Aufgaben zugewiesen werden. (Salzburger Nachrichten, 28.02.1996; Ermittlungen der Polizei.)

(42) Die Wiederaufarbeitung abgebrannter Brennelemente aus deutschen Kernkraftwerken könne von der rot-grünen Koalition schadensersatzfrei verboten werden, sagte der Grünen-Politiker. Dem stünden auch keine Vereinbarungen oder Verträge mit Frankreich entgegen. (Frankfurter Rundschau, 02.12.1998, S. 1.)

(43) Der Ausschuß sollte alte Verträge und Vereinbarungen unter die Lupe nehmen. (Frankfurter Rundschau, 22.01.1998, S. 2.)
Corpus data indicate how, in syntagmatic terms, each pair favours a coordinated phrasal template. Synonymous pairs which function in a coordinated fashion are a sign of how conscious speakers are of synonyms. They are typical of spoken language and newspaper texts, which in this case form the main basis of the underlying corpus. Coordinating frameworks have a specific communicative function. They signal semantic inclusiveness and exhaustiveness (cf. Jones 2002), and a conjoined framework is an economical way of conveying as much information as
possible, including slight semantic shades between the synonyms. Another effect of such patterning might be that the synonyms involved become more alike as each contaminates the other’s semantic interpretation, although this depends on how conventionalised the conjoined pairing is for the synonyms in question. As Jones (2002) has shown for coordinated English antonyms, there is usually a favoured choice of how the opposite items are linked. This choice follows various sequence rules. Although distributional figures have not yet been investigated, it appears that most pairs under scrutiny showed no preferred order of sequence in the construction itself and that both permutations can be identified. They seem remarkably reversible, a result quite different from the positional preferences found in syntagmatic templates for English pairs expressing concepts of contrast (cf. Jones 2002).
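The reversibility observation can be checked mechanically. The order_counts helper below is a hypothetical sketch (it is not part of the study's method) that counts both permutations of a coordinated pair in a text sample; roughly equal counts for X und Y and Y und X would support the claim that these German synonym pairs, unlike Jones's English antonym pairs, show no fixed sequence.

```python
import re
from collections import Counter

# Hypothetical helper (not from the study): count both orders of the
# coordinated frame "X und Y" for a given synonym pair in a text.
def order_counts(text, a, b, conj="und"):
    counts = Counter()
    for first, second in ((a, b), (b, a)):
        frame = re.compile(rf"\b{re.escape(first)} {conj} {re.escape(second)}\b")
        counts[f"{first} {conj} {second}"] = len(frame.findall(text))
    return counts

sample = ("eine gesunde und abwechslungsreiche Ernährung; "
          "die abwechslungsreiche und gesunde Morgenmahlzeit")
print(order_counts(sample, "gesunde", "abwechslungsreiche"))
```

Run over a full corpus rather than this two-clause sample, such counts would give the distributional figures that, as noted above, have not yet been systematically collected.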
3.2
Clusters
The second type, which is referred to as synonym clustering or synonym sets, is a subclass of coordination and it exemplifies enumeration. For general lexical synonyms (i.e. propositional or cognitive lexical variants) examples (44) and (45) represent this type:

(44) Die “Grundstimmung in Frankfurt ist nicht so, wie es sein sollte, wenn man den Aufstieg geschafft hat”. Überall werde gemeckert, räsoniert, genörgelt, kritisiert, schlecht gemacht. Es gebe keinen Anlaß, in Panik zu geraten, “auch wenn wir das Spiel am Samstag gegen 1860 München verlieren sollten, werde ich nicht panisch reagieren”. (Frankfurter Rundschau, 21.08.1998, S. 18.)

(45) Das Volk der Deutschen muss beweglicher, flexibler, agiler werden, verlangt sein grosser Häuptling in Bonn, der trotz seiner Leibesfülle erstaunlich leichtfüssig geht. (Züricher Tagesanzeiger, 12.07.1996, S. 5, Endlich Urlaub!)
On a textual level, speakers who produce such repetition are attempting to express a specific concept exhaustively by employing variation of expression and thus communicating slightly different pragmatic information. Any existing expressive or stylistic shades as well as differing denotative properties are concisely subsumed. This pragmatic force has the effect of explicitly stressing a particular propositional element by putting an emotive emphasis on an idea. Such sequences are particularly productive for near-synonyms which refer to the same notion but differ in expressive pragmatic traits. However, there are not many attestations of meaning equivalents which are produced on the basis of a more logical conceptualisation, because it takes more than two items to form a
cluster. In the following examples, it may be possible to interpret an underlying causative relation, and these might be examples which can be attributed to the category of coordinated synonym clusters.

(46) Die neutrale, freie, unabhängige Schweiz passt nicht in die neue Weltordnung. Denn das Ziel der UNO ist: Globalisierung auf allen Ebenen – Zentralismus. (St. Galler Tagblatt, 10.04.1999; Wie verfasst?)

(47) Nun werden diese bäuerlichen Familienbetriebe aber verstärkt mit ökologisch ungebremsten Produktionssystemen in Konkurrenz gesetzt und damit die letzten Rückzugsgebiete für Mensch und Tier bedroht. Daher gibt es nur einen Weg: Rückkehr zu natürlichen Kreisläufen; Neubesinnung auf umweltverträgliche Bodenbewirtschaftung; ökologische Fruchtfolgen und Anbaumethoden sowie eine naturgemäße Tierhaltung. (Salzburger Nachrichten, 04.11.1993; Sind die Bauern noch zu retten?)
One would expect more clusters of this type if more complex interrelations were present. Nevertheless, the underlying corpus contained few convincing examples which could be allocated to this group beyond doubt. This is clearly an area where further research is required.
3.3
Subordinated synonymy
As Cruse points out:

Synonyms also characteristically occur together in certain types of expression. For instance, a synonym is often employed as an explanation, or clarification, of the meaning of another word. (Cruse 1986: 267)
In cases where an explanation or clarification is made explicit, speakers characteristically employ constructions with subordinating structures. Often, they take the form of a definitional paraphrase and their function is to further specify an utterance. The most common syntagmatic environments for German subordinate constructions which are used for this purpose are X, das heißt Y (X, which means Y); X, sprich Y (X, meaning Y); X, also Y (X, which is Y). In this paper, the view is taken that constructions like these are typical for lexical synonymy where speakers provide a simpler explanation by using a word which belongs to common everyday vocabulary for a term (often a loan word) which is not taken to be common knowledge but is part of a more specific vocabulary.
(48) Im Kampf gegen den Brustkrebs gibt es dabei in einigen Ländern, etwa in Skandinavien oder in den Niederlanden, sogenannte “Screenings”, das heißt, Reihenuntersuchungen klinisch gesunder Frauen. (Tiroler Tageszeitung, 29.01.1998, Vorsorge als Überlebenschance.) (49) In der ersten schnellen Runde baut sich bei uns eine um etwa zehn Grad kühlere Temperatur auf. Ist sie dann im grünen Bereich, dann ist der Reifen aber schon so abgefahren, daß nicht mehr der optimale, sprich bestmögliche, Haftwert erreicht wird. (Tiroler Tageszeitung, 28.04.1997, Verkühlte Reifen und verschnupfte Benetton.)
In such cases, it is not a semantic overlap of conceptual aspects which is stressed. Instead, lexical correlation is produced through metalinguistic reflection. The question, however, is whether these are also typical constructions for synonyms with underlying logical conceptualisations as described earlier. Although there are attestations for this patterning, as illustrated for synonym pairings such as gesund-abwechslungsreich and neutral-unabhängig in (50) and (51), these are not frequent. (50) Viele Tumorarten scheinen ebenfalls sogenannte Zivilisationskrankheiten zu sein, aus einer Mehrzahl von Ursachen erwachsen, wobei man freilich mit etwas kargerer Lebensweise, mit gesunder, sprich abwechslungsreicher Ernährung mit wenig Fetten, viel Ballaststoffen, als Empfehlung höchstens jene erreicht, deren Bikinifigur in Frage gestellt oder das Gürtelschloß nicht mehr weitergestellt werden kann. (Salzburger Nachrichten, 14.06.1991.) (51) Laut neuem und altem Hafenreglement muss die Hafenkommission aus mindestens sechs Personen bestehen. Die SP Arbon stellt sich als weiteres Mitglied eine neutrale, also von Behörde oder Interessengruppen unabhängige Person vor. (St. Galler Tagblatt, 04.09.1999; Mehr Leute für Hafen.)
In the first context, the syntagmatic patterns suggest conceptual implication and notional consequence (healthier food, which in consequence means a varied diet), rather than simply providing a lexical substitute for helpful additional clarification of a previously used term. However, subordinating constructions function more typically to indicate lexical equivalence, and this is not the speaker’s intention when producing synonymy based on conceptualisations such as causation or conditionality, and so on.
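A naive extractor for these subordinating frames can illustrate how such candidates might be harvested from text. The pattern below is an assumption for illustration only: it approximates X and Y as single words, so it overgenerates badly (German also, for instance, is also an ordinary discourse particle), and every hit would still need the kind of manual interpretation applied in (48)–(51).

```python
import re

# Naive illustrative pattern for the subordinating frames
# "X, das heißt Y", "X, sprich Y" and "X, also Y".
FRAME = re.compile(r"(\w+), (das heißt|sprich|also) (\w+)")

def subordination_candidates(text):
    """Return (X, marker, Y) triples for each matched frame."""
    return [m.groups() for m in FRAME.finditer(text)]

sample = "nicht mehr der optimale, sprich bestmögliche, Haftwert"
print(subordination_candidates(sample))
# [('optimale', 'sprich', 'bestmögliche')]
```

A part-of-speech-tagged corpus would allow X and Y to be restricted to nouns or adjectives and would filter out most of the false positives this single-word approximation produces.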
3.4
Further indications of conceptualising implication
Not all synonyms adhere to coordinated or subordinated frames. There are also other emerging combinations which make use of additional lexis and which place explicit emphasis on a sense of implication. Typical expressions of this kind are:

– X impliziert Y (X implies Y)
– X heißt auch Y (X also means Y)
– X ist nichts anderes als Y (X is nothing but Y)
– X bedeutet gleichzeitig/auch immer Y (X simultaneously/always means Y)
– X beinhaltet Y (X includes Y)
All of these contain a specific relation term which identifies conceptual implication. The following two examples, (52) and (53), further illustrate this.

(52) Wenn es wieder früher dunkel wird und die Abende länger werden, kann beinahe täglich in den Zeitungen wieder von Einbrüchen und Verbrechen gelesen werden. Sicherheit heisst Geborgenheit und ist somit ein Stück Lebensqualität. (St. Galler Tagblatt, 27.08.1999, Gratulation zum 80. Geburtstag.)

(53) Wer von “Risiko” spricht, impliziert Gefahr. (Züricher Tagesanzeiger, 27.10.1997, S. 25, Sprachgebrauch entlarvt.)
Relation terms are used to characterise the relation between two concepts and their lexicalisations. In a contextual instantiation, they are part of the process of relation identification. What they reveal is how the two corresponding concepts are perceived, linked and possibly conceptually stored in the mental lexicon. Although these are strings of words which exhibit a coherent pattern, they are not systematically recorded in the underlying corpus data.
4.
Conclusion
Adopting a usage-based approach to the investigation of synonyms opens up a number of issues. Current research on the phenomenon often concentrates on corpus analyses which reveal collocational differences and selectional restrictions between near-synonyms, such as English tall-high (e.g. Taylor 2003), sheer-pure-complete-absolute (e.g. Partington 1998), etc. In particular, such studies uncover
No specific German studies are known to the author, but the German example erhalten – bekommen – kriegen (to receive – to get) also fits into this research area.
important lexical-semantic aspects of the difference in use of meaning equivalents and can add to conventional and sometimes inadequate descriptions in reference works, as well as helping learners of a foreign language. However, the scrutiny of synonyms in real data has more to offer, particularly for the empirical examination of the conceptual level of creating synonymy. The aim of this paper was neither to resolve the problem of classical lexical definitions of the phenomenon of synonymy nor to find corpus validation to impose a new classification on this relation. Rather, it has been demonstrated that the context in which synonymy has traditionally been placed fails to answer a number of questions about meaning equivalents in actual discourse. Throughout this paper, it has been argued that synonymy is not just a lexical relation between two items which share most of their semantic features or which signify the same or a similar enough concept: it also appears that speakers make similarity judgements on the basis of different underlying conceptual mechanisms, and that the analysed synonyms operate on a conceptual level, where speakers construe meaning equivalence in language use by relying on shared linguistic as well as non-linguistic knowledge, and by applying cognitive principles. The general aim of this paper was to provide a description of some cases of synonymy which has explanatory rather than categorising power. Two important aspects were stressed throughout this study. First, language material indicates that speakers have certain conceptualisations in mind which give rise to semantic implication, which in turn is the very ingredient for construing contextual similarity. Such conceptualisations always contain conceptual elements such as <similarity>.
A number of examples were discussed, where synonymy is created on the basis of close and often mutually dependent concepts such as cause-effect, conditionality, purpose-orientation, part-of-whole and superordination. For these synonym sets, it is argued that synonymy is not a relation between words, nor even between meanings or senses, but it is a relation between two lexical representations that map onto similar concepts. In conclusion, we can say that synonymy is very much conceptual in nature. The expressed concepts need to be linked to each other, presumably cognitively associated with each other, and knowledge of these concepts and their lexical representations needs to be shared and be part of the mental lexicon. The knowledge that is required goes well beyond linguistic competency and includes knowledge about the world or about reality, as we perceive it. Given these observations, it is argued that these conceptualisations are encoded by language and represent schematisations of experience and the recruiting of background knowledge. There are of course communicative restrictions and linguistic constraints, and it is also the case that not just any causative relation, for example, can turn into synonymy. At the same time, it is not claimed that all cognitive principles that might be relevant have
been covered here, but these are the notions which have so far emerged from corpus investigations. Up until now, specific underlying conceptualisations have not been the focus of research on synonymy, and more recent cognitive approaches have also not been employed in the description of synonymy. Taking the four guiding principles of cognitive semantics (cf. Evans/Green 2006: 153) as a general point of departure, there are indications with regard to the construction of synonymy that first, conceptual structure is embodied; second, semantic structure is conceptual structure; third, meaning representation is encyclopaedic; and finally, meaning construction is equated with conceptualisation. Within the realm of this small study, it turns out that these principles are applicable to the explanation and description of synonymy in language use. A second important conclusion that emerges from corpus data is that they convincingly show that constructing synonymy is in fact quite dynamic and that it is not exclusively a paradigmatic relation. Admittedly, just like any other lexical-semantic relation, synonymy exhibits typical structures and can be contextually construed in typical syntagmatic templates. Examining the issue from a usage-based perspective, as in the cases demonstrated here, it can be concluded that synonymy is a lexical construction that unifies with other structures resulting in syntagmatic realisations. The results of this study show the intersection of paradigmatics and syntagmatics, treating synonyms as discontinuous items “that are compatible with appropriate slots in a grammatical context” (Murphy 2006: 17). This investigation of a number of synonymous pairs has shown that the subject of synonymy entails more than merely looking at similarities in denotation and connotation and classifying synonymy as a paradigmatic structure, and there is still much work to be done before a comprehensive description of synonymy within a theoretical model can be realised.
Certainly, the most pressing area of research which has not been pursued further here is the investigation of a possible correspondence between cognitive mechanisms of metonymy such as conceptual mapping and projection within polysemous items and the aforementioned conceptual principles of synonymous contexts. In 2004, Cruse (in Croft/Cruse 2004) applied a cognitive approach to the description of semantic relations, but his account of dynamic construals did not include the relation of synonymy. In 2006, Murphy proposed a construction-based explanation for some opposite pairs. There seems to be strong evidence that both of these theoretical frameworks are suitable tools for the description of lexical-semantic relations, and some ideas can undoubtedly be transferred to synonymy as well. Although a full theoretical account needs further substantiation, there are strong indications that, as is the case with the characterisation of opposites, any adequate characterisation of synonymy should also acknowledge the notion of concepts.
References

Bußmann, Hadumod (ed.). 2002. Lexikon der Sprachwissenschaft. 3rd ed. Stuttgart: Alfred Kröner Verlag.
Croft, William and Cruse, D. Alan. 2004. Cognitive Linguistics. Cambridge: Cambridge University Press.
Cruse, D. Alan. 1986. Lexical Semantics. Cambridge: Cambridge University Press.
Cruse, D. Alan. 2002. “Paradigmatic relation of inclusion and identity III: Synonymy.” In Lexicology – An international handbook on the nature and structure of words and vocabularies (HSK 21/1), D. Alan Cruse, Franz Hundsnurscher, Michael Job and Peter Rolf Lutzeier (eds), 485–497. Berlin: Walter de Gruyter.
Cruse, D. Alan. 2004. Meaning in Language. 2nd ed. Oxford: Oxford University Press.
Cruse, D. Alan and Togia, Pagona. 1995. “Towards a cognitive model of antonymy.” Journal of Lexicology 1: 113–141.
Evans, Vyvyan and Green, Melanie. 2006. Cognitive Linguistics: An Introduction. Edinburgh: Edinburgh University Press.
Fellbaum, Christiane. 1995. “Co-occurrence and Antonymy.” International Journal of Lexicography 8.4: 281–303.
Jones, Steven. 2002. Antonymy: A corpus-based perspective. London: Routledge.
Justeson, John S. and Katz, Slava M. 1991. “Redefining Antonymy: The Textual Structure of a Semantic Relation.” Literary and Linguistic Computing 7: 176–184.
Kennedy, Graeme. 1991. “Between and through: The company they keep and the functions they serve.” In English Corpus Linguistics, Karin Aijmer and Bengt Altenberg (eds), 95–110. London: Longman.
Lutzeier, Peter Rolf. 1981. Wort und Feld. Wortsemantische Fragestellungen mit besonderer Berücksichtigung des Wortfeldbegriffs. Tübingen: Niemeyer.
Lyons, John. 1968. Introduction to Theoretical Linguistics. Cambridge: Cambridge University Press.
Lyons, John. 1977. Semantics. 2 vols. Cambridge: Cambridge University Press.
Murphy, M. Lynne. 2003. Semantic Relations and the Lexicon. Cambridge: Cambridge University Press.
Murphy, M. Lynne. 2006. “Is ‘paradigmatic construction’ an oxymoron? Antonym pairs as lexical constructions.” Constructions SV1. http://www.constructions-online.de/.
Partington, Alan. 1998. Patterns and Meanings: Using Corpora for English Language Research and Teaching. Amsterdam/Philadelphia: John Benjamins.
Storjohann, Petra. 2006. “Kontextuelle Variabilität synonymer Relationen.” OPAL – Online publizierte Arbeiten zur Linguistik 1/2006. Mannheim: Institut für Deutsche Sprache. (http://www.ids-mannheim.de/pub/laufend/opal/privat/opal06-1.html)
Taylor, John. 2003. “Near synonyms as coextensive categories: ‘Tall’ and ‘high’ revisited.” Language Sciences 25: 263–284.
Antonymy relations
Typical and atypical cases from the domain of speech act verbs
Kristel Proost
Antonymy is a relation of lexical opposition which is generally considered to involve (i) the presence of a scale along which a particular property may be graded, and hence both (ii) gradability of the corresponding lexical items and (iii) typical entailment relations. Like other types of lexical opposites, antonyms typically differ only minimally: while denoting opposing poles on the relevant dimension of difference, they are similar with respect to other components of meaning. This paper presents examples of antonymy from the domain of speech act verbs which either lack some of these typical attributes or for which these criteria are difficult to apply. It discusses several different proposals for the classification of these atypical examples.
1.
Introduction
This contribution deals with the antonymy relations of speech act verbs. The term antonymy will be used to include cases of gradable antonymy as well as those of complementarity. This use of the expression antonymy reflects the fact that its German equivalent Antonymie is sometimes employed as a cover term subsuming both konträre and kontradiktorische Antonymie, i.e. gradable antonymy and complementarity respectively (cf. Lang 1995: 32–33). In the English literature on oppositeness of meaning, the term antonymy is often used as a synonym of gradable antonymy and does not extend to cases of complementarity (cf. Cruse 1986; Hofmann 1993; Lehrer 2002). This paper presents and discusses cases of meaning contrast from the domain of speech act verbs which appear to be instances of gradable antonymy and/or complementarity but do not fit in exactly with the way in which these relations have traditionally been defined (as, for example, by Lyons 1977; Cruse 1986; Lehrer 2002; Löbner 2003). Other types of
oppositeness of meaning such as converseness, reversiveness and duality will not be discussed, because they have been shown to be only marginally relevant to the domain of speech act verbs (cf. Proost 2007a). The discussion of the antonymy relations of speech act verbs will also be restricted to opposites which are stored as such in the lexicon and do not depend on contextual information in order to be interpreted as opposites. Such opposites are variously referred to as “canonical opposites” (cf. Cappelli 2007: 195) or instances of “systemic semantic opposition” (cf. Mettinger 1994: 61–83). Cases of non-systemic semantic opposition, including examples of words which are not stored as opposites in the lexicon but may be construed as such and interpreted accordingly, will not be taken into account. Since the cases discussed all concern the antonymy relations of speech act verbs, the next section will be concerned with the meanings of these.
2. The meaning of speech act verbs
Speech act verbs are verbs used to refer to linguistic actions. They characteristically lexicalise combinations of speaker attitudes such as the speaker’s propositional attitude (i.e. the attitude of the speaker towards the proposition of his or her utterance), the speaker’s intention and the speaker’s presuppositions concerning the propositional content, the epistemic attitude of the hearer and the interests of the interlocutors (cf. Harras 1994; Harras & Winkler 1994; Harras 1995; Winkler 1996; Proost 2006; Proost 2007b). Examples of speech act verbs include to claim, to inform, to request, to promise, to praise, to criticise and to thank. The antonymy relations of speech act verbs will be represented against the background of the system used to describe the meaning of German speech act verbs in the Handbuch deutscher Kommunikationsverben, a textbook on German speech act verbs which consists of two volumes, a dictionary volume and a theoretical volume representing the lexical structures of German speech act verbs by means of lexical fields (cf. Harras et al. 2004; Harras/Proost/Winkler 2007). The description of the meaning of German speech act verbs in this reference work is based on a situation type referred to by all speech act verbs. Following Barwise/Perry, this situation type is called the “General Resource Situation Type” (cf. Barwise/Perry 1983: 32–39). It involves the use of language and is characterised by the presence of four situation roles: a speaker, a hearer, a complex communicative attitude of the speaker and an utterance which – in the prototypical case – contains a proposition (see Figure 1).

[Footnote: The relation holding between reversives is variously called “reversiveness” (for example in Cruse 1979: 960), “reversity” (ibid.: 957) and “reversivity” (as in Cruse 2002: 507).]

Figure 1. Elements of the General Resource Situation Type
Figure 2. Specifications of the roles ‘Utterance’ and ‘Attribute of S’

These four elements are part of any situation referred to by speech act verbs; they constitute the common core of the meaning of these verbs (cf. Verschueren 1980: 51–57; Verschueren 1985: 39–40; Wierzbicka 1987: 17–19; Harras et al. 2004: 9–22). Two of the four roles of the General Resource Situation Type, the role of the utterance and that of the complex communicative attitude of the speaker, may be further specified as follows: the role of the utterance is specified by the aspect of the propositional content, while that of the complex communicative attitude of the speaker is specified by the aspects of the speaker’s propositional attitude, the speaker’s intention, and the speaker’s presuppositions (see Figure 2). The aspect of the propositional content may be further specified by the attributes of the event type of P, the temporal reference of P and – in the case that
P is an action – the agent of P.

Figure 3. Specifications of the aspect ‘Propositional Content’

Each of these may be assigned different values (see Figure 3):
– The attribute of the event type of P may be assigned the values ‘Action’, ‘Event’ or ‘State’ (e.g. boast: State/Action; claim: Not specified; request: Action)
– The attribute of the temporal reference of P may have the values ‘[+Past]’ and ‘[–Past]’ (e.g. boast: [+Past]; claim: Not specified; request: [–Past])
– The attribute of the agent of P may be assigned values such as ‘Speaker’, ‘Hearer’, ‘Third Person’, ‘Speaker and Hearer’, etc. (e.g. promise: Speaker; request: Hearer; slander: Third Person; agree (to do something): Speaker and Hearer)

The aspect of the propositional attitude of the speaker may be specified as being epistemic, evaluative or emotive, or as an attitude of wanting or grading. These may be characterised as follows (see Figure 4):
– epistemic attitude: S takes to be true: P (e.g. claim), S takes to be true: not-P (e.g. deny), S does not take to be true: P (e.g. lie), …
– attitude of wanting: S wants: P (e.g. request), S does not want: not-P (e.g. allow, permit), S wants: do P (e.g. promise)
– attitude of grading: S considers: P x (e.g. judge)
– evaluative attitude: S considers: P good/bad (e.g. praise, boast/criticise)
– emotive attitude: S feels: joy/anger/sorrow because of P (e.g. congratulate/scold/lament)
Figure 4. Specifications of the aspect ‘Propositional Attitude of S’
The aspect of the speaker’s intention may be further specified as being epistemic or evaluative or as an attitude referring to an action (see Figure 5):
– epistemic attitude: S wants: H recognise: S takes to be true: P (e.g. claim), S wants: H know: P (e.g. inform)
– referring to an action: S wants: H do: P (e.g. request), S wants: not do: H, P (e.g. forbid)
– evaluative: S wants: H consider: P rather good (e.g. whitewash)

Figure 5. Specifications of the aspect ‘Speaker Intention’

The aspect of the speaker’s presuppositions may be specified by the attributes of the expectability of P, the factivity of P, the interests of S and H, and the epistemic attitude of H. Each of these may be assigned the following values (see Figure 6):
– Expectability of P: not expectable: P (e.g. request), expectable: P (e.g. warn)
– Factivity of P: P is the case (e.g. praise, criticise, thank)
– The interests of S and H concerning P: in the interest of S: P (e.g. request); in the interest of H: P (e.g. advise)
– Epistemic attitude of H: H does not know: P (e.g. inform)

Figure 6. Specifications of the aspect ‘Presuppositions of S’

Unlike the speaker’s propositional attitude and the speaker’s intention, which are relevant to the meaning of all speech act verbs, the speaker’s presuppositions are relevant to most but not all speech act verbs. They are irrelevant, for example, to the meaning of verbs like agree (that something is the case), deny (that something is the case), ask (a question) and greet. Other aspects which are relevant to some speech act verbs only include:
– The position a given utterance occupies within a sequence of utterances. For example, deny is a reactive predicate: it is used to refer to an utterance by which a speaker reacts to an utterance of another speaker. Insist (that something is the case) is a re-reactive predicate: it indicates a second stage of reaction, a reaction of a speaker to a denial of his or her own initial claim.
– The sequencing of multiple utterances. For example, convince and persuade are used to refer to linguistic actions consisting of multiple utterances making up a sequence.
– Specifications of the role of the hearer. A verb like spread, for example, is used to refer to an utterance addressed to several hearers.
– The manner in which something is said. An example is entreat, which is used to refer to a speech act performed by a speaker emphatically asking someone to do something.
– The institutional setting of the speech act referred to. For example, accuse (in one of its senses) is used to refer to a speech act performed in a judicial context.

The specifications of the propositional content and of the different kinds of speaker attitudes may be combined in many different ways. Different combinations constitute special resource situation types which are referred to by different classes of speech act verbs. For example, the combination below represents a situation referred to by the German verbs huldigen, ehren, würdigen and honorieren and by English verbs like honour.

Propositional Content:   Information Content: P
Event Type:              Action
Temporal Reference:      [+Past]
Agent:                   Hearer or Third Person
Attitude of S to P:      S considers: P good
Intention of S:          S wants: H recognise: S considers: P good
Presuppositions of S:    P is the case

Verbs: huldigen, ehren, würdigen, honorieren
Huldigen, ehren, würdigen and honorieren all lexicalise the concept of a linguistic action whereby a speaker tells a hearer that s/he evaluates a past action performed by that hearer or some third person positively with the intention that the hearer also recognise the speaker’s positive evaluation. Insofar as these verbs all lexicalise the same concept, they constitute a lexical field. The representation of the conceptual part of the meaning of linguistic action verbs as combinations of specifications of the propositional content and of the different types of speaker attitude allows antonyms to be searched for systematically: verbs lexicalising opposite specifications would appear to be good candidates for antonyms.

[Footnote: Loben (praise) does not belong to this group, because it may be used to refer to a positive evaluation of an action or a state of affairs. Hence, it lexicalises the value ‘action or state of affairs’ for the attribute of the event type of P.]
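The systematic search for antonym candidates that this representation permits can be sketched in a few lines. The following Python fragment is an illustrative toy, not the Handbuch’s actual machinery: the attribute names, the OPPOSITES table and the simplified frames are invented for the example.

```python
# Illustrative toy (not the Handbuch's actual machinery): verb meanings
# as attribute-value frames; attribute names and values are invented
# simplifications of the resource-situation-type features.
OPPOSITES = {
    "S considers: P good": "S considers: P bad",
    "S takes to be true: P": "S takes to be true: not-P",
}
# Make the opposition table symmetric.
OPPOSITES.update({v: k for k, v in OPPOSITES.items()})

def differing_attributes(frame_a, frame_b):
    """Return the attributes on which two verb frames differ."""
    return [k for k in set(frame_a) | set(frame_b)
            if frame_a.get(k) != frame_b.get(k)]

def antonym_candidates(frame_a, frame_b):
    """Two frames are antonym candidates if they differ somewhere and
    every differing attribute carries opposite values."""
    diffs = differing_attributes(frame_a, frame_b)
    return bool(diffs) and all(
        OPPOSITES.get(frame_a.get(k)) == frame_b.get(k) for k in diffs
    )

# Simplified frames; the co-varying speaker intention is omitted here.
huldigen = {
    "event type": "Action",
    "temporal reference": "[+Past]",
    "agent": "H or 3rd Person",
    "attitude of S to P": "S considers: P good",
}
tadeln = dict(huldigen, **{"attitude of S to P": "S considers: P bad"})

print(antonym_candidates(huldigen, tadeln))  # True
```

On this toy encoding, huldigen and tadeln come out as candidates because their only difference lies in a pair of opposite values, mirroring the field-based search described above.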
3. Complementarity and gradable antonymy: Traditional definitions
The large majority of the relations of oppositeness of meaning which speech act verbs enter into show some but not all characteristics of gradable antonymy and/or complementarity. Complementarity is traditionally defined as follows (cf. Lyons 1977: 271–272; Cruse 1986: 198–199; Hofmann 1993: 42–43; Lang 1995: 32–33; Lehrer 2002: 503; Löbner 2003: 127):

Complementarity: Definition
Two lexical items L(a) and L(b) lexicalising concepts a and b respectively are in a relation of complementarity if they exhaustively divide a domain into two mutually exclusive parts.
Typical examples of complementaries are true-false, dead-alive, open-shut, married-unmarried and man-woman. The fact that complementaries bisect a conceptual domain is reflected by the entailment relations holding between utterances containing them (cf. Lyons 1977: 271–272; Cruse 1986: 199; Lang 1995: 32–33): x is a entails and is entailed by x is not b, and x is not a entails and is entailed by x is b. For example, if the door is open, it cannot at the same time be shut, and if it is not open, it must be shut. This means that, when L(a) and L(b) are complementaries, (i) x is a and x is b cannot both be true (?The door is open and shut), and hence it is not possible that a and b are both the case; (ii) x is a and x is b cannot both be false (?The door is neither open nor shut), and hence it is not possible that neither a nor b is the case.

Gradable antonymy has traditionally been defined as follows (cf. Lyons 1977: 272; Cruse 1986: 204; Hofmann 1993: 41; Lehrer 2002: 498; Löbner 2003: 124):

Gradable antonymy: Definition
Two lexical items L(a) and L(b) are in a relation of gradable antonymy if they denote opposite sections of a scale representing degrees of the relevant variable property.
Pairs like big-small, old-new, good-bad and hot-cold are typical examples of gradable antonymy. Gradable antonyms do not divide a conceptual domain into two mutually exclusive parts: the scale of the relevant property denoted by gradable antonyms contains a neutral midinterval which cannot properly be referred to by either member of an antonym pair. This is reflected by the entailment relations holding between utterances containing gradable antonyms (cf. Lyons 1977: 272; Lang 1995: 33): x is a entails x is not b, and x is b entails x is not a but x is not a does not entail x is b, and x is not b does not entail x is a. For example, x is short entails that x is not long, and x is long entails that x is not short, but x is not short does not
entail that x is long, nor does x is not long entail that x is short. Something which is not long may but need not be short; it may also be average. This means that, when L(a) and L(b) are gradable antonyms, (i) x is a and x is b cannot both be true (?The house is big and small) and hence it is not possible that a and b are both the case; (ii) x is a and x is b can both be false (The house is neither big nor small) and hence it is possible that neither a nor b is the case. Summarising what has been said so far, there are two conditions which two lexical items L(a) and L(b) must fulfil to qualify as complementaries or gradable antonyms. The first of these is relevant to complementaries as well as gradable antonyms:

Condition I (complementarity and gradable antonymy):
a and b cannot both be the case. (x is a and x is b cannot both be true.)
Additionally, two lexical items L(a) and L(b) must fulfil Condition IIa to qualify as complementaries and Condition IIb to be classifiable as gradable antonyms:

Condition IIa (complementarity):
It is not possible that neither a nor b is the case. (x is a and x is b cannot both be false.)

Condition IIb (gradable antonymy):
It is possible that neither a nor b is the case. (x is a and x is b can both be false.)
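Conditions I, IIa and IIb amount to constraints on which truth-value combinations of x is a and x is b can co-occur. This can be made concrete in a minimal sketch; the encoding as a set of boolean pairs and the function name are invented for illustration.

```python
# Toy sketch: classify a pair of opposites by the truth-value
# combinations of "x is a" / "x is b" that can co-occur.
def classify(possible):
    """possible: set of (a_true, b_true) pairs that can co-occur."""
    if (True, True) in possible:
        return "not opposites"       # violates Condition I
    if (False, False) in possible:
        return "gradable antonymy"   # Condition IIb: both may be false
    return "complementarity"         # Condition IIa: one must hold

# open/shut: a door is either open or shut
print(classify({(True, False), (False, True)}))
# long/short: something may be neither (the neutral mid-interval)
print(classify({(True, False), (False, True), (False, False)}))
```

The first call returns "complementarity", the second "gradable antonymy", reproducing the open-shut vs. long-short contrast described above.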
The following section presents verb pairs from the domain of speech act verbs which fulfil conditions (I) and (II) in an untypical way. The cases discussed are selected from the inventory of antonym pairs in the Handbuch deutscher Kommunikationsverben (cf. Proost 2007a), which lists 18 antonymous groups of German speech act verbs. Each group corresponds to a lexical field constituted by verbs lexicalising the same concept, as demonstrated for the field {huldigen, ehren, würdigen, honorieren} ({honour}). For each of these groups, there is also a corresponding group containing verbs lexicalising one or more features involving opposite values. For example, verbs like tadeln, kritisieren and verurteilen (criticise, fault and condemn) lexicalise features such as ‘S considers: P bad’ and ‘S wants: H recognise: S considers: P bad’, which are opposites of the features ‘S considers: P good’ and ‘S wants: H recognise: S considers: P good’ lexicalised by verbs like huldigen, würdigen, ehren and honorieren. Hence, {huldigen, würdigen, ehren, honorieren} and {tadeln, kritisieren, bemängeln, …} are antonymous groups: any verb of one group is an antonym of any verb belonging to the other. None of the 18 antonymous groups appeared to fulfil the conditions on gradable antonymy and complementarity (I)–(II) in a straightforward fashion.
The examples discussed in the next section are representative of the different kinds of problem which apparently antonymous speech act verbs pose for the traditional definitions of gradable antonymy and complementarity.
4. Antonymy relations of speech act verbs
4.1 Some apparently typical examples
Example 1: Pairs of verbs like zustimmen-abstreiten (agree (on the truth of something)-deny)

The first examples of antonymous speech act verbs are pairs of verbs like zustimmen and abstreiten (agree and deny). Verbs like zustimmen (agree) are used to refer to situations in which a speaker tells a hearer that s/he takes something (an action, event or state of affairs) to be true and intends that the hearer recognises this, the speaker’s utterance being a reaction to a previous utterance of H stating the truth of something (P). The situation type referred to by verbs like zustimmen may be represented as in Table 1 below. Other verbs which may also be used to refer to this kind of situation are beipflichten and bestätigen (assent and confirm). Verbs like abstreiten (deny) differ from those like zustimmen (agree) in that they are used to refer to situations in which a speaker tells a hearer that s/he takes something (an action, event or state of affairs) not to be true and intends the hearer to recognise this; S’s utterance is a reaction to a previous statement from H that P is true. The situation referred to by verbs like abstreiten may be represented as in Table 1. The same type of situation may also be referred to by bestreiten, verneinen and leugnen and by the English verbs negate and disagree.

Table 1. Resource situation type referred to by zustimmen vs. abstreiten (agree (that something is the case) vs. deny)

Characterisation of      Verbs
resource situation type  zustimmen, beipflichten,          abstreiten, bestreiten,
                         bestätigen, bejahen               verneinen, leugnen
                         (agree, assent, confirm)          (deny, negate, disagree)
Propositional Content    Information Content: P            Information Content: P
Event Type (P)           Not specified                     Not specified
Temporal Reference (P)   Not specified                     Not specified
Agent (P)                Not specified                     Not specified
Attitude of S to P       S takes to be true: P             S takes to be true: not P
Speaker Intention        S wants: H recognise:             S wants: H recognise:
                         S takes to be true: P             S takes to be true: not P
Utterance                Reactive                          Reactive

Since verbs like abstreiten differ from those like zustimmen only in that the propositional content embedded in the propositional attitude and in the speaker’s intention is negated, they may be regarded as antonymous groups: any verb of one group is an antonym of any verb belonging to the other. Pairs like zustimmen-abstreiten seem to be relatively straightforward cases of antonymy: since X agrees on Y and X denies Y cannot both be true, the general condition on antonymy (Condition I) is fulfilled. However, the question of exactly what type of antonymy is exemplified by pairs like zustimmen-abstreiten cannot easily be answered. On the one hand, agree (that something is the case) and deny appear to be complementaries, because their meanings incorporate those of the adjectives true and false respectively, which are clear cases of complementaries. On the other hand, the fact that it is possible for a speaker neither to agree on something nor to deny it indicates that pairs like agree-deny fulfil Condition IIb on gradable antonymy. However, it is only possible for a speaker neither to agree on Y nor to deny Y if s/he either does not say anything or performs a completely different speech act. This means that pairs like zustimmen-abstreiten on the one hand and typical examples of gradable antonyms like long-short on the other fulfil Condition IIb on gradable antonymy in different ways. If something is neither long nor short, it is still being estimated as being of a certain length. If, by contrast, a speaker neither agrees on Y nor denies Y, we are no longer dealing with the dimension of assertion, i.e. the dimension of expressing that something is true or not true. No such change of dimension is involved in the antonymy of long and short, good and bad, big and small etc.
The change of dimension evoked when someone neither agrees on something nor denies it is likely to be due to the fact that antonyms like agree and deny are not gradable. Though intensifications or attenuations of epistemic attitudes may be expressed in actual language use (for example by utterances such as X is more/less true than Y), there are no speech act verbs which lexicalise the grading of epistemic attitudes, at least not in German. What may be graded is not the epistemic attitude itself but the propositional content it embeds, i.e. P or not-P. Speech act verbs which lexicalise the epistemic attitude ‘S takes to be true: rather P’ are lacking in German, but the epistemic attitude ‘S takes to be true: rather not-P’ is lexicalised by anzweifeln and bezweifeln (both doubt) and by einräumen and einlenken (concede and admit).

[Footnote: Cruse, for example, points out that “… while definitions of sense relations in terms of logical properties such as entailment are convenient, they are also partially misleading as a picture of the way natural language functions. This is because complementarity (for instance) is to some extent a matter of degree” (Cruse 1986: 200).]
Example 2: Pairs of verbs like huldigen-tadeln (honour-criticise)

As pointed out before, verbs like huldigen (honour) are used to refer to situations in which a speaker tells a hearer that s/he evaluates a past action of that hearer or some third person positively, the speaker’s intention being that the hearer recognise this. Verbs like tadeln (criticise) appear to be antonyms of verbs like huldigen in that they differ from these only with respect to the type of evaluation they lexicalise: while huldigen expresses a positive evaluation of S to P, tadeln expresses a negative one. Other verbs expressing the same combination of speaker attitudes as tadeln are rügen, rüffeln, kritisieren, beanstanden, bemängeln, monieren, missbilligen, verurteilen, anprangern and schelten. In addition to criticise, the corresponding class of English verbs includes fault, deplore, condemn and denounce. The opposite evaluations expressed by verbs like huldigen on the one hand and verbs like tadeln on the other are embedded in the propositional attitude and the speaker’s intention that they lexicalise (Table 2).

Though the opposition of huldigen and tadeln reflects the more basic opposition of good and bad, which are gradable antonyms, it is not at all clear whether the gradability of good and bad is in fact inherited by honour and criticise. If we say that a speaker honours someone more than someone else, what is being graded is the content of the propositional attitude (‘P good’) or the intensity with which the speaker expresses his/her positive evaluation rather than the propositional attitude itself (‘S considers: P good’).

Table 2. Resource situation type referred to by huldigen vs. tadeln (honour vs. criticise)

Characterisation of      Verbs
resource situation type  würdigen, honorieren,             tadeln, rügen, rüffeln, kritisieren,
                         huldigen, ehren (honour)          beanstanden, bemängeln, monieren,
                                                           missbilligen, verurteilen, anprangern,
                                                           schelten (criticise, fault, deplore,
                                                           condemn, denounce)
Propositional Content    Information Content: P            Information Content: P
Event Type (P)           Action                            Action
Temporal Reference (P)   [+Past]                           [+Past]
Agent (P)                H or 3rd Person                   H or 3rd Person
Attitude of S to P       S considers: P good               S considers: P bad
Speaker Intention        S wants: H recognise:             S wants: H recognise:
                         S considers: P good               S considers: P bad
Presuppositions of S     P is the case                     P is the case
According to Condition I, huldigen and tadeln (as well as honour and criticise) are antonyms, because X honours Y and X criticises Y cannot both be true. However, here again, it is not easy to decide whether huldigen and tadeln are complementaries or gradable antonyms. To the extent that the meanings of these verbs incorporate the opposition of good and bad, which are clear-cut examples of gradable antonyms, they appear to be instances of gradable antonymy rather than complementarity. Pairs like honour-criticise do fulfil Condition IIb on gradable antonymy: X honours Y and X criticises Y can both be false. However, these utterances can both be false only when X either does not do anything or performs a completely different speech act. Cases like these no longer relate to the dimension of the linguistic expression of evaluations. Typical examples of gradable antonyms like good-bad do not involve any such changes of dimension: if something is considered neither good nor bad, it may still be evaluated as being average, which means that it may still be considered relative to the dimension of evaluation. On the whole, we may conclude that pairs like agree-deny and those like honour-criticise show some properties of gradable antonyms and/or complementaries but do not fit in exactly with the way in which either one of the corresponding relations has traditionally been defined.
4.2 A less typical example
Example: Pairs of verbs like huldigen-vorwerfen (honour-reproach)

We have been concerned so far with antonym pairs whose members differ only with respect to the propositional attitude and the speaker’s intention they lexicalise. These components both involve negation (e.g. agree-deny: ‘S takes to be true: P’ vs. ‘S takes to be true: not-P’; honour-criticise: ‘P good’ vs. ‘P bad’ – good is ‘not bad’ and vice versa). Verbs differing only in those components of their meaning which involve negation may be said to differ only minimally. There are a relatively large number of speech act verbs which differ not only in meaning components involving negation but also with respect to other features. Examples are pairs consisting of a verb like huldigen (honour) and a verb like vorwerfen (reproach). Verbs like reproach are used to refer to situations in which a speaker tells a hearer that s/he disapproves of a past action performed by that hearer. Other verbs which may be used to refer to the same type of situation are vorhalten and zurechtweisen. Apart from reproach, the corresponding English field also contains the verbs rebuke, reprove, reprimand and admonish. The resource situation type referred to by verbs like vorwerfen may be represented as in Table 3:
Table 3. Resource situation type referred to by huldigen vs. vorwerfen (honour vs. reproach)

Characterisation of      Verbs
resource situation type  würdigen, honorieren,             vorwerfen, vorhalten,
                         huldigen, ehren (honour)          zurechtweisen (reproach,
                                                           rebuke, reprove, reprimand,
                                                           admonish)
Propositional Content    Information Content: P            Information Content: P
Event Type (P)           Action                            Action
Temporal Reference (P)   [+Past]                           [+Past]
Agent (P)                H or 3rd Person                   H
Attitude of S to P       S considers: P good               S considers: P bad
Speaker Intention        S wants: H recognise:             S wants: H recognise:
                         S considers: P good               S considers: P bad
Presuppositions of S     P is the case                     P is the case
As may be seen in Table 3, verbs like vorwerfen differ from those like huldigen not only in the propositional attitude and in the speaker’s intention they lexicalise (the features involving negation), but also with respect to the Agent of P. While huldigen is used to refer to speech acts whereby a speaker expresses a positive evaluation of an action by either the hearer or some third person, vorwerfen is used to refer to speech acts whereby a speaker expresses a negative evaluation of an action performed by that hearer only. Since huldigen and vorwerfen differ not only in those features of their meaning which involve negation but also in an additional one, they do not differ only minimally. Typical complementaries or gradable antonyms differ only minimally, i.e. with respect to only one aspect of their meaning. Since the speaker intention expressed by speech act verbs always varies along with the propositional attitude they express, antonymous speech act verbs should be regarded as differing only minimally if they differ only with respect to these two components. Since pairs like huldigen-vorwerfen differ with respect to more than just these features, they are not typical examples of antonymy. It is in fact questionable whether pairs like these should be regarded as antonyms at all. Once we accept that antonyms differ more than only minimally, it is not clear how tolerant we should be. Any decision on how many components are allowed to be different would be completely arbitrary. Cases of verb pairs like huldigen-vorwerfen (honour-reproach) will be referred to as “antonyms in the broad sense of the term” to set them apart from verb pairs like huldigen-tadeln (honour-criticise), which are “antonyms in the narrow sense”.
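The narrow/broad distinction can be illustrated with a small difference-counting sketch. The frame representation below is an invented toy, not the Handbuch’s format; following the remark above, the propositional attitude and the speaker intention are collapsed into a single component before counting.

```python
# Toy sketch with an invented frame representation. Because the speaker
# intention always varies along with the propositional attitude, the
# two are counted as a single component when measuring how minimally
# two verbs differ.
BUNDLED = {"attitude of S to P", "speaker intention"}

def opposition_type(frame_a, frame_b):
    diffs = {k for k in set(frame_a) | set(frame_b)
             if frame_a.get(k) != frame_b.get(k)}
    if BUNDLED <= diffs:  # collapse the co-varying pair
        diffs = (diffs - BUNDLED) | {"attitude+intention"}
    if len(diffs) == 1:
        return "antonyms in the narrow sense"
    return "antonyms in the broad sense"

huldigen = {"agent": "H or 3rd Person",
            "attitude of S to P": "P good",
            "speaker intention": "H recognise: P good"}
tadeln = {"agent": "H or 3rd Person",
          "attitude of S to P": "P bad",
          "speaker intention": "H recognise: P bad"}
vorwerfen = {"agent": "H",
             "attitude of S to P": "P bad",
             "speaker intention": "H recognise: P bad"}

print(opposition_type(huldigen, tadeln))     # narrow: one bundled difference
print(opposition_type(huldigen, vorwerfen))  # broad: Agent of P differs too
```

On these toy frames, huldigen-tadeln comes out as a narrow-sense pair (one collapsed difference) and huldigen-vorwerfen as a broad-sense pair (the Agent of P differs in addition), matching Tables 2 and 3.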
Table 4. Resource situation type referred to by jubeln vs. poltern (rejoice vs. thunder)

Characterisation of      Verbs
resource situation type  jubeln, jubilieren, frohlocken,   poltern (thunder, storm,
                         jauchzen, zujubeln                bluster)
                         (rejoice, exult, cheer)
Propositional Content    Information Content: P            Information Content: P
Event Type (P)           Not specified                     Not specified
Temporal Reference (P)   [+Past]                           [+Past]
Agent (P)                Not specified                     Not specified
Attitude of S to P       S feels: joy because of P         S feels: anger because of P
Speaker Intention        S wants: H recognise:             S wants: H recognise:
                         S feels: joy because of P         S feels: anger because of P
Presuppositions of S     P is the case                     P is the case
Manner of speaking       Emphatically                      Emphatically
4.3 A special case
Example: Pairs of verbs such as jubeln-poltern

Verbs like jubeln (rejoice) are used to refer to situations in which a speaker expresses that s/he feels joy because of something (P) and intends the hearer to recognise this. The resource situation type referred to by verbs like jubeln may be represented as in Table 4. Other verbs which may be used to refer to this type of situation are jubilieren, frohlocken and jauchzen. The corresponding English field includes the verbs rejoice, exult and cheer. If emotions like joy and anger may be regarded as opposites, a verb like poltern (thunder, storm, bluster) may be considered an antonym of verbs like rejoice when used in its sense as a speech act verb. Poltern and verbs like rejoice differ only with respect to the propositional attitude and the speaker’s intention that they lexicalise: while rejoice lexicalises the propositional attitude ‘S feels: joy because of P’ and the corresponding intention that the hearer recognise this, poltern expresses the propositional attitude ‘S feels: anger because of P’ and the corresponding intention that the hearer recognise this (Table 4). Since poltern and verbs like jubeln differ only with respect to the propositional attitude and the speaker’s intention that they express (the features involving negation), they differ only minimally. For this reason, they may be considered antonyms in the narrow sense. Since X rejoices and X thunders cannot both be true, pairs like rejoice-thunder (jubeln-poltern) fulfil the first Condition on Antonymy. Though these pairs are clearly antonyms, it is more difficult to decide exactly what type of antonymy they
instantiate. To the extent that X rejoices and X thunders can both be false, pairs like rejoice-thunder appear to be gradable antonyms. However, X rejoices and X thunders can both be false only when the speaker either does not do anything or performs a completely different speech act. Cases like those no longer concern the dimension of the expression of joy or anger. Typical cases of gradable antonymy such as good-bad, long-short and big-small do not involve any such change of dimension. This means that pairs like rejoice-thunder on the one hand and those like long-short on the other fulfil Condition IIb on gradable antonymy in different ways. Postulating a relation of gradable antonymy between poltern and verbs like jubeln is also problematic, because these verbs do not denote different sections of a single scale but rather refer to a particular section of two different scales (the joy-scale and the anger-scale respectively).
4.4 Word-internal oppositeness of meaning
Example: Evaluations expressed by boast

A last example of an antonymy relation from the domain of speech act verbs concerns verbs like angeben (boast). Boast is used to refer to situations in which a speaker expresses the fact that s/he evaluates one of his/her own past actions or one of his/her own qualities (or those of someone associated with him/her) positively with the intention that the hearer not only recognise this but also adopt the speaker’s positive evaluation. The resource situation type referred to by boast may be represented as follows:

Propositional Content:    Information Content: P
Event Type (P):           Action/State
Temporal Reference (P):   [+Past]
Agent (P):                Speaker
Attitude of S to P:       S considers: P good
Speaker Intention:        S wants: H recognise: S considers: P good
                          S wants: H consider: P good
Presuppositions of S:     P is the case
Verbs like boast not only express a positive evaluation by the speaker of the resource situation but also an additional evaluation by another speaker reporting on the act of self-praise of the first one. Following Barwise/Perry, the reporting situation is referred to as the “discourse situation” (cf. Barwise/Perry 1983: 32–39). Both types of situation have the same inventory of situational roles: a speaker (S), a hearer (H) and an utterance with a propositional content (P) (see Figure 7).
Antonymy relations
Figure 7. The inventory of situational roles of the resource and the discourse situation (SRS: Resource Situation Speaker, PRS: Proposition of the utterance made by SRS, HRS: Resource Situation Hearer, SDS: Discourse Situation Speaker, PDS: Proposition of the utterance made by SDS, HDS: Discourse Situation Hearer) (from Harras et al. 2004: 10)
By using a descriptive verb like boast, the discourse situation speaker indicates that s/he considers the positive evaluation expressed by the resource situation speaker to be exaggerated. The evaluation of the resource situation speaker’s act of self-praise as being exaggerated is a negative evaluation. This means that boast lexicalises two opposite evaluations: a positive one by the speaker of the resource situation (the reported situation) and a negative one by the speaker of the discourse situation (the reporting situation). The presence of two opposite evaluations in the meaning of a single word represents a special case of oppositeness of meaning also exemplified by synonyms of boast such as prahlen, protzen, aufschneiden, sich brüsten and sich aufspielen (boast and brag), by verbs like verklären (glorify) and by verbs like beschönigen, schönreden and schönfärben (whitewash). The type of oppositeness of meaning exemplified by these verbs differs from gradable antonymy and complementarity in that it is not a relation holding between separate lexical items. It is also different from the kind of word-internal oppositeness of meaning which Lutzeier has called “Gegensinn” (cf. Lutzeier 2007: xvii). While the latter concerns cases of meaning contrast between the different senses of polysemous words, the type of oppositeness of meaning instantiated by verbs like boast relates to only one of the senses of these words.
5. Conclusion
The discussion of the examples presented has shown that, for the lexical domain of German speech act verbs, antonymous speech act verbs typically differ with respect to at least two components of their meaning: the speaker’s propositional attitude and the speaker’s intention. Speech act verbs differing only with respect to these two components are antonyms in the narrow sense of the word. Speech act verbs differing in more than these two features are antonyms in a broader sense. Most of the antonymy relations of speech act verbs are in fact of this rather loose kind. Of the antonymous groups belonging to the domain of speech act verbs, only three represent cases of antonymy in the narrow sense. These cases have been discussed in this paper: verbs like zustimmen (agree) vs. those such as abstreiten (deny), verbs like huldigen (honour) vs. those such as tadeln (criticise), and verbs like jubeln (rejoice) vs. those such as poltern (thunder). Being cases of antonymy in the narrow sense, these types of pairs of speech act verbs are candidates for gradable antonymy or complementarity. All three of them fulfil Condition I, the general condition on antonymy (“x is a and x is b cannot both be true”). None of them fulfils Condition IIa (“x is a and x is b cannot both be false”) and hence none of the pairs of antonymous speech act verbs discussed seems to be a candidate for complementarity. To the extent that all instances of the three types discussed do fulfil Condition IIb (“x is a and x is b can both be false”), they are all candidates for gradable antonymy. They do not turn out, however, to be typical examples of gradable antonymy, for the following reasons: i. They are not obviously gradable, because it is not clear with respect to exactly what component of their meaning they are gradable and hence they do not fulfil Condition IIb on gradable antonymy in the way typical examples of gradable antonymy do. 
(This is true of pairs of verbs like zustimmen-abstreiten, huldigen-tadeln and jubeln-poltern.) ii. Some of them do not relate to a single scale. (This is true of pairs of verbs like jubeln-poltern.) For these reasons, it may be desirable if the twofold distinction which is traditionally drawn between complementarity and gradable antonymy were replaced by the distinction between complementarity and contrariety. The latter has been defined as a relation of meaning contrast between two lexical items fulfilling Condition IIb (cf. Lang 1995). Contrariety could then be taken to subsume not only cases of gradable antonymy but also those of antonymy relations between lexical items whose gradability is questionable.
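The truth-conditional tests behind Conditions I, IIa and IIb can be made explicit in a few lines of code. The sketch below is not part of the chapter; the sets of admissible truth-value combinations assigned to each example pair are illustrative assumptions:

```python
# Hedged sketch of the Conditions on antonymy as truth-conditional tests.
# 'possible' is the set of (a, b) truth-value combinations that can hold
# simultaneously for "x is a" / "x is b" (illustrative assumptions).

def classify(possible):
    cond_i = (True, True) not in possible       # Condition I: not both true
    cond_iia = (False, False) not in possible   # Condition IIa: not both false
    cond_iib = (False, False) in possible       # Condition IIb: can both be false
    if not cond_i:
        return "no opposition"
    if cond_iia:
        return "complementaries"
    if cond_iib:
        return "contraries"

# dead-alive: exactly one of the two must hold
print(classify({(True, False), (False, True)}))                  # complementaries
# long-short: both may fail (e.g. medium length)
print(classify({(True, False), (False, True), (False, False)}))  # contraries
```

On this reading, pairs like jubeln-poltern pass Conditions I and IIb but, as the discussion above notes, fulfil IIb differently from long-short, which motivates subsuming them under the broader label of contrariety.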
References

Barwise, Jon and Perry, John. 1983. Situations and Attitudes. Cambridge, MA: MIT Press.
Cappelli, Gloria. 2007. I reckon I know how Leonardo da Vinci must have felt. Epistemicity, Evidentiality and English Verbs of Cognitive Attitude. Pari: Pari Publishing.
Cruse, D. Alan. 1979. “Reversives.” Linguistics 17: 957–966.
Cruse, D. Alan. 1986. Lexical Semantics. Cambridge: Cambridge University Press.
Cruse, D. Alan. 2002. “Paradigmatic Relations of Exclusion and Opposition II: Reversivity.” In Lexicology: An International Handbook on the Nature and Structure of Words and Vocabularies. Handbücher zur Sprach- und Kommunikationswissenschaft 21.1, vol. I, D. Alan Cruse, Franz Hundsnurscher, Michael Job and Peter Rolf Lutzeier (eds), 507–510. Berlin/New York: de Gruyter.
Harras, Gisela. 1994. “Unsere Kommunikationswelt in einer geordneten Liste von Wörtern: Zur Konzeption einer erklärenden Synonymik kommunikativer Ausdrücke des Deutschen.” In The World in a List of Words, Werner Hüllen (ed.), 33–41. Tübingen: Niemeyer.
Harras, Gisela. 1995. “Eine Möglichkeit der kontrastiven Analyse von Kommunikationsverben.” In Von der Allgegenwart der Lexikologie: Kontrastive Lexikologie als Vorstufe zur zweisprachigen Lexikographie. Akten des internationalen Werkstattgesprächs zur kontrastiven Lexikologie, 29.–30.10.1994, Kopenhagen, Hans-Peder Kromann and Annelise Kjaer (eds), 103–113. Tübingen: Niemeyer.
Harras, Gisela and Winkler, Edeltraud. 1994. “A Model for Describing Speech Act Verbs: The Semantic Base of a Polyfunctional Dictionary.” In Proceedings of the 6th Euralex Conference, Willy Martin et al. (eds), 440–448. Amsterdam: Vrije Universiteit.
Harras, Gisela, Winkler, Edeltraud, Erb, Sabine and Proost, Kristel. 2004. Handbuch deutscher Kommunikationsverben. Teil I: Wörterbuch. Schriften des Instituts für Deutsche Sprache 10.1. Berlin/New York: de Gruyter.
Harras, Gisela, Proost, Kristel and Winkler, Edeltraud. 2007. Handbuch deutscher Kommunikationsverben.
Teil II: Lexikalische Strukturen. Schriften des Instituts für Deutsche Sprache 10.2. Berlin/New York: de Gruyter.
Hofmann, Thomas R. 1993. Realms of Meaning: An Introduction to Semantics. London/New York: Longman.
Lang, Ewald. 1995. “Das Spektrum der Antonymie.” In Die Ordnung der Wörter: Kognitive und lexikalische Strukturen. Institut für Deutsche Sprache Jahrbuch 1993, Gisela Harras (ed.), 30–98. Berlin/New York: de Gruyter.
Lehrer, Adrienne. 2002. “Paradigmatic Relations of Exclusion and Opposition I: Gradable Antonymy and Complementarity.” In Lexicology: An International Handbook on the Nature and Structure of Words and Vocabularies. Handbücher zur Sprach- und Kommunikationswissenschaft 21.1, vol. I, D. Alan Cruse, Franz Hundsnurscher, Michael Job and Peter Rolf Lutzeier (eds), 498–507. Berlin/New York: de Gruyter.
Löbner, Sebastian. 2003. Semantik: Eine Einführung. de Gruyter Studienbuch. Berlin/New York: de Gruyter.
Lutzeier, Peter Rolf. 2007. Wörterbuch des Gegensinns im Deutschen. Band I: A–G. Berlin/New York: de Gruyter.
Lyons, John. 1977. Semantics. Vol. I. Cambridge: Cambridge University Press.
Mettinger, Arthur. 1994. Aspects of Semantic Opposition in English. Oxford: Clarendon Press.
Proost, Kristel. 2006. “Speech Act Verbs.” In Encyclopedia of Language and Linguistics. 2nd ed. Vol. XI, Keith Brown (ed.), 651–656. Oxford: Elsevier.
Proost, Kristel. 2007a. “Gegensatzrelationen von Sprechaktverben.” In Handbuch deutscher Kommunikationsverben. Teil II: Lexikalische Strukturen, Gisela Harras, Kristel Proost and Edeltraud Winkler (eds), 367–397. Berlin/New York: de Gruyter.
Proost, Kristel. 2007b. Conceptual Structure in Lexical Items: The Lexicalisation of Communication Concepts in English, German and Dutch. Pragmatics and Beyond. New Series 168. Amsterdam/Philadelphia: John Benjamins.
Verschueren, Jef. 1980. On Speech Act Verbs. Pragmatics and Beyond 4. Amsterdam: John Benjamins.
Verschueren, Jef. 1985. What People Say they do with Words: Prolegomena to an Empirical-Conceptual Approach to Linguistic Action. Advances in Discourse Processes 14. Norwood, NJ: Ablex Publishing Company.
Wierzbicka, Anna. 1987. English Speech Act Verbs: A Semantic Dictionary. Sydney: Academic Press.
Winkler, Edeltraud. 1996. “Kommunikationskonzepte und Kommunikationsverben.” In Bedeutung, Konzepte, Bedeutungskonzepte: Theorie und Anwendung in Linguistik und Psychologie, Joachim Grabowski, Gisela Harras and Theo Herrmann (eds), 195–229. Opladen: Westdeutscher Verlag.
An empiricist’s view of the ontology of lexical-semantic relations Cyril Belica, Holger Keibel, Marc Kupietz and Rainer Perkuhn
The investigation of lexical-semantic relations serves the subject’s organisation of the experiential world, not the discovery of an objective ontological reality.
Taking a usage-based perspective, lexical-semantic relations and other aspects of lexical meaning are characterised as emerging from language use. At the same time, they shape language use and therefore become manifest in corpus data. This paper discusses how this mutual influence can be taken into account in the study of these relations. An empirically driven methodology is proposed that is, as an initial step, based on self-organising clustering of comprehensive collocation profiles. Several examples demonstrate how this methodology may guide linguists in explicating implicit knowledge of complex semantic structures. Although these example analyses are conducted for written German, the overall methodology is language-independent.
1. Introduction
The term theory of meaning appears to be loaded with a number of implicit assumptions or mental images such that it is rather difficult to discuss theories of meaning without routinely invoking these assumptions. Possibly the most problematic one concerns the representation of meaning, and we would like to confront this assumption with the following somewhat provocative questions: (1) Is a formal system for the representation of meaning in fact already a theory of meaning? (2) If, hypothetically, in the future, our confidence in the symbolic

[Footnote: This motto is freely adapted from von Glasersfeld (1996: 162): “Cognition serves the subject’s organisation of the experiential world, not the discovery of an objective ontological reality”.]
representability of meaning eroded beyond a reasonable limit, what would the expression theory of meaning then denote? (3) Is it defensible to investigate empirically how the meaning of a word arises, without stipulating the feasibility of its (symbolic) representation? The line of research presented here attempts precisely that: to separate the question of how lexical meaning and lexical-semantic relations arise from the question of whether and how they are to be represented. Nevertheless, this paper does not formulate any precise answers to any of these questions but only points out possible directions from which answers may be approached. Two clarifications should be made from the outset. First, the paper does not discuss lexical semantics from a lexicological or lexicographic perspective; instead it takes a pronounced empiricist perspective, focusing on the ontology of lexical-semantic relations. Second, it is important to discern two notions of ontology which Gruber (2009) characterises as follows: The term “ontology” comes from the field of philosophy that is concerned with the study of being or existence. In philosophy, one can talk about an ontology1 [index numbers added] as a theory of the nature of existence […]. In computer and information science, ontology2 is a technical term denoting an artifact that is designed for a purpose, which is to enable the modelling of knowledge about some domain, real or imagined.
It is, of course, this first notion of ontology that the present paper is primarily concerned with. Nevertheless, both readings may be applied to lexical-semantic relations, and throughout this paper, we refer to either reading by using the same index numbers as in the above quotation. The second, technical reading of ontology refers to a formal representation which captures – in a theory-dependent way – the systematic aspects of lexical semantics which can be thought of as a coarse-grained structure. It is generally acknowledged, however, that there are also less systematic aspects in lexical semantics which constitute a more fine-grained structure. To accommodate these aspects as well, a fundamentally different representation is typically added on top of the ontology2. In this case, the ontology2 constitutes a first rough approximation of the lexical-semantic structure, while the additional representation corrects for its minor local deficiencies. With or without such a fine-tuning, ontologies2 as models of lexical semantics have proved to be of great descriptive and functional value: they are very useful

[Footnote: Throughout this paper, we use the term word in a pre-interpretative and theory-independent way, referring to a broad and flexible notion according to which any desired set of string configurations can be construed as a word.]
in many computer-linguistic applications (in terms of performance results), and they are also the model of choice for many lexicographic requirements. They do not, however, seem to have much explanatory value in capturing the nature of lexical semantics; that is, ontologies2 provide little insight into the ontology1 of lexical semantics. From the perspective advocated in this paper, it is rather the less systematic fine-grained structure that pervades language and therefore comes first, whereas more systematic coarse-grained structures emerge out of it. According to this view, the fine-grained structure and the coarse-grained structure are inextricably interwoven – much in contrast to an ontology2 and a corresponding correcting mechanism where the two structures remain separate modelling approaches. We elaborate on these ideas later on. Like any other empirical research aiming for explanatory, evidence-based theories, the present approach is faced with an explanatory gap between the level of accessible data and corresponding analytical methods on the one hand, and the theoretical level with hypotheses about general principles on the other. Although this gap may never be closed entirely, it is possible to decrease it if research at both levels proceeds hand in glove: concrete observations at the data level may be generalised inductively to more abstract structures, which may in turn inspire the researcher to abductively formulate new hypotheses at the theoretical level. At the same time, any hypotheses at the theoretical level may be deductively translated into specific predictions which can be tested against the data by means of falsification. Abduction and falsification thus connect both levels across the gap. It is useful to keep this gap in mind as we repeatedly move back and forth between the two levels throughout the paper. 
When explanatory theories are the goal, it is good scientific practice to start out with as few assumptions at the theoretical level as possible and to make these assumptions explicit. Moreover, assumptions should be chosen conservatively and only where necessary and, more importantly, they should be formulated in such a way that they can in principle be falsified empirically. For this reason, we consider it best to abandon the following assumptions which seem to implicitly underlie much theoretical work, also in a broader context beyond semantics. These involve the scientific beliefs (i) that language competence is a self-contained object of investigation; (ii) that language can be properly captured as a formal system; (iii) that complete theories are a research goal worth pursuing; (iv) that descriptive modelling of meaning entails explanatory power; and (v) that decomposition is an applicable explanatory principle with respect to language. Each of these assumptions may in the future turn out to be true, but there are serious concerns about their plausibility and at the same time very little, if any, evidence supporting them. While our doubts concerning (iv) were briefly illustrated above, the
other assumptions in this list are discussed elsewhere (e.g. Kupietz/Keibel 2009; Keibel/Kupietz 2009). With respect to lexical semantics, the rejected assumption (v) may have the greatest relevance, as there are decompositional and structural aspects in most existing theories of lexical meaning. For instance, it is present in any theory describing meaning in terms of general semantic features which are required to be primitive, necessary and sufficient. In our view, such theories are overly influenced by the needs of technical operationalisation, and by the valid observation that the complex relations between categories of being (e.g. physical objects, properties, events, propositions) may at least in part display some neatly hierarchical and (de)compositional structure which may be captured in terms of technical ontologies2. It is therefore tempting for a linguist to assume that this also holds for lexical meaning. This assumption would be absolutely adequate if meaning were symbolic and denotational by nature. But this is, at best, an open research question, and without strong evidence in its support, it seems advisable not to make this assumption a priori. In the next section, we outline an alternative view according to which meaning is in principle connotational and context-dependent. This view does not rest on strong assumptions and, importantly, it is not inconsistent with the notion that lexical meaning displays denotative, hierarchical and compositional phenomena. On the contrary, we are convinced that such phenomena exist, but in our view, they arise out of the connotational structure.
2. An empiricist’s view of lexical meaning

2.1 The emergent nature of lexical meaning
As the primary linguistic assumption underlying the present work, we adopt an emergentist perspective on language which is a cornerstone of usage-based linguistics. According to this perspective, the lexicon and grammar cannot be separated, and what we call lexicon and grammar are emergent language structures that are constantly shaped by the experience of language use (cf. Hopper 1998; Bybee 1998; Barlow/Kemmer 2000; Bybee/Hopper 2001). In constructing and understanding utterances and sentences, speakers draw heavily on the language productions they have previously encountered. Some of the choices they make or experience repeatedly may become entrenched, i.e. routinised. Different

[Footnote: From this perspective, it is, of course, not meaningful to treat lexical semantics as a special field of semantics. When we speak of lexical semantics nonetheless, it is simply in order to focus on phenomena that fall within the scope of this book.]
speakers have a different history of language experience and are therefore likely to develop different language routines. The language system, then, may be defined as what the routines of most speakers of the respective language community have in common, constituting implicit conventions across speakers. This emergentist perspective on language is well-motivated by a number of empirical observations (cf. the work cited above). Crucially, this emergence does not only concern language evolution and language change but the very ontological1 status of language in every single moment: language structures are viewed “as always provisional, always negotiable, and in fact, as epiphenomenal” (Hopper 1998: 157), and “lability between form and meaning [is seen] as a constant and as a natural situation” (ibid.). The emergent dynamics in adult language may be interpreted as a consequence and continuation of the processes ascribed to children by usage-based accounts of first language acquisition (cf. Tomasello 2003; Elman et al. 1996; MacWhinney 1999). These general statements, if true, have immediate consequences for the nature of lexical semantics. Traditionally, words were seen as playing a rather passive role in language production and comprehension. For instance, as Hoey (2005: 1f.) notes, it was widely assumed that a speaker first decides on the grammatical and the semantic structure before selecting specific words to fill the grammatical slots and to instantiate the semantic concepts defined by these structures. Consequently, lexical meaning was considered an essentially static property of words such that a word can be chosen by its meaning in production, and the meaning of a word can be retrieved in comprehension. 
This picture changes drastically with an emergentist perspective: words do not have a fixed meaning associated with them, but instead they “evoke” meaning (Paradis 2008) or “provide clues to meaning” (Elman 2004: 301), and this meaning is negotiated in discourse and crucially depends on the specific context in which the word is used (cf., inter alia, Firth 1935/1957a; Langacker 1987; Sinclair 1998; Elman 2004; Paradis 2008). Thus, in contrast to the traditional view, words and lexical semantics are thought of as playing a very active role in language comprehension and production. It is important to note that, if language structures do indeed constantly emerge as a side effect of language use, and if language use is in turn influenced by the cognitive language structures of individual speakers, language is not only an emergent but ultimately also an autopoietic system, characterised by self-creation and self-organisation. In a recent paper on the ontology1 of signs, Kravchenko (2003: 180) argues that within an autopoietic system, information “is understood as being constructed and co-dependent rather than instructional and referential”. With respect to lexical semantics we may therefore summarise that (i) meaning is construed, “constructed” and “co-dependent”; (ii) meaning is connotational rather than denotational, and (iii) meaning arises only in context. Ultimately,
meaning is use in context. A theory of meaning is therefore a theory of connotation-sensitive use in context.
2.2 Types of context
Meaning is use in context, and there are at least two different types of context that may be relevant. In our terminology, the first type of context is local or collocational context which concerns the immediate series of words in which a word is used. This is often termed cotext (e.g. Sinclair 1991). The underlying assumption is that for speakers, each word is syntagmatically associated with a number of other words with which they form collocations – i.e. “semi-preconstructed phrases which constitute single choices” (Sinclair 1991: 110) and which range from fixed idioms to fairly flexible phraseological preferences – and that different collocations tend to give rise to different meanings (e.g. Firth 1951/1957b; Sinclair 1998; Partington 1998; Yarowsky 1993). It is important to note that the term collocation here refers to a psychological phenomenon. However, because they are part of a speaker’s language routines, at least the more common psychological collocations are likely to have correlates in corpora, taking the form of statistical collocations (cf. Hoey 2005; Partington 1998). With respect to theories of meaning, it is the psychological collocations that are of interest, and it is only because these cannot be investigated directly (being part of speakers’ implicit language knowledge), that we analyse statistical collocations instead. This amounts to the hypothesis that psychological collocations can be uncovered, at least in principle, from large quantities of performance data, provided that these data are adequately sampled. The second type of context is global or situational context and concerns various aspects of the situation in which a text or utterance is produced or understood, including factors such as the discourse domain, the topic, or the speaker’s mood. Just as for local contexts, it has been observed repeatedly that these situational factors have an influence on (lexical) meaning (e.g. Firth 1951/1957b; Hoey 2005; Gale/Church/Yarowsky 1992). 
As a continuous result of experience, words become associated with a number of global contexts – or more precisely, with generalisations of global contexts – and different global contexts give rise to different meanings. Again, these associations are first and foremost a psychological phenomenon. However, we claim that these psychological associations, too, have corpus correlates and, moreover, that these correlates can be uncovered in a large number of recurrent local contexts. Nevertheless, it is again the psychological associations that this paper is primarily concerned with. While the corresponding hypothesis for collocations is fairly straightforward and frequently (albeit often implicitly) made in corpus-based work, the present claim is a strong one and,
at least to our knowledge, it has not previously been stated and explored in the literature. In the remainder of this paper, we will gradually elaborate on this claim and illustrate its potential in several examples of written German. It is worth pointing out that even though this line of research attempts to uncover global contexts from local contexts, both types of context are logically independent. The same word combination (i.e. local context) may evoke very different meanings when used in different global contexts. A second point is that the distinction between local and global context should not be confused with the distinction between fine-grained and coarse-grained semantic structures which was described in the introduction, and to which we will return shortly.
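The hypothesis that psychological collocations leave statistical traces in corpus data can be operationalised with standard association measures. The following sketch is not the authors' actual method; the toy corpus, the window size and the choice of pointwise mutual information (PMI) as the measure are all assumptions made for illustration:

```python
# Hypothetical sketch: uncovering statistical collocations from performance
# data via pointwise mutual information over a small co-occurrence window.

import math
from collections import Counter

corpus = [
    "strong tea tastes good",
    "he drinks strong tea",
    "powerful computer runs fast",
    "a powerful computer",
]

window = 2  # co-occurrence window in tokens (an assumption)
word_freq = Counter()
pair_freq = Counter()
total = 0
for sentence in corpus:
    tokens = sentence.split()
    word_freq.update(tokens)
    total += len(tokens)
    for i, w in enumerate(tokens):
        for v in tokens[i + 1 : i + 1 + window]:
            pair_freq[tuple(sorted((w, v)))] += 1

def pmi(w, v):
    """log2 of observed vs. chance co-occurrence; high values suggest a collocation."""
    p_wv = pair_freq[tuple(sorted((w, v)))] / total
    return math.log2(p_wv / ((word_freq[w] / total) * (word_freq[v] / total)))

print(round(pmi("strong", "tea"), 2))  # → 2.91
```

On realistic corpora one would use a lemmatised, adequately sampled corpus and a less sparse-data-sensitive measure (e.g. log-likelihood), but the principle — ranking word pairs by how far their co-occurrence exceeds chance — is the same.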
2.3 Similarity
Context matters but contexts are never perfectly identical. Therefore, positing that words become psychologically associated with local and global contexts entails the assumption that human speakers are sensitive to similarities and, moreover, that they continuously note similarities in their environment, even if they are not aware of it. This assumption is obviously valid, for without such sensitivity to similarities, previous experience could not be generalised and utilised in future behaviour. Thus, the ability to recognise similarities – or, more precisely, the inability to ignore similarities – is vital for the organisation of our experiential world, it plays a key role in all learning processes and it is, in particular, a pre requisite for inducing local and global contexts. In short, similarities are a driving force behind the emergent lexical semantics and other emergent language structures. But it is likely that they are not the only one. A range of phenomena can be observed in lexical semantics that are best described as abstract structures which probably cannot be explained by similarities alone. There also appear to be some systematising processes at work. This motivates the following working conjecture: there are at least two types of cognitive mechanism underlying learning and generalising in humans, with a tight, nondecomposable entanglement between these mechanisms (cf. Vachková/Belica 2009; Kupietz/Keibel 2009; Keibel/Kupietz 2009). The first type of mechanism, subsumed as shorthand under the label spontaneous-associative agent, is triggered spontaneously by similarities in perception, and it in turn triggers cognitive processes such as association, analogy, and “compulsive” preference. Importantly, it responds to similarities in both local and global contexts. The second type of mechanism, subsumed under the label classificatory assessment agent, may be characterised as follows. It attempts to systematise – i.e. 
to identify apparent regularities in its stimuli – resulting in
symbolic generalisations. These take the form of abstract categories and relations between them, and the agent arrives at such generalisations in a reflexive-iterative manner. In this way, it develops a posteriori categorial models, including, among others, models of language. Humans obviously have this systematising capacity – without it there would be no theories, no scientific modelling, and so forth. The claim here is that we make use of this capacity most of the time, sometimes intentionally but in general mechanically, without realising it. Both agents pertain to general cognition rather than specifically to language, but the second one may be more alert to some types of stimulus (including, among others, language stimuli) than to others. It is likely that, despite their hypothesised entanglement, the spontaneous-associative agent strictly precedes the classificatory assessment agent with respect to phylogenesis, ontogenesis, and cognitive processing (in particular, language processing). In any case, both agents are highly relevant in cognitive generalising. Together, they shape the language behaviour of individuals and therefore the emergent conventions of the language community. In our view of lexical semantics, it is the associative agent that spontaneously notes and processes the fine-grained structure described in the introduction, whereas the emergent symbolic generalisations produced and processed by the classificatory agent pertain to the level of coarse-grained structure. In other words, the perceived approximate denotation arises only at the emergent level of the classificatory agent. The spontaneous-associative processes by which the fine-grained structure arises from experienced language use may be thought of as natural processes of crystallisation where repeated encounters may result in the creation of a crystal nucleus and subsequently in a gradual growth. 
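At the corpus level, the similarities to which the spontaneous-associative agent is hypothesised to respond can be approximated by comparing words' collocation profiles. A minimal sketch follows; the profiles are invented toy counts, not data from the authors' corpora, and cosine similarity is only one of several possible measures:

```python
# Hypothetical sketch: comparing two words' collocation profiles
# (collocate -> co-occurrence count) by cosine similarity. Words whose
# profiles point in similar directions are used in similar local contexts.

import math

profile_jubeln = {"Fans": 10, "Sieg": 7, "laut": 5}          # toy counts
profile_jauchzen = {"Fans": 3, "Sieg": 2, "laut": 4, "Freude": 6}

def cosine(p, q):
    shared = set(p) & set(q)
    dot = sum(p[w] * q[w] for w in shared)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q)

print(round(cosine(profile_jubeln, profile_jauchzen), 2))  # → 0.6
```

Clustering words by such profile similarities is one way to let coarser groupings emerge from fine-grained usage data rather than imposing them a priori.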
If there happen to be abstract regularities in this dynamic, fine-grained crystal structure, the associative agent by itself would be entirely ignorant of them. It is the classificatory agent that takes notice of them and infers rule-like relations in terms of abstract categories. The crystal structure therefore is a prerequisite for any systematising-classificatory processes to take place, such that the coarse-grained structure grows out of the fine-grained structure. At the same time, the categories and relations of this coarse-grained structure provide additional stimuli to the associative agent and in turn influence the subsequent development of the fine-grained structure. It is by this ongoing mutual influence that both agents and both structures are

[Footnote: We distinguish these two types of cognitive mechanism only at a functional level and make no assumptions about the underlying neural structures or processes. The functional distinction between the two agents, however, explicitly does not relate to different levels of description, but to different properties of the described object (cf. Smolensky 1988).]
An empiricist’s view of the ontology of lexical-semantic relations 123
intrinsically entangled and cannot be adequately described in terms of compositional modules. The systematic structures emerging in this interaction may be very complex and abstract, and they may involve hierarchical and compositional phenomena. They comprise, for instance, the senses of a word that lexicographers would list in a dictionary. This account of systematicity as emerging out of the seemingly chaotic associative structure is both plausible and attractive because systematicity can potentially be explained, and not simply assumed as in compositional, denotational descriptions of lexical semantics (cf. Elman 2004 for related considerations). Nevertheless, the coarse-grained structure is only the tip of the iceberg while most of this metaphorical iceberg is made of fine-grained structure. With a good deal of oversimplification one might argue that, while in the past there has been a tendency to view the fine-grained structure below the sea level as a nuisance or even as irrelevant (e.g. by focusing on language competence and banishing all fine-grained phenomena to a list of exceptions), this structure is nowadays more widely acknowledged as linguistically relevant. However, there now is a tendency to describe it with the same types of abstract category that have proved to work well for the coarse-grained structure, albeit with more specialised, idiosyncratic categories. An adequate description of the iceberg as a whole will not be possible unless some conceptual devices are developed that are suitable for capturing the essence of the fine-grained structure. We shall now return to the question of representation. Due to the entanglement of the two agents and of the structures they produce, meaning is never fully captured if only the symbolic, categorial structure is considered. Attempting to do so means attempting to explicate similarities between linguistic phenomena in terms of identities, e.g. 
by means of necessary and sufficient conditions, and will therefore necessarily result in a loss of information. This loss of information takes place not only when meaning is represented symbolically, but also in sense disambiguation, in formulating lexicographic paraphrases, and so forth. In lexicographic work, there are very good reasons for representing meaning as in these examples, but it is important to be aware that there will always be some loss of information and that the lost information cannot be restored. The implication for explanatory approaches to lexical meaning is that a way of representing meaning based on similarity rather than identity needs to be developed.

[Footnote: In this respect, it appears to us that the increasing popularity of concepts and methods borrowed from computational linguistics in linguistics proper is not always a blessing. Without doubt, linguistic description benefits enormously from the possibilities offered by computational linguistics, but, as a side effect, linguists may be seduced into operationalising and representing all phenomena symbolically.]

124 Cyril Belica, Holger Keibel, Marc Kupietz and Rainer Perkuhn
2.4 Denotation and beyond
Irrespective of one's theoretical and methodological stance, there nowadays seems to be general agreement that two words which appear to have the same signifié at the coarse-grained level may, at the fine-grained level, still display subtle differences with respect to connotations, style, implications, expressed attitude, collocational behaviour (i.e. local context), etc. Collocational behaviour is traditionally understood as capturing non-compositional, idiomatic aspects. We are convinced, however, that all the subtle aspects listed above become statistically manifest in collocational behaviour, that is, in observable co-occurrences. This is the guiding principle underlying the two earlier claims with respect to local and global context. The next two sections will describe possible ways in which these claims may be put into action.
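The claim that subtle differences become "statistically manifest in observable co-occurrences" can be made concrete with any standard word-association score. The sketch below computes pointwise mutual information (PMI) over a toy token list; PMI is chosen here purely as a familiar illustration and is not the association measure used by the authors' algorithm.

```python
import math
from collections import Counter

def pmi(tokens, w1, w2, window=3):
    """Pointwise mutual information of w1 and w2 co-occurring within a window.

    High positive values indicate that the two words co-occur far more often
    than their individual frequencies alone would predict.
    """
    n = len(tokens)
    counts = Counter(tokens)
    # count occurrences of w1 that have w2 somewhere in the surrounding window
    pair = sum(1 for i, t in enumerate(tokens)
               if t == w1 and w2 in tokens[max(0, i - window):i + window + 1])
    if pair == 0:
        return float("-inf")
    return math.log2((pair / n) / ((counts[w1] / n) * (counts[w2] / n)))

tokens = "strong tea with milk strong tea again weak coffee".split()
print(pmi(tokens, "strong", "tea"))     # clearly positive: a strong collocation
print(pmi(tokens, "strong", "coffee"))  # -inf: never observed together
```

On this toy data, *strong tea* scores log2(4.5) ≈ 2.17, whereas *strong coffee*, never observed within the window, scores negative infinity.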
3. Capturing local context

3.1 Higher-order collocations and syntagmatic patterns
In order to identify significantly recurring local contexts, i.e. statistical collocations, as corpus correlates of psychological collocations, we take advantage of an extended iterative collocation algorithm (Belica 1995). The higher-order collocations that it induces from a corpus are word combinations that are potentially non-contiguous (in contrast to n-grams, e.g. Manning/Schütze 2001), and the serial order of these words as well as the word distances between them may vary (cf. Keibel/Kupietz/Belica 2008). This flexibility is important with respect to psychological reality because positionally restricted n-grams are likely to capture only a small fraction of the lexical patterns that are good candidates for psychological collocations. The recurrent local contexts underlying the higher-order collocations form hierarchical clusters that reflect the relations between increasingly complex collocations. Due to their positional flexibility, higher-order collocations alone do not always readily elicit the psychological collocations of competent speakers. Therefore, they are listed together with a corresponding syntagmatic pattern which summarises the predominant use of the collocation in a kind of wild-card expression that presents the collocates in their typical word order and, for improved legibility, together with inserted filler words that were observed to occur in that position at a certain rate. To illustrate these concepts, the following examples show three realistic higher-order collocations (a) together with their predominant syntagmatic pattern (b). The higher-order collocations list the collocates in the serial order reflecting their statistical dependencies. That is, in example (1), kommt is a primary collocate of the node word vorbei, and mehr is a secondary collocate of the pair vorbei kommt. In the corresponding syntagmatic pattern, the node word and its collocates are highlighted, and the observed gaps and the most frequently observed filler words are displayed in shades of grey, with lighter fonts representing fillers which are less frequently observed. Square brackets mark optional gaps, and the vertical bar signals alternatives (logical or). Example (2) in particular illustrates that higher-order collocations and syntagmatic patterns may capture long-distance relations and that these relations may provide strong cues to meaning that are easily missed by other notions of collocation.

(1) a. vorbei: kommt mehr
    b. kommt [… keiner|niemand|nicht] mehr […] vorbei
(2) a. vergewissern: wirklich
    b. sich zu vergewissern daß|ob|dass … auch wirklich
(3) a. versprechen: viel gehalten
    b. viel […] versprochen [und|aber nichts|wenig] gehalten

[Footnote: Several proposals for similarity-based representations of lexical meaning have been offered, of course, by the different branches of prototype theory (for reviews see Kleiber 1990/1993; Taylor 1995). However, these proposals are not applicable for present purposes, because they still focus on denotative and categorial meaning, albeit with a modified notion of category and of category membership. The kind of representation needed in an explanatory approach will involve similarities both at the fine-grained and the coarse-grained level of lexical meaning.]

[Footnote: The concept of higher-order collocations was recently rediscovered as the very similar concept of concgrams (Cheng/Greaves/Warren 2006).]
Although extensive investigations of higher-order collocations and syntagmatic patterns have so far only been conducted for written German (cf. Section 3.3), the algorithm is language-independent. The larger the available corpus, the more fine-grained collocation structures can be uncovered. Some examples for English are given in the next section.
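The iterative logic just described — first find a primary collocate of the node word, then a secondary collocate of the resulting pair, and so on — can be sketched in a few lines. This is a deliberately naive reconstruction for illustration only (raw frequency instead of a proper association measure, and only first occurrences are considered); it is not Belica's published algorithm.

```python
from collections import Counter

def cooccurrences(sentences, node_words, window=5):
    """Count words co-occurring with all node_words within a window.

    The node words themselves may be non-contiguous and in any order,
    mirroring the positional flexibility of higher-order collocations.
    """
    counts = Counter()
    for tokens in sentences:
        positions = [[i for i, t in enumerate(tokens) if t == w] for w in node_words]
        if not all(positions):
            continue  # some node word is missing from this sentence
        anchor = positions[0][0]
        # naive: only the first occurrence of each node word is considered
        if any(abs(ps[0] - anchor) > window for ps in positions):
            continue
        for t in tokens[max(0, anchor - window):anchor + window + 1]:
            if t not in node_words:
                counts[t] += 1
    return counts

def higher_order(sentences, node, depth=2, window=5):
    """Iteratively extend a node word with its strongest co-occurring word."""
    combo = [node]
    for _ in range(depth):
        counts = cooccurrences(sentences, combo, window)
        if not counts:
            break
        combo.append(counts.most_common(1)[0][0])
    return combo
```

On three toy sentences echoing example (1) — "da kommt keiner mehr vorbei", "hier kommt niemand mehr vorbei", "er kommt morgen vorbei" — `higher_order(sents, "vorbei")` yields `['vorbei', 'kommt', 'mehr']`, i.e. the collocation *vorbei: kommt mehr*.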
[Footnote: All examples are taken from the CCDB, described in Section 3.3.]
Figure 1. Collocation profile of the word machen (to make, do, render, …); only the top portion is shown
3.2 Collocation profile
For sufficiently frequent words, a corpus of decent size will usually contain evidence for a large number of typical local contexts. As a naming convention, the full spectrum of typical local contexts (higher-order collocations together with the corresponding syntagmatic patterns and related characteristics) associated with a word is referred to as its collocation profile. This profile captures both the fine-grained nuances and the more coarse-grained aspects of the word's use, including clues to global context such as the register of the discourse in which the meaning of the word is being negotiated. Figure 1 shows the profile of the German verb machen (to make, do, render, …) in tabular form, with each row corresponding to one typical local context of this word, inferred from the underlying corpus. The central column lists these local contexts in the form of higher-order collocations, and the corresponding predominant syntagmatic patterns are given in the right-hand column. Note that Figure 1 gives only the most typical local contexts of machen; depending on corpus size, word frequency, and other factors, collocation profiles are usually much longer. Readers familiar with the German language will easily verify that the various local contexts in the figure give rise to a broad range of different connotations. To demonstrate that the methodology is language-independent, Figure 2 presents a fragment of the collocation profile of the word why, inferred from a relatively small self-compiled web-based corpus of business English (2.5 million running words). Even for this small corpus, the range of typical local contexts corresponds to the intuitions of competent speakers.
3.3 Collecting collocation profiles
To recap, in order to detect significantly recurring local contexts of a word, a large number of occurrences is needed as a data basis. For this purpose, corpus data can be used directly, and this is essentially what was stated in the first claim (cf. 2.2). The second claim, slightly reformulated, states that for detecting typical global contexts, a large number of collocation profiles is needed as a data basis. In order to have fast access to many such profiles, we set up a collocation database, the CCDB (online since 2001; Belica 2001–2007; Keibel/Belica 2007). The CCDB currently comprises collocation profiles for more than 220,000 words. These profiles are based on a virtual corpus with approximately 2.2 billion running words, and this corpus is a subset of the Mannheim German Reference

[Footnote: Example taken from the CCDB (cf. Section 3.3).]
Figure 2. Collocation profile of the word why; only the top portion is shown
Corpus (DeReKo), the largest corpus archive of contemporary written German (IDS 2007). All subsequent analyses are based on the collocation profiles in the CCDB. The CCDB serves as a transparent lab for several lines of research in the emergentist framework. Its principal motivation is to aid the formation of explanatory theories in the light of the explanatory gap that was discussed above. The general methodology underlying all these lines of research aims to discover structure in the high-dimensional space that is spanned by the collocation profiles and the
similarities between them. In addition to its primary purpose, the CCDB has proved useful for other applications as well, for example in descriptive linguistics and second language education.
4. Capturing global context

4.1 General methodological considerations
Having introduced the CCDB, we can turn to how global contexts may be inferred from collocation profiles. The core idea is twofold: (i) global contexts leave traces in collocation profiles; (ii) these traces do not become manifest by themselves; rather, it is the distinguishing aspects between them that may be observed in collocation profiles. To uncover these distinguishing aspects for a given word, we exploit the fact that the word tends to share different aspects with different sets of other words. In more technical terms, the collocation profiles may be conceived of as spanning a high-dimensional space, and generally, the immediate neighbourhood of the given word – or rather, of its profile – will be occupied by many other words. Consider Figure 3. A word w – or more precisely, its collocation profile – is best understood not as a point in the high-dimensional space, but rather as some extensive region, the shape of which may be very complex and contorted,
Figure 3. Schematic illustration of a hypothetical local situation in the space around a given word (dark grey)
consisting of multiple subregions that need not even be connected, that is, they may be conceived of as islands and peninsulas. In the figure, the given word w (dark grey) is depicted as having four such subregions. It is in these subregions that the distinguishing aspects associated with particular global contexts are located. Unfortunately, they cannot be identified directly because the space is far too high in dimensionality and too complex and opaque in its structure. However, there are other words in this space, and it is likely that each subregion of w has some of these words in its immediate neighbourhood. Of course, these words occupy extensive and oddly shaped regions too (light grey), such that they may overlap with the subregions of w and – crucially – also with each other. Therefore, if we are able to identify a set of words that overlap to some degree with w and to a high degree with each other, much of this mutual overlap will tend to take place in a particular subregion of w. This insight is used for roughly carving out the various subregions of w and thereby for deriving clues to the typical global contexts of w with which these subregions are associated. This approach, which we describe in greater detail below, works best, of course, in those areas of the space where the language under investigation provides a rich lexis and thus many words that may potentially display a complex structure of mutual overlap. As a prerequisite for studying this structure, we need to be able to compare multiple collocation profiles simultaneously, and in order to do that, we first need to look at the comparison of any two such profiles.
4.2 Comparison of two collocation profiles
Ultimately, assessing the degree of overlap between collocation profiles is a question of similarity. Inspecting the profiles of many words, one quickly observes that similar words – i.e. words that are used in similar ways – tend to share portions of their respective collocation profiles. A reverse reasoning leads to the hypothesis that similarities in two collocation profiles reflect similarities in the use of the respective words, including semantic similarities (cf. Keibel/Belica 2007). The message of this hypothesis is: words that look similar in terms of their collocation profiles are in fact similar in terms of language use. This hypothesis and the underlying line of reasoning are fairly common in quantitative linguistics. What is unique about the current approach is that it is based on very rich and complex collocation profiles using higher-order collocations instead of contiguous n-grams.

[Footnote: The two-dimensional cartoon-like illustration in Figure 3 is an extreme simplification of authentic situations in the full-scale space.]
Figure 4. Overlapping collocations in the profiles of the verbs grinsen (to grin) and lächeln (to smile)
To illustrate the observation and the reverse reasoning, consider Figure 4, which provides a simple example of two German words and their collocation profiles. The profiles are in fact much longer; only their top portion is shown here. These two words were selected by us on the basis of our linguistic intuition as competent speakers of German because we expected them to behave in a very similar way semantically. It turns out that, indeed, the two profiles have many collocations in common: overlapping collocations are highlighted in the figure, and there are many more in the lower portions not displayed here. This is the perspective of the original observation. By contrast, readers of this paper who are not familiar with German are in a position to apply the reverse reasoning: given the substantial overlap between the two profiles, they may infer that the two corresponding words behave in a fairly similar way semantically. Visual inspection is obviously not enough. Formally, the degree of overlap between profiles is quantified by a complex mathematical measure that takes into account not only the amount of overlap but also the degree of cohesion of the overlapping collocates. A good way to verify that the hypothesis works well and that the particular measure we use here operationalises a plausible notion of similarity is to determine the words whose profiles are most similar to that of a given word. Figure 5 (left) provides a simple example, listing the words with profiles most similar to that of Hindi (Hindi). All words on this list are lexicalisations designating either specific natural
Similar to Hindi: 1. Chinesisch, 2. Englisch, 3. Spanisch, 4. Türkisch, 5. Urdu, 6. Portugiesisch, 7. Japanisch, 8. Arabisch, 9. Italienisch, 10. Landessprache, 11. Polnisch, 12. Französisch, 13. Muttersprache, 14. Griechisch, 15. Hebräisch, 16. Ungarisch, 17. Amtssprache, 18. Tschechisch, 19. Russisch, 20. Niederländisch

Similar to Charakteristikum: 1. Merkmal, 2. Eigenheit, 3. Eigenschaft, 4. Eigenart, 5. Ausprägung, 6. Charakteristik, 7. Anliegen, 8. Element, 9. Besonderheit, 10. Charaktereigenschaft, 11. Ausformung, 12. Stilelement, 13. Kriterium, 14. Charakterzug, 15. Stilmittel, 16. Parameter, 17. Charakter, 18. Eigentümlichkeit, 19. Ereignis, 20. Spielart

Figure 5. Words with profiles similar to that of Hindi (Hindi) (left) or Charakteristikum (characteristic, feature) (right)
languages or language domains. Not only is the plausibility of this list – which was derived fully automatically – confirmed by the intuition of competent speakers, but such speakers can also easily expand the list by a few words through introspection. This task would be far more difficult in the case of Charakteristikum (characteristic, feature) (Figure 5, right): competent speakers would probably still confirm the similarity of this word to the other words in the list, but they would have to sit and think for a while to actively expand this list. The reason for this is that, in contrast to the previous example, the similarity structure around Charakteristikum appears to be far more complex, involving various kinds of similarity relation. The method appears to be sensitive to this broad range of similarity relations, and our empirical approach to lexical meaning depends entirely on this sensitivity: it means that the formal measure, together with the collocation profiles on which it operates, not only implements a plausible notion of similarity of words in use, but also yields similar words well beyond an individual's introspection. In the remainder of this paper, whenever we assume or discuss the similarity of any words, we refer to their similarity in terms of collocation profiles, and thus ultimately in terms of language use.
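The measure itself is not spelled out in closed form in this paper. As a hedged stand-in, the toy sketch below represents a profile as a mapping from collocates to association scores and combines a cosine over the shared collocates with the relative size of the overlap. It illustrates the general idea only; the actual CCDB measure additionally weights the cohesion of the overlapping collocates.

```python
import math

def profile_similarity(p, q):
    """Toy similarity of two collocation profiles (dicts: collocate -> score)."""
    shared = set(p) & set(q)
    if not shared:
        return 0.0
    dot = sum(p[w] * q[w] for w in shared)
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    overlap = len(shared) / min(len(p), len(q))  # reward large shared portions
    return (dot / norm) * overlap

def most_similar(target, profiles, k=5):
    """Words ranked by profile similarity to `target` (cf. Figure 5)."""
    others = [w for w in profiles if w != target]
    return sorted(others,
                  key=lambda w: profile_similarity(profiles[target], profiles[w]),
                  reverse=True)[:k]
```

With invented miniature profiles for grinsen, lächeln, and rennen (the scores are made up for the example), `most_similar("grinsen", profiles, k=1)` returns `["lächeln"]`, mirroring the overlap shown in Figure 4.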
4.3 Comparison of multiple collocation profiles
The simultaneous comparison of multiple profiles is a far more challenging task. Having inspected many different profiles, one eventually observes that the similarities between these profiles give rise to a network of interrelations that is highly complex and seemingly contradictory. But it is not the similarity measure that causes this complexity and these contradictions; it is rather the complex lexical semantics. This is the point where we inevitably run into the peculiarities of fine-grained structure. In contrast to taxonomic structures (Kuhn 1991/2000: 104), fine-grained lexical semantics is not as hierarchical, not as well or neatly structured. The relations we are dealing with are defined by fuzzy similarity, and not by identity. For example, the concepts of all types of animal may be captured nicely by some hierarchical ontology, but the corresponding lexicalisations cannot – at least when connotation is taken into account. But how can this complexity be coped with? We propose a strategy consisting of three steps. First, taking up the general idea outlined in 4.1, the spatial region occupied by a word – the focus word – is explored indirectly by investigating the complex similarity structure between the words in its neighbourhood. To this end, we visualise this complex structure using self-organising methodologies. Currently, we primarily apply self-organising lexical feature maps (SOMs; Kohonen 1984; for details see Perkuhn/Keibel 2009). The second step consists of a discourse-based semiotic interpretation of the emergent signs in the SOMs resulting from step 1. That is, linguists or, even better, other competent speakers scan the SOMs for clues to global contexts, and they do this entirely on the basis of their individual discourse experience. In the third step, an attempt is made to map these interpretations onto existing systemic linguistic categories.
Alternatively, observations in step 2 for a wide range of words may, in step 3, motivate the revision of existing categories, or the formulation of entirely new ones. Crucially, step 1 is fully automatic and objective. In step 2, then, we switch hardware, from the computer to the mind. Thus, in this step, subjective interpretations and judgements are involved. However, only individual discourse experience is used in these interpretations: linguistic categories do not come into play until step 3. A more comprehensive description of this three-step strategy can be found in Vachková/Belica (2009). The strategy is illustrated in the next section for the case of two particular relations: polysemy and synonymy.
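Step 1 of this strategy relies on Kohonen-style self-organising maps. A minimal, dependency-free SOM training loop is sketched below; grid size and the learning-rate and radius schedules are illustrative guesses, and the real maps are of course trained on the high-dimensional CCDB profile vectors, not on toy data.

```python
import random

def train_som(vectors, rows=4, cols=4, epochs=60, lr=0.3, seed=0):
    """Fit a rows x cols grid of prototype vectors to the input vectors."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    grid = {(r, c): [rng.random() for _ in range(dim)]
            for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        # neighbourhood radius and learning rate both shrink over time
        radius = max(1.0, (rows + cols) / 2 * (1 - epoch / epochs))
        h = lr * (1 - epoch / epochs)
        for v in vectors:
            # best-matching unit: the prototype closest to v
            bmu = min(grid, key=lambda u: sum((a - b) ** 2
                                              for a, b in zip(grid[u], v)))
            for u, w in grid.items():
                if abs(u[0] - bmu[0]) + abs(u[1] - bmu[1]) <= radius:
                    grid[u] = [a + h * (b - a) for a, b in zip(w, v)]
    return grid

def place(grid, v):
    """Map a vector onto its best-matching grid cell."""
    return min(grid, key=lambda u: sum((a - b) ** 2 for a, b in zip(grid[u], v)))
```

After training, words with similar profile vectors are placed in nearby cells, so that topological proximity in the grid reflects similarity of use – the property exploited in the Figures discussed below.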
4.4 Example analyses
As a first example, consider the SOM for the focus word Quark (quark) (Figure 6). This SOM displays the words that are most similar to Quark, and they are arranged in such a way that greater proximity in the SOM reflects greater similarity of the underlying profiles. Generating the SOM is all there is to be done in step 1. What is immediately striking in this SOM – as an observation in step 2 – is the fact that there are two completely isolated areas which apparently relate to two distinct global contexts of the word Quark. While one of them is a physics context, relating to the sense of Quark as an elementary particle, the other one is a food context, relating to the sense of Quark as a food item, or more precisely, as a dairy product. Interestingly, the superordinate terms by which one might want to
Figure 6. SOM for Quark (quark)
label the two areas (Elementarteilchen and Milchprodukt; elementary particle and dairy product, respectively) are already listed among the words in the SOM, but this is a coincidence and not generally the case. In step 3, an attempt to map these interpretations onto linguistic categories would probably lead us to note that this is an instance of homonymy. A more interesting example is provided by the SOM for the focus word Glas (glass), shown in Figure 7. In step 2, to look for clues to global contexts means to look for clusters of words that intuitively appear to have something in common that may be ascribed to a global context. In this example, we find more distinct global contexts than before (labelled in Figure 7), with a gradual transition or even overlap between them, rather than sharp boundaries as in the previous
Figure 7. Annotated SOM for Glas (glass)
example. These observations and interpretations might prompt us to infer that Glas constitutes a case of polysemy (step 3). Both examples illustrate how evidence at the level of global contexts may be discerned on the basis of information at the level of local contexts – after all, it is this information that is driving the SOMs. Turning now to (near-) synonymy, consider the two focus words schwierig and schwer (Figure 8) both of which correspond in English to a complex combination of words such as difficult, complex, heavy, rich (food), serious, and awkward. In step 1, the same SOM methodology is used as before, only this time it is applied to those words that are most similar to either of the two focus words.
Figure 8. SOM for the word pair schwierig vs. schwer (see text for translations)
Additionally, the SOM now displays interpretable shading: darker greyish areas contain mostly words that are more similar to schwer, and analogously, lighter greyish areas (including white areas) contain mostly words which are used more similarly to schwierig. In step 2, the task is now not only to look for clues to global contexts but also to consider whether these global contexts are more preferred by one word than by the other. One rather broad global context more preferred by schwer concerns the domain of accidents and injuries (cf. the large dark area in the left-hand portion of Figure 8), and intuitively, the word schwer is used in such a context to discuss how severe the accident or injury is. One global context more preferred by schwierig might tentatively be labelled as tasks and puzzles (the white area), and the word schwierig is used in this context to talk about how difficult a task or puzzle is to accomplish or solve. The areas coloured in intermediate shades of grey point to global contexts in which both words can be used equally, i.e. in which each may be replaced by the other. In step 3, one might therefore decide that the relation between the two words is best described in terms of near-synonymy (e.g. Dirven/Taylor 1988; Kennedy 1991) or a related phenomenon defined as plesionymy (e.g. Storjohann 2009). An SOM reflecting the ideal and hypothetical case of absolute synonymy would show an intermediate grey throughout. The near-synonyms ohnedies and eh (anyway) constitute another interesting example (Figure 9). Note, as part of step 2, the fairly sharp contrasts between darker and lighter areas. One may infer in this step that eh is typically used in colloquial contexts, namely in situations in which one would also use words such as nix, halt, blöde, or quatschen.
By contrast, ohnedies appears to be typically used in more formal contexts, namely in situations in which one would also use words like Sicherheitslage, bemängeln, zugegebenermaßen, or oftmals. The next example is about the near-synonyms Notarztwagen and Rettungswagen (Figure 10), both of which would probably be translated into English as ambulance. This is a case where the SOM reveals almost no contrasts in shading, indicating that both words are virtually synonymous in all contexts – at least as far as these contexts are represented in the underlying corpus. However, there is a more lightly shaded area in the lower left-hand corner which suggests that in a situation which is embedded in Austrian discourse, one will tend to say Notarztwagen rather than Rettungswagen. This, of course, can only be inferred by someone whose personal discourse experience involves encounters with samples of Austrian German or at least experience with Austria as a country – a good

[Footnote: The actual SOMs for contrasting words are colour-coded using the colour scale between red and yellow (see original at http://corpora.ids-mannheim.de/ccdb/). Due to editorial restrictions, we present these SOMs here using shades of grey.]
Figure 9. SOM for the near-synonym pair ohnedies vs. eh (anyway)
example of why the interpretive second step cannot be performed by a computer. It also points to the expectation that different individuals will often infer different global contexts in step 2 of their analyses. Interestingly, however, in more systematic explorations employing this general methodology, a surprisingly high degree of agreement has been observed across the SOM interpretations by different individuals (Vachková/Belica 2009). In our final example, we leave the relation of near-synonymy and take the issue even further. Consider Figure 11 which contrasts the focus words Einsamkeit (loneliness) and Zweisamkeit (‘being two’, ‘being together’). Intuitively, there is a close semantic relation between these two words. They are not near-synonyms, but, although there seems to be some semantic contrast, they are not opposites either. As before, in step 2 it is necessary to look for clues to global contexts and
Figure 10. SOM for the near-synonym pair Notarztwagen vs. Rettungswagen (ambulance)
attempt to label them by intuitive descriptors. The dark area in the upper left-hand corner seems to refer to global contexts involving domains of love, marriage, family life, and maybe also friendship, and it seems plausible that these contexts are more preferred by Zweisamkeit. By contrast, the lighter area in the lower left-hand corner points to global contexts which are marked by some kind of psychological suffering or social difficulties. What is interesting about this example, however, is not the extremes – i.e. the areas with the darkest or lightest shading in the figure – but the transitions between them. Tracing these transitions in terms of global contexts may lead to deep insights into the semantic structure established by the use of these two words. For instance, one path of transition seems to have to do with the undesired absence or lack of something,
Figure 11. SOM for the word pair Einsamkeit (loneliness) vs. Zweisamkeit (‘being two’, ‘being together’)
another one with the personal experience of harmony. For each of these transitional contexts, specific scenarios come to mind by association in which one might want to talk about Einsamkeit or Zweisamkeit. Before concluding this section, three general remarks should be made. First, it should be stressed that no individual word displayed in one of the grid fields of an SOM matters in itself. The only individual words that matter are the given focus word(s) for which the SOM is derived – e.g. Quark in Figure 6, or the two words schwer and schwierig in the case of Figure 8. All the words displayed in the actual SOM grid only serve to explore the structure of the focus word(s) in the high-dimensional space. But this methodology does not hinge on any single one of these grid words. In other words, when applying the three-step strategy,
one should always proceed in such a way that removing a random word from the SOM grid would not affect one's interpretation in step 2. It is rather word clusters – i.e. blurred groups of words that are topologically close in the SOM (cf. Figure 7) – that prime one's associations and give rise to interpretations. Second, these clusters may sometimes happen to correspond to a single field in the SOM grid, at other times they may stretch across many such fields, and sometimes one field seems to be the intersection of several clusters. The grid is therefore not to be taken as authoritative. Essentially, its only function is to guide a linguist or other human interpreter towards explicating their implicit knowledge of semantic structure, involving both fine-grained and coarse-grained aspects. Third, an SOM for the same focus word(s) will look different when generated multiple times. The reason is that the SOM algorithm starts from a random initial constellation. However, the inferred global contexts usually remain stable across regenerated SOMs. In particular, this is true for all global contexts that we inferred in the previous examples.
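The contrastive shading read off Figures 8–11 can also be restated computationally: for each grid word, compare its similarity to the two focus words and colour the cell by the relative share. The sketch below is our own reconstruction of this idea, with a naive count of shared collocates standing in for the real profile-similarity measure, and is not the published implementation.

```python
def contrast_shade(profile, focus_a, focus_b, similarity):
    """Shade in [0, 1]: 0.0 = used like focus_a only, 1.0 = like focus_b only,
    0.5 = equally similar to both (a candidate zone for interchangeability)."""
    sa = similarity(profile, focus_a)
    sb = similarity(profile, focus_b)
    if sa + sb == 0:
        return 0.5  # no evidence either way
    return sb / (sa + sb)

def shared_collocates(p, q):
    """Naive stand-in similarity: number of collocates two profiles share."""
    return len(set(p) & set(q))
```

With invented miniature profiles for schwer (accidents and injuries) and schwierig (tasks and puzzles), a grid word profiled like accident vocabulary shades to 0.0 (the schwer pole), while a word equally close to both shades to 0.5, the intermediate grey read above as a zone of interchangeability.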
5. Conclusion
Based on a very broad and flexible notion of word, the proposed three-step approach – consisting of self-organising clustering of collocation profiles extracted from very large text corpora, followed by a discourse-based semiotic interpretation, and a deferred mapping of the emergent signs to systemic linguistic categories – is, we would argue, a coherent and adequate methodology for the explorative study of lexical-semantic relations. The methodology is sensitive to the properties of these relations, ranging from subliminal fine-grained to more salient coarse-grained properties. This three-step strategy follows a rationale similar to that formulated by Mahlberg (2005: 186): Although humans understand language, they are not able to describe all the mechanisms that lead to the creation of meaning. Much of language use is unconscious. […] For linguists […] the mismatch between our intuitions and actual use [has] more important consequences. Corpora can make up for this mismatch. However, what is objectively observable in corpora is not meaning, but first of all sequences of words that are evidence of meaning. A description of meaning will always need human interpretation, drawing on knowledge of typical behaviour in social and cultural contexts.
Meaning is a question of interpretation and therefore can only be decided subjectively by individual humans. At the same time, most knowledge of meaning is implicit knowledge that cannot be accessed directly. The strategy proposed here
142 Cyril Belica, Holger Keibel, Marc Kupietz and Rainer Perkuhn
tries to overcome this problem by starting from observable word sequences and attempting to uncover clues to aspects of meaning in an automated and objective manner. At some point, however, subjective interpretation on the basis of discourse experience has to come into play. This strategy attempts to provide clues that guide human interpreters in making explicit the fine-grained and coarse-grained structure of lexical meaning otherwise hidden in our implicit knowledge. Our experience with this approach so far confirms that the methodology indeed guides linguists in detecting the subtle distinctions of global contexts and formulating psychologically valid statements about the semantic structure of words. What drives these processes is the fact that SOMs prompt interpreters to recollect relevant fragments of their discourse experience – or, in other words, to activate latent portions of their mental lexicon. Moreover, as mentioned before, a high degree of intersubjectivity of SOM interpretations has been observed (Vachková/Belica 2009). Our tentative evaluations of the methodology so far corroborate the two claims with respect to local and global contexts. Furthermore, we found that the epistemic results obtained by applying this methodology generally have predictive and explanatory power. As a final concluding remark, we would like to encourage further interdisciplinary research into SOMs in the semiotic context, including follow-up studies in the domains of psycholinguistics and cognitive linguistics.
References
Barlow, Michael and Kemmer, Suzanne (eds). 2000. Usage-based models of language. Stanford: CSLI Publications.
Belica, Cyril. 1995. Statistische Kollokationsanalyse und Clustering. Korpuslinguistische Analysemethode. Mannheim: Institut für Deutsche Sprache. (http://www.ids-mannheim.de/kl/projekte/methoden/ka.html)
Belica, Cyril. 2001–2007. Kookkurrenzdatenbank CCDB. Mannheim: Institut für Deutsche Sprache. (http://corpora.ids-mannheim.de/ccdb/)
Bybee, Joan. 1998. "The emergent lexicon." CLS 34: The panels (Chicago Linguistics Society): 421–435.
Bybee, Joan and Hopper, Paul (eds). 2001. Frequency and the emergence of linguistic structure. Amsterdam: John Benjamins.
Cheng, Winnie, Greaves, Chris and Warren, Martin. 2006. "From n-gram to skipgram to concgram." International Journal of Corpus Linguistics 11(4): 411–433.
Dirven, René and Taylor, John R. 1988. "The conceptualization of vertical space in English: The case of tall." In Topics in Cognitive Linguistics, Brygida Rudzka-Ostyn (ed.), 379–402. Amsterdam: John Benjamins.
Elman, Jeffrey L. 2004. "An alternative view of the mental lexicon." Trends in Cognitive Sciences 8(7): 301–306.
Elman, Jeffrey L., Bates, Elizabeth A., Johnson, Mark H., Karmiloff-Smith, Annette, Parisi, Domenico and Plunkett, Kim. 1996. Rethinking innateness: A connectionist perspective on development. Cambridge, MA: MIT Press.
Firth, John R. 1957a. "The technique of semantics." In Papers in linguistics 1934–1951, John R. Firth, 7–33. London: Oxford University Press. (Original work published 1935, English studies, xvii. 1).
Firth, John R. 1957b. "Modes of meaning." In Papers in linguistics 1934–1951, John R. Firth, 190–215. London: Oxford University Press. (Original work published 1951, Essays and Studies, The English Association).
Gale, William A., Church, Kenneth W. and Yarowsky, David. 1992. "One sense per discourse." In Proceedings of the 4th DARPA Speech and Natural Language Workshop, February 23–26, Harriman, New York, 233–237. Morristown, NJ: ACL.
Glasersfeld, Ernst von. 1996. Radical Constructivism: A way of knowing and learning. London: Falmer.
Gruber, Tom. 2009. "Ontology." In Encyclopedia of Database Systems, M. Tamer Özsu and Ling Liu (eds), 1963–1965. Berlin: Springer.
Hoey, Michael. 2005. Lexical priming: A new theory of words and language. London: Routledge.
Hopper, Paul J. 1998. "Emergent Grammar." In The new psychology of language: Cognitive and functional approaches to language structure, Michael Tomasello (ed.), 155–175. Mahwah, NJ: Lawrence Erlbaum.
IDS. 2007. Deutsches Referenzkorpus / Archiv der Korpora geschriebener Gegenwartssprache 2007-I (Release vom 31.01.2007). Mannheim: Institut für Deutsche Sprache. (http://www.ids-mannheim.de/kl/projekte/korpora/)
Keibel, Holger and Belica, Cyril. 2007. "CCDB: A corpus-linguistic research and development workbench." In Proceedings of the 4th Corpus Linguistics conference, Matthew Davies, Paul Rayson, Susan Hunston and Pernilla Danielsson (eds). Birmingham: University of Birmingham. (http://corpus.bham.ac.uk/corplingproceedings07/paper/134_Paper.pdf)
Keibel, Holger, Kupietz, Marc and Belica, Cyril. 2008. "Approaching grammar: Inferring operational constituents of language use from large corpora." In Grammar & Corpora 2007: Selected contributions from the conference Grammar and Corpora, Sept. 25–27, 2007, Liblice, Czech Republic, František Šticha and Mirjam Fried (eds), 235–242. Prague: ACADEMIA.
Keibel, Holger and Kupietz, Marc. 2009. "Approaching grammar: Towards an empirical linguistic research programme." In Working Papers in Corpus-based Linguistics and Language Education, no. 3, Makoto Minegishi and Yuji Kawaguchi (eds), 61–76. Tokyo: Tokyo University of Foreign Studies. (http://cblle.tufs.ac.jp/assets/files/publications/working_papers_03/section/061-076.pdf)
Kennedy, Graeme. 1991. "Between and through: The company they keep and the functions they serve." In English Corpus Linguistics, Karin Aijmer and Bengt Altenberg (eds), 95–110. London: Longman.
Kleiber, Georges. 1993. Prototypensemantik: Eine Einführung (M. Schreiber, Trans.). Tübingen: Gunter Narr Verlag. (Original work published 1990, La sémantique du prototype: Catégories et sens lexical, Paris, Presses Universitaires de France).
Kohonen, Teuvo. 1984. Self-organization and associative memory. Berlin: Springer.
Kravchenko, Alexander V. 2003. "The ontology of signs as linguistic and non-linguistic entities: A cognitive perspective." Annual Review of Cognitive Linguistics 1: 179–191. Retrieved February 23, 2009 from http://cogprints.org/4009/.
Kuhn, Thomas S. 2000. "The road since Structure." In The road since Structure: Philosophical essays, 1970–1993, with an autobiographical interview, Thomas S. Kuhn, 90–104. Chicago: The University of Chicago Press. (Original work published 1991, PSA 1990, volume 2, East Lansing, Philosophy of Science Association).
Kupietz, Marc and Keibel, Holger. 2009. "Gebrauchsbasierte Grammatik: Statistische Regelhaftigkeit." In Deutsche Grammatik – Regeln, Normen, Sprachgebrauch (Jahrbuch des Instituts für Deutsche Sprache 2008), Marek Konopka and Bruno Strecker (eds), 33–50. Berlin, New York: de Gruyter.
Langacker, Ronald W. 1987. Foundations of cognitive grammar (vol. 1: Theoretical Prerequisites). Stanford: Stanford University Press.
MacWhinney, Brian (ed.). 1999. The Emergence of Language. Mahwah, NJ: Lawrence Erlbaum.
Mahlberg, Michaela. 2005. English general nouns: A corpus theoretical approach (Studies in Corpus Linguistics, Vol. 20). Amsterdam: John Benjamins.
Manning, Christopher D. and Schütze, Hinrich. 2001. Foundations of statistical natural language processing. Cambridge, MA: MIT Press.
Paradis, Carita. 2008. "Configurations, construals and change: Expressions of DEGREE." English Language and Linguistics 12(2): 317–343. Preprint version retrieved March 13, 2008 from http://www.vxu.se/hum/publ/cpa/degree.pdf
Partington, Alan. 1998. Patterns and meanings: Using corpora for English language research and teaching (Studies in Corpus Linguistics, Vol. 2). Amsterdam: John Benjamins.
Perkuhn, Rainer and Keibel, Holger. 2009. "A brief tutorial on using collocations for uncovering and contrasting meaning potentials of lexical items." In Working Papers in Corpus-based Linguistics and Language Education, no. 3, Makoto Minegishi and Yuji Kawaguchi (eds), 77–91. Tokyo: Tokyo University of Foreign Studies. (http://cblle.tufs.ac.jp/assets/files/publications/working_papers_03/section/077-091.pdf)
Sinclair, John. 1991. Corpus, concordance, collocation. Oxford: Oxford University Press.
Sinclair, John. 1998. "The lexical item." In Contrastive lexical semantics, Edda Weigand (ed.), 1–24. Amsterdam: John Benjamins.
Smolensky, Paul. 1988. "On the proper treatment of connectionism." Behavioral and Brain Sciences 11: 1–74.
Storjohann, Petra. 2009. "Plesionymy: A case of synonymy or contrast?" Journal of Pragmatics 41(11): 2140–2158.
Taylor, John R. 1995. Linguistic categorization: Prototypes in linguistic theory (2nd ed.). Oxford: Clarendon.
Tomasello, Michael. 2003. Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press.
Vachková, Marie and Belica, Cyril. 2009. "Self-organizing lexical feature maps: Semiotic interpretation and possible application in lexicography." Interdisciplinary Journal for Germanic Linguistics and Semiotic Analysis 13(2): 223–260.
Yarowsky, David. 1993. "One sense per collocation." In Proceedings of the ARPA Human Language Technology workshop, March 21–24, Princeton, New Jersey, 266–271. Morristown, NJ: ACL.
The consistency of sense-related items in dictionaries
Current status, proposals for modelling and applications in lexicographic practice
Carolin Müller-Spitzer
Consistency of reference structures is an important issue in lexicography and dictionary research, especially with respect to information on sense-related items. In this paper, the systematic challenges of this area (e.g. ‘non-reversed reference’, bidirectional linking being realised as unidirectional structures) will be outlined, and the problems which can be caused by these challenges for both lexicographers and dictionary users will be discussed. The paper also discusses how text-technological solutions may help to provide support for the consistency of sense-related pairings during the process of compiling a dictionary.
1. Introduction
This paper deals with problems of consistency of sense-related items in dictionaries. It focuses on paradigmatic structures with regard to some lexicographic issues and consists of four main parts. Firstly, the term consistency will be defined in order to determine more precisely its meaning in this context. Secondly, a short overview of the current status of the consistency of sense-related items in some German dictionaries such as the Duden Synonymwörterbuch (2007) or elexiko (2003ff.) will be given. Thirdly, in the main section, a concept of data-modelling for reference structures will be presented. This concept for modelling reference structures is a way of supporting consistency in general lexicographic practice. Finally, some possible applications of the proposed model will be outlined. These concern, for example, new kinds of presentation of sense-related items, provided the underlying data are structured consistently and precisely. In this paper, the emphasis is placed on lexicographic practice and text-technological questions rather than on theoretical lexicological considerations.
146 Carolin Müller-Spitzer
2. The term consistency
As a first step, the term consistency needs to be defined. Consistency is derived from the Latin word consistere, meaning 'to stand, hold or continue'. From this rather vague and broad meaning, a number of other senses have developed. One of them is a special meaning in the tradition of Aristotelian logic: In logic, a consistent theory is one that does not contain a contradiction. The lack of contradiction can be defined in either semantic or syntactic terms. The semantic definition states that a theory is consistent if it has a model; this is the sense used in traditional Aristotelian logic, although in contemporary mathematical logic the term satisfiable is used instead. The syntactic definition states that a theory is consistent if there is no formula P such that both P and its negation are provable from the axioms of the theory under its associated deductive system. (Quoted from www.wikipedia.com)
More generally, it can be summarised as ‘something characterised by a lack of contradiction’. In the context of databases, consistency refers to the lack of contradiction of data, controlled by constraints. Therefore, consistency is well defined in this special field. In addition to this, consistency has different meanings in everyday language. One of these meanings is influenced by the technical meaning described above. Merriam-Webster paraphrases this sense as “agreement or harmony of parts or features to one another or a whole” [Merriam Webster Online 2009]. For the purposes of the present paper, I will refer to the term consistency in two different ways: in the context of data modelling, the more technical sense is adopted, whereas in the context of dictionary usage, it is the more general meaning of consistency which is intended.
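The database sense of consistency – contradiction ruled out by declarative constraints rather than by editorial discipline – can be made concrete with a toy SQLite schema. The schema and the canonical-ordering convention are invented for this sketch; they are not taken from any dictionary project discussed here.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# A toy schema in which constraints rule out contradictory data:
# every synonym pairing is stored exactly once, in canonical
# alphabetical order, so the same pair cannot be recorded twice.
con.execute("""
    CREATE TABLE synonym (
        word_a TEXT NOT NULL,
        word_b TEXT NOT NULL,
        CHECK (word_a < word_b),   -- canonical order: one row per pair
        UNIQUE (word_a, word_b)    -- no duplicate pairings
    )
""")
con.execute("INSERT INTO synonym VALUES ('demand', 'require')")

# Inserting the pair again in the opposite order violates the canonical
# form and is rejected by the database itself.
try:
    con.execute("INSERT INTO synonym VALUES ('require', 'demand')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)  # → True: the contradictory row never enters the data set
```

The point of the sketch is only that, in the database sense, consistency is a property that the system enforces; the more general everyday sense used below for dictionary entries has no such automatic guarantee.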
3. Consistency of sense-related items in dictionaries: A short insight into the current status
In this section, I shall outline the situation we are currently facing in terms of consistency of sense-related items in existing German dictionaries. In the context of describing lexical-semantic relations in dictionaries, consistency has to mean that bidirectional relations, as they exist in paradigmatic relations, are always given for both reference points between which a specific relation holds. For example, if require is given as a synonym of the entry demand, then demand should also be listed as a synonym of the entry require. This is a form of consistency which is important for the underlying lexicographic data model as well as for the dictionary user.
. For a similar definition see also the entry "classical logic" in the Stanford Encyclopedia of Philosophy.
The consistency of sense-related items in dictionaries 147
Figure 1. Entries arbeitsunfähig, erwerbsunfähig and dienstunfähig from Duden: Das Synonymwörterbuch (2007)
In the following, two conventional German dictionaries of synonyms (Duden 2007 and Wahrig 2006) and elexiko, a new corpus-based dictionary of contemporary German which is being compiled at the IDS, have been chosen in order to illustrate the current situation. In Figure 1, three entries taken from Duden (2007), a dictionary of synonyms, are shown: arbeitsunfähig (unfit or unable to work), erwerbsunfähig (unable to work, incapacitated) and dienstunfähig (disabled, unfit for service). The descriptions of these three entries are semantically very close; the terms constitute a synonymous set or cluster. Nevertheless, there are striking inconsistencies. For example, in the entry arbeitsunfähig, the synonym erwerbsunfähig is missing, although arbeitsunfähig is given as a synonym of the headword erwerbsunfähig. In addition, dienstunfähig is not listed as a meaning equivalent to arbeitsunfähig, whereas in the entry dienstunfähig both arbeitsunfähig and erwerbsunfähig are listed as synonyms. It could be argued that consistency is not of particular importance here. Presumably, most lexicographers attempting to compile a reference work of synonymy aim to provide an abundance of words with similar meanings which can be substituted for each other. Their aim is not to depict theoretical lexical-semantic structures as lexicographic information (cf. also Lew 2007). However, it is argued here that, as the entry arbeitsunfähig in particular illustrates, a more consistent approach would help to provide the
dictionary user with better information. Presumably, any lexicographer would have added erwerbsunfähig as a synonym of arbeitsunfähig to this dictionary, had the incomplete listing been noticed.
Figure 2. Entries Mittelpunkt (centre), Brennpunkt (focus) and Knotenpunkt (junction) from Wahrig Synonymwörterbuch (2006)
Figure 2 shows an example taken from a different reference work, the "Wahrig Synonymwörterbuch" (2006). Again, as the example demonstrates, the three entries are semantically very similar to each other: Mittelpunkt (centre), Brennpunkt (focus) and Knotenpunkt (junction). As Cruse (2004: 156) points out, generally no one is puzzled by the content of dictionaries of synonyms. Certainly, on closer inspection, most native speakers of German would be surprised by many of the synonyms which are listed. However, it is not synonymous relations as such which are the focus of attention here. Turning again to the problem of consistency, the following questions arise: Why is only Herzstück (heart, core) listed as an equivalent of Mittelpunkt (centre), whereas in the entries Brennpunkt (focus) and Knotenpunkt (junction), both Herz (heart) and Herzstück are mentioned? Or why does the entry of Mittelpunkt only contain Nabel (navel, hub), whereas Nabel der Welt (hub of the world) is listed with Brennpunkt and Knotenpunkt? These are just two examples which pinpoint some of the issues concerning the problem of consistency. They are a good illustration of how more support in terms of general consistency would improve lexicographic results. Elexiko, which is an online resource of contemporary German, serves as a last example to illustrate a more complex and less obvious case of inconsistency. As mentioned before, elexiko is still in the process of being written (elexiko 2003 et seq.), which means that this dictionary is not a complete reference book following
an alphabetical compiling procedure.
Figure 3. Extract from the dictionary section "sense-related items" of the entry ehemalig in elexiko
In the presentation of elexiko, each entry is divided into two main parts. The first part gives sense-independent information on a lexeme (in German "lesartenübergreifende Angaben"), and the second part provides information which is bound to a specific sense of the search item (in German "lesartenbezogene Angaben", cf. Storjohann 2005: 62). Sense-independent details focus on information that applies to the entire search word and not to a specific sense, such as details of spelling, spelling variants and syllabication, morphological information, and so on. Information on a specific sense can be accessed by clicking on 'signposts', which are guidewords used for identifying senses (cf. Storjohann 2005: 68) (see, for example, the entry ehemalig and within that entry the sense 'früher' in Figure 3). The sense-related items are presented in a special dictionary section (or tab) headed "sinnverwandte Wörter" ("sense-related items"). If, for instance, the link to the synonym einstig is followed, the link target is the corresponding sense labelled 'ehemalig' which holds the synonymy relation (see Figure 4).
. Elexiko can be accessed free of charge and without registration.
Figure 4. Extract of the entry einstig in elexiko (Tab: paraphrase) (link target)
This means that it is not the dictionary part which contains general sense-independent information which is opened when a user follows a link, but the corresponding paraphrase of a specific sense to which the user is guided (as the first part of the sense-related information). However, when we go back to the entry ehemalig and look up another synonym, for instance früher, we notice that the link target is the entry früher as a whole; it does not refer to a specific sense as such. In the context of elexiko, this is incorrect because all sense-related items are to be given specifically for a semantic instantiation or context as displayed by a sense or subsense respectively. Presumably, the linking mistake occurred because the entry ehemalig was edited first, before früher, and the lexicographer had no means of systematically checking each item of the same semantic field for its link reference. For this reason, the link to the entry früher could only be given on the sense-independent level at that time because, unless a headword has been fully analysed and described, each link to another term is always given without a context and thus sense-independently. As a result, the lexicographer simply forgot to add the necessary details on link reference, from general to sense-bound information, a mistake that is bound to happen at some point when no computer support is available (cf. Blumenthal et al. 1988: 365; Engelberg/Lemnitzer 2001: 211).
. After the colloquium, the entry ehemalig was revised so that the relation between ehemalig and früher is now related to specific senses in both directions.
It is simply impossible for lexicographers to bear all specific references in mind. This explains why it is essential to have computer assistance at hand. Turning our attention to the entry früher, this is a case where, in the reverse direction, the synonym relation is connected to one specific sense. Opening the sense ehemalig in the headword früher and following the link to ehemalig, the corresponding paraphrase is again already open to the user. This is also a form of inconsistency: whereas in one direction the synonym relation is recorded as sense-related, in the other it is not. These examples show clearly that, generally, entries in various German dictionaries are organised inconsistently with respect to reference structures. Both the practical working routines of lexicographers and dictionary users would benefit from better ways of organising reference structures. However, two questions are still open. Why is consistency important here? And who is really investigating the consistency of dictionary entries apart from metalexicographic critics and someone who is writing a paper about consistency of sense-related items in dictionaries? The average user of a printed dictionary is unlikely to pay much attention to the consistency of given sense-related items. Nevertheless, keeping entries in special dictionaries like dictionaries of synonyms consistent with each other may help to avoid semantic confusion in terms of possibilities of lexical substitution in specific contexts. In electronic dictionaries, however, things look quite different. Users are more likely to follow links to sense-related items than to look them up by leafing through a printed dictionary. So if in an entry a synonym is given for a specific sense, and in the link-targeted entry this headword is not mentioned as a synonym, users are probably surprised by the lack of reverse linking.
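The kind of computer assistance called for here can be sketched in a few lines: a routine that scans all entries for synonym listings whose reverse listing is missing. The data layout (a plain mapping from headwords to synonym sets) and the entry contents are a simplification of the Duden example from above, not the actual dictionary data.

```python
def non_reversed(entries):
    """Return (headword, synonym) pairs whose reverse listing is missing."""
    missing = []
    for head, synonyms in entries.items():
        for syn in synonyms:
            if head not in entries.get(syn, set()):
                missing.append((head, syn))
    return sorted(missing)

# Simplified rendering of the Duden (2007) case discussed in Section 3:
# the entry arbeitsunfähig lists neither of the other two terms, although
# both list it (and dienstunfähig additionally lists erwerbsunfähig).
entries = {
    "arbeitsunfähig": set(),
    "erwerbsunfähig": {"arbeitsunfähig"},
    "dienstunfähig": {"arbeitsunfähig", "erwerbsunfähig"},
}
for head, syn in non_reversed(entries):
    print(f"{head} lists {syn}, but {syn} does not list {head}")
```

A lexicographer running such a check would receive exactly the prompts argued for above: three non-reversed references, all of which could then be either completed or deliberately rejected.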
And for the working practice of lexicographers, support of any kind resulting in consistent entries is not only useful but essential. For example, it would be very helpful if, when starting to write a dictionary entry, lexicographers were to receive a computerised message that the entry is already mentioned as a target in another entry. So providing consistency is only possible with extensive computer assistance, particularly for comprehensive data. This is why many lexicographic software tools name this as an important topic in their descriptions, e.g.: "TshwaneLex contains various innovative features designed to optimise the process of producing dictionaries, and to improve consistency and quality of the final dictionary product." (Joffe/De Schryver 2004: 1)
. Cf. Lithkowski (2000).
And that leads us again to the meaning of consistency mentioned above, namely consistency with respect to the data model. A data model of reference structures which ensures that reference structures are encoded in a consistent way leads to consistent entries in dictionaries. In the next section, a way of achieving this goal will be proposed.
4. Proposals for modelling sense-related items (and cross-reference structures in general)
One way in which computer assistance may be employed will now be described using the aforementioned electronic dictionary elexiko. The intention is not to introduce specific software or a single content management system. Instead, a concept for modelling reference structures is presented which is applicable to any dictionary and which is not meant for modelling just sense-related items, but lexicographic reference structures in general. The entries ehemalig and früher discussed earlier will serve as an example again. Figure 5 shows how the lexicographic data are realised on the underlying structure level below the presentation (for two small excerpts of the whole entries). The data are inserted into a fine-grained XML-structure. Without going into detail, it is noticeable that the structure of the XML-based data is strictly content-based and not layout-based. The tags used here are named according to content, e.g. <synonymie> or (relational partner), and not according to typographic presentation such as "italic" or "bold". It is also the case that the
modelling is very fine-granular. Every single piece of lexicographic information has its corresponding XML-tag, so every unit is (individually) accessible by computer. For the presentation, the XML data are transformed by an XSLT stylesheet into an HTML-based browser view. Consequently, the presentation of the lexicographic information is defined separately from its content.
Figure 5. Parts of the XML structure (elexiko entries ehemalig and früher)
So the main question concerns the problem of how the necessary reference information is recorded in the entries. If the lexicographer is working on one entry (in this case ehemalig) and a synonym relation with another word (in this case früher) is detected, for example, in the corpus, s/he has to list this term in its corresponding place within the XML structure of the entry. In this case, it is the tag within the parent element <synonymie>, where s/he has to insert a form of identification, a number assigned to the targeted entry () and in certain cases also the aforementioned signpost of an individual targeted sense ().
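The check that is missing here can be sketched as a small routine over such entry XML. All tag and attribute names below (<artikel>, <partner>, ziel-id, lesart-id) are invented for the sketch; the actual elexiko DTD uses its own element and attribute names, and a real implementation would run against the dictionary database rather than a string.

```python
import xml.etree.ElementTree as ET

# Illustrative entry fragment; the names are hypothetical, not elexiko's.
# A lesart-id of "0" stands for a reference that still points at the
# target lexeme as a whole instead of at one of its specific senses.
ehemalig = ET.fromstring(
    '<artikel lemma="ehemalig">'
    '<synonymie>'
    '<partner ziel-id="4711" lesart-id="0">früher</partner>'
    '</synonymie>'
    '</artikel>'
)

def unresolved_sense_links(entry):
    """List synonym references still pointing at a whole lexeme."""
    return [p.text for p in entry.iter("partner") if p.get("lesart-id") == "0"]

print(unresolved_sense_links(ehemalig))  # → ['früher']
```

Run over the whole entry stock whenever an entry is checked in, such a scan yields precisely the automatic message described in the text: a list of placeholder references that should be revisited once the targeted entry has been fully analysed.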
So if a lexicographer is working on the entry früher at a later point than on its corresponding sense-related lexemes, s/he might forget to go back to a certain synonym, for example ehemalig. This step is, however, essential in order to add the necessary identification of the guide word and its corresponding sense to realise the link as a sense-related reference (). Therefore, when starting the lexicographic work on one entry, there is a need for an automatic message with information on all other existing entries which contain data references with respect to the entry that is currently being edited. Again, this is nothing new for software developers working in (commercial) lexicographic context. However, in this context, the problem is discussed on another level, namely the modelling level. Solving this problem for elexiko is difficult at the moment because from a strictly formal perspective, all reference structures in elexiko go from x to y and – at best – from y to x. This bidirectionality is not currently being checked
. Elexiko is a corpus-based dictionary.
154 Carolin Müller-Spitzer
Figure 6. Bidirectional references in elexiko: current status
or tested for systematically by any tool. At the moment, two reference points between which a bidirectional relation exists are just two references without really being in connection with one another, as illustrated in Figure 6. This is the problem most dictionaries face. As things stand, bidirectional references are two independent references from one point out of the hierarchical tree (which is an XML document) to another. Additionally, the two poles between which a relation holds have a different scope and they are technically independent of each other. Improving this situation is quite straightforward, as demonstrated in Figure 6. Provided that we want to base the modelling of reference structures on a standard, the XML-connected standard XLink (XML Linking Language) is one possible option. Regarding the application of such a standard, XLink is not currently implemented in many tools, so it could be argued that it is not necessary to map the modelling onto this standard. However, in my view, it is very useful to look at this standard, simply because the standard reflects a large number of considerations about reference structures in general, which can only be of benefit (cf. Nordström 2002). However, the following proposals for modelling reference structures could also be implemented in tailor-made XML DTDs or schemas. XLink has been established to allow “elements to be inserted into XML documents in order to create and describe links between resources” (XLink: 2). The view is taken here that it is a slim as well as an adequately complex format, enabling users to model simple but also more complex linking structures with XLink. In the introduction of the XLink specification, it is stated that: „XLink provides a . For the description of the XLink Standard see http://www.w3.org/TR/xlink/
framework for creating both basic unidirectional links and more complex linking structures. It allows XML documents to:
– assert linking relationships among more than two resources,
– associate metadata with a link,
– express links that reside in a location separate from the linked resources".
It is especially important to be able to associate metadata with a link and to build a link database or (abbreviated) linkbase (which is meant for storing links separately from the linked resources). Before going into detail as to why this is important, some notes on reference structures in general are necessary. Reference structures can be classified into unidirectional and bidirectional relations. For instance, a reference from a dictionary entry to an illustration, a corpus sample or an external encyclopaedia is a unidirectional link. It points in one direction only, i.e. from the target resource there is no reverse reference, which means that referring back to the original source is neither intended nor useful. In the majority of cases, the target resource is outside the lexicographer's responsibility, as it is outside the lexicographic database. For example, references to Wikipedia are always given in one direction only (cf. Müller-Spitzer 2007a and b: 169). In the context of paradigmatic structures, it is bidirectional references which are of particular interest. Unlike the aforementioned unidirectional references, creating bidirectional references is part of the lexicographer's compiling responsibilities. It is only then that two resources may function as a source on the one hand and as the target resource on the other. In this section, the modelling concept is looked at more closely. The first and most general guideline of the concept for modelling cross-reference structures is to model bidirectional references as extended links (in the terminology of XLink) and to store them in a linkbase.
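A minimal version of such a linkbase can be sketched as follows. Only the xlink:* attributes follow the XLink specification; the element names (<linkbase>, <synonymie>, <ort>, <pfad>) and the hrefs are illustrative and do not reproduce the actual elexiko data. One extended link bundles both directions of the synonymy between the two senses as a single link object, and a trivial check can then verify that every arc has its reverse.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"

linkbase = ET.fromstring(f"""
<linkbase xmlns:xlink="{XLINK}">
  <synonymie xlink:type="extended">
    <ort xlink:type="locator" xlink:href="ehemalig.xml#frueher" xlink:label="ehemalig"/>
    <ort xlink:type="locator" xlink:href="frueher.xml#ehemalig" xlink:label="frueher"/>
    <pfad xlink:type="arc" xlink:from="ehemalig" xlink:to="frueher"/>
    <pfad xlink:type="arc" xlink:from="frueher" xlink:to="ehemalig"/>
  </synonymie>
</linkbase>
""")

def is_bidirectional(link):
    """True if every traversal arc of the extended link has its reverse."""
    arcs = {(a.get(f"{{{XLINK}}}from"), a.get(f"{{{XLINK}}}to"))
            for a in link if a.get(f"{{{XLINK}}}type") == "arc"}
    return all((t, f) in arcs for (f, t) in arcs)

link = linkbase.find("synonymie")
print(is_bidirectional(link))  # → True
```

Because both directions live in one link object, a missing reverse arc is detectable locally, within the link itself, instead of requiring a search across all entry files – which is precisely the advantage of the linkbase approach over two independent references.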
This approach has the following advantages:
– the links can be edited or changed externally without touching the dictionary entries themselves, and
– a link database supports the management of the cross-reference structures.
Looking again at the current model of references in elexiko, the difference should become obvious. The idea pursued here is a model in which all links are stored at a separate level, apart from the entries (a draft of this is shown in Figure 7). What does this mean for the elexiko example? The lexicographic information relating to references should be imported automatically into the linkbase file. In Figure 8, we can see that every piece of important information relating to the cross-linking of the given sense-related items is transferred into XLink and its specific elements and attributes.

156 Carolin Müller-Spitzer

Figure 7. Bidirectional references in elexiko: use of a linkbase

Figure 8. Part of the XLink-linkbase

The process is as follows: any information relating to reference structures is stored in this linkbase. This procedure is performed automatically when the entry file is checked into the underlying dictionary database system. In this model, the connections from one synonym to another and vice versa are no longer two independent references, but form one complex link object. This has the following advantages. Firstly, precise addressing of the source and target resource becomes possible. In the data model which is currently being used, one reference goes from one lexicographic piece of information in the entry to the target entry or, alternatively, to a specific sense as a whole. Coming from the
related item and looking in the other direction, it is just the other way around. We thus have an incorrect addressing structure, because the synonym relation is, with regard to content, valid between two contextual instantiations as represented by senses. Storage in a linkbase, by contrast, enables us to address the starting point and the target resource independently of the position where the lexicographic information on these references is given in the entries. This can also be seen in the part of the linkbase in Figure 8: the attribute @label addresses the sense as a whole as the starting or finishing point of the relation. We might ask whether this is really important. It is assumed here that it is, because if we think of quite different ways of presenting lexicographic data (for example, a network of all semantically related senses), it is very important to identify the resources precisely. This is also of particular importance for accessing lexicographic data, for example if one wishes to present a search word together with its sense-related items. The second main advantage of using a linkbase is the ability to associate metadata with a link. This is explained further by introducing some general features of XLink. One crucial point of extended links in XLink is that "the extended-type element may contain a mixture of the following elements in any order, possibly along with other content and markup:
– locator-type elements that address the remote resources participating in the link
– arc-type elements that provide traversal rules among the link's participating resources
– title-type elements that provide human-readable labels for the link
– resource-type elements that supply local resources that participate in the link." [XLink: 11]
The option of adding a human-readable title attribute and of specifying traversal rules may seem particularly interesting in this context.
Traversal rules are rules which define how to follow a link and can be specified in extended links as follows: An extended link may indicate rules for traversing among its participating resources by means of a series of optional arc elements. The XLink element for arc is any element with an attribute in the XLink namespace called type with the value ‘arc’. [XLink: 16] […] The arc-type element may have the traversal attributes from and to […], the behavior attributes show and actuate […] and the semantic attributes arcrole and title […]. The traversal attributes define the desired traversal between pairs of resources that participate in the same link, where the resources are identified by their label attribute values. The from attribute defines
resources from which traversal may be initiated, that is, starting resources, while the to attribute defines resources that may be traversed to, that is, ending resources. The behavior attributes specify the desired behavior for XLink applications to use when traversing to the ending resource. [XLink: 17]
Although it seems very attractive to define traversal rules and titles here in the linkbase together with the data model, it should be noted that with present-day technology one would prefer to separate the data structure from the application logic. This is an important point of criticism of the XLink standard. Looking again at the example (cf. Figure 8), we can see that in the presented part of the linkbase there are two objects modelled as resources, i.e. elements whose XLink type attribute has the value "locator". They are identified by their labels, in this case the IDs, together with the guide words which are used for identifying senses in elexiko. At the bottom of the middle column in Figure 8, the first arc (an element whose XLink type attribute has the value "arc") appears under the name <synonym_zu1-2>; the second arc is termed <synonym_zu2-1>. All these elements are parts of the complex link object synonymierelation. The behavior attributes of XLink can, generally speaking, be characterised as follows: they tell an application how to behave while traversing the links. As has been pointed out before, these attributes should be defined separately from the linkbase. What impact do these general explanations have when they are related to the concrete examples of sense-related items in elexiko? By specifying a model for a linkbase, it is possible to define a fixed type of extended link for the relation type synonymy. This model may be described as follows. The synonym link object always connects two remote resources with one another; these resources are entries or individual senses respectively. The corresponding arcs between these resources can be specified by traversal rules. In the case of synonymy, it would be useful to set the value of show to new, i.e. when a user clicks on a synonym, the targeted entry is opened in a new frame.
In this way, both windows can be arranged next to each other on the computer screen and the two entries can be viewed simultaneously. The value of actuate would probably be set to onRequest, which means that the user has to click on the given relation partner (i.e. the sense-related item) in the entry in order to follow the link. This abstract model is then applicable to each concrete synonym relation. Such a model allows us to specify different presentations for different kinds of lexical-semantic relation. For example, in the case of lexemes which are connected by hyperonymy (i.e. superordination) or hyponymy (subordination), users might like to look these up one after the other, whereas in the case of lexemes which are
connected by synonymy or antonymy, users might compare both by looking up both entries simultaneously. The ability to add metadata to a link object can also be exploited further. A relationship between two words may be more significant for lexeme a than it is for lexeme b: when a collocation analysis is performed for each term individually, synonym a might rank higher in its significance to b than the other way around. This is a regular observation with polysemous lexical items. Such an observation could also be recorded as metadata, so the XLink feature of adding metadata can be used here as well. It is clear that this new modelling concept is much more powerful than the one currently being used in elexiko and in other dictionaries. The final question that remains open is what consequences the employment and integration of such a linkbase into an editing system might have for the practical lexicographic work in elexiko (and similar projects). The course of the working process can be roughly sketched as follows: a lexicographer checks an entry out of the database system in order to work on it. As a first step, the linkbase verifies whether the chosen entry is registered as a target resource. If it is, the lexicographer receives an automatic message informing him/her which resource (that is, which entry) is the starting point of the relation and which type of relation exists between the items, so that the lexicographer can bear this in mind while studying the corpus results. The entry is then lexicographically edited in an XML editor, including all references. Next, another check routine is applied: while the edited entry file is being checked back into the database system, the corresponding information is imported into the linkbase and, most importantly, a query is initiated as to whether the information about one reference structure is consistent with the information about the other.
If one connection runs from the start resource to the target resource but not back again, the lexicographer receives an error message. In elexiko, it is important from a pragmatic point of view to keep the writing practice in the XML editor as it is. In other projects, it may be better practice to move all linking information out of the entries and store it in the linkbase only. This form of modelling reference structures not only allows for typing links and assigning attributes to them, as well as for new ways of presenting reference structures, but, most importantly, it also allows lexicographers to make entries more consistent. At this point, it should be stressed again that the modelling concept presented in this paper does not necessarily imply the use of XLink: a tailor-made XML structure, based on the same guidelines, is able to perform the same tasks.
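The consistency query performed at check-in can be sketched as follows. This is a simplified illustration under the assumption that each reference extracted from the entries is recorded as a directed (source, target) pair; the function name and the example sense identifiers are invented:

```python
def find_inconsistent_references(references):
    """Return all directed references that lack the reverse direction.

    `references` is an iterable of (source_sense, target_sense) pairs,
    e.g. as extracted from the entries when they are checked back into
    the dictionary database system.
    """
    ref_set = set(references)
    return sorted(pair for pair in ref_set if (pair[1], pair[0]) not in ref_set)

refs = [
    ("frueher/LA1", "ehemals/LA1"),   # consistent: both directions present
    ("ehemals/LA1", "frueher/LA1"),
    ("frueher/LA2", "einst/LA1"),     # inconsistent: no reverse reference
]
for source, target in find_inconsistent_references(refs):
    print(f"error: {source} -> {target} has no reverse reference")
```

Any pair reported by such a check corresponds to the error message the lexicographer receives when a bidirectional relation has only been entered on one side.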
5.
Future perspectives
As we have seen, a consistent, content-based and fine-grained way of structuring lexicographic data allows for new ways of presenting sense-related items in dictionaries. In this paper, the dictionary elexiko again serves as the demonstration case. Here, sense-related items are presented in a specific part of the dictionary. This presentation is well arranged and the screen is not overcrowded. The disadvantage of this kind of presentation is that it is difficult to see at a glance which sense-related items are given for one headword across all of its senses. Instead, the user has to click from one sense to another, and it is therefore very difficult to compare the sense-related items of the individual senses. Provided that all the information about sense-related items is consistently structured, it can be presented in a different way. In Figure 9, an alternative presentation illustrates the paradigmatics of früher in all its senses collectively. The sense-related items of the entry früher could be loaded into any tool which is able to generate a graph or net. The only prerequisite is strictly hierarchically structured data (as shown in Figure 9) in the underlying dictionary entry. Throughout this paper, the relation of synonymy has served as an example, but in this entry we encounter numerous other types of paradigmatic relation. The
Figure 9. elexiko entry früher with its sense-related items in different senses presented as a net
headword itself is positioned in the middle, and around it are its senses, labelled "LA" for "Lesart" (sense), followed by the types of sense relation such as synonymy or incompatibility. The purpose of the graph is to show that, alongside more traditional lexicographic presentations, there are other ways of presenting sense-related connections in a dictionary. What needs emphasising at this point is the fact that different forms of presentation are possible without any change to the underlying data. The preferred way of structuring data enables us to do this at the touch of a button, if the appropriate tool is available. Moreover, if the necessary data are structured according to the new modelling concept, much more elaborate forms of presentation are possible. To conclude, providing and supporting the consistency of sense-related items and reference structures in dictionaries should not be seen as an irrelevant hobby of dictionary reviewers and text-technologists. Providing consistency is crucial for lexicographic practice, and it can lead to qualitative enhancements. At the same time, a consistent way of data modelling and structuring is a prerequisite for developing innovative forms of presenting lexicographic data in an electronic medium such as the Internet. The benefits of consistency therefore lie on both sides: with the lexicographer who is compiling the dictionary, and with the user who depends on successful look-up procedures.
References
Blumenthal, Andreas, Lemnitzer, Lothar and Storrer, Angelika. 1988. "Was ist eigentlich ein Verweis? Konzeptionelle Datenmodellierung als Voraussetzung computergestützter Verweisbehandlung." In Das Wörterbuch. Artikel und Verweisstrukturen (Jahrbuch 1987 des Instituts für deutsche Sprache), Gisela Harras (ed.), 351–373. Düsseldorf/Bielefeld: Pädagogischer Verlag Schwann-Bagel and Cornelsen-Velhagen u. Klasing.
Cruse, D. Alan. 2004. Meaning in Language. (2nd ed.) Oxford: Oxford University Press.
Engelberg, Stefan and Lemnitzer, Lothar. 2001. Lexikographie und Wörterbuchbenutzung (Stauffenburg Einführungen, vol. 14). Tübingen: Stauffenburg.
Joffe, David and De Schryver, Gilles-Maurice. 2004. "TshwaneLex – Professional off-the-shelf lexicography software." In DWS 2004 – Third International Workshop on Dictionary Writing Systems, 17–20. http://tshwanedje.com/publications/dws2004-TL.pdf (last visited on 2009/07/03).
Lew, Robert. 2007. "Linguistic semantics and lexicography: A troubled relationship." In Language and meaning. Cognitive and functional perspectives, Malgorzata Fabiszak (ed.), 217–224. Frankfurt am Main: Peter Lang.
Litkowski, Ken C. 2000. "The Synergy of NLP and Computational Lexicography Tasks." Technical Report 00-01. Damascus, MD: CL Research.
Müller-Spitzer, Carolin. 2007a. "Vernetzungsstrukturen lexikografischer Daten und ihre XML-basierte Modellierung." Hermes 38/2007: 137–171.
Müller-Spitzer, Carolin. 2007b. Der lexikografische Prozess. Konzeption für die Modellierung der Datenbasis (Studien zur deutschen Sprache 42). Tübingen: Gunter Narr.
Nordström, Ari. 2002. "Practical XLink." In XML 2002, Proceedings by deepX. www.idealliance.org/papers/xml02/dx_xml02/papers/06-00-11/06-00-11.html (last visited on 2008/06/01).
Storjohann, Petra. 2005. "elexiko: A Corpus-Based Monolingual German Dictionary." Hermes 34/2005: 55–73.
XML Linking Language (XLink) Version 1.0 (2001). W3C Recommendation 27 June 2001. http://www.w3.org/TR/xlink/ (last visited on 2009/07/03).
Reference works
Duden. 2007. Das Synonymwörterbuch, 4th edition. Mannheim/Leipzig/Wien/Zürich: Dudenverlag.
elexiko. 2003ff. In OWID – Online Wortschatz-Informationssystem Deutsch, Mannheim: Institut für Deutsche Sprache, www.owid.de/elexiko_/index.html (last visited on 2009/07/03).
Merriam-Webster Online. 2009. (www.merriam-webster.com). Entry Consistency. http://www.merriam-webster.com/dictionary/consistency (last visited on 2009/07/02).
Wahrig. 2006. Synonymwörterbuch, 5th edition. München/Gütersloh: Wissen Media Verlag.
Wikipedia. 2009. (www.wikipedia.com). Entry Consistency: http://en.wikipedia.org/wiki/Consistency (last visited on 2009/07/03).
Stanford Encyclopedia of Philosophy. (http://plato.stanford.edu/entries/logic-classical/). Entry Consistency: http://plato.stanford.edu/search/searcher.py?query=consistency (last visited on 2009/07/02).
Lexical-semantic and conceptual relations in GermaNet

Claudia Kunze and Lothar Lemnitzer
GermaNet is a lexical resource constructed in the style of the Princeton WordNet. Lexical units are grouped in synsets which represent the lexical instantiations of concepts. Relations connect both these synsets and the lexical units. In this paper, we will describe the kinds of relations which have been established in GermaNet as well as the theoretical motivation for their use.
1.
Lexical-semantic resources in natural language applications
Digital lexical resources such as machine-tractable dictionaries (MTD) and lexical knowledge bases (LKB) are extensively used in natural language processing and computational linguistics. Basic application scenarios include:
– word sense disambiguation;
– information retrieval and information extraction;
– linguistic annotation of language data on several layers of description;
– text classification and automatic summarisation;
– tool development for language analysis and language generation;
– machine translation.
Lexical-semantic wordnets, being considered as “lightweight linguistic ontologies”, have become popular online resources for a number of different languages since the success of the original Princeton WordNet (cf. Miller 1990; Fellbaum 1998). Encoding different kinds of lexical-semantic relations between lexical units and concepts, wordnets are valuable background resources not only in computational linguistic applications, but also for research on lexicalisation patterns in and across languages.
2.
The relational structure of lexical-semantic wordnets
Wordnets which are structured along the lines of the Princeton WordNet (cf. Miller 1990; Fellbaum 1998) encode the most frequent and most important concepts and lexical units of a given language, as well as the sense relations which hold between them. In wordnets, a word or lexical unit is represented as a conceptual node, with its specific semantic links to other words and concepts in that language. For example, Stuhl (chair) is represented with its superordinate term Sitzmöbel (seating furniture) as well as its subordinate terms Drehstuhl (swivel chair), Klappstuhl (folding chair), Kinderstuhl (high chair) etc. Furthermore, the superordinate concept Sitzmöbel (seating furniture) is connected to the concepts Lehne (armrest), Sitzfläche (seat) and Bein (chair leg), which represent generic parts of a piece of seating furniture (see Figure 1). A concept is characterised not only by its representational node, but also by its web of semantic relations to other words and concepts. In wordnets, the basic unit of representation is the so-called synset which groups equivalent or similar meaning units, the synonyms, in a common conceptual node. Thus, wordnets do not conflate homonyms, but disambiguate word senses. As pointed out in the introduction, word sense disambiguation is a necessary precondition for numerous applications in natural language processing. Wordnets project natural-language hierarchies, in contrast to formal ontologies which constitute language-independent or domain-specific conceptual networks. In the next section, we focus on describing the German wordnet GermaNet and the semantic relations it encodes (Kunze 2004) as well as its integration into a polylingual wordnet architecture connecting eight European languages (EuroWordNet, see Vossen 1999).
Figure 1. Subtree Sitzmöbel (seating furniture) from the GermaNet hierarchy
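The subtree in Figure 1 can be rendered, in much simplified form, as plain data structures. All concept names follow the example in the text (with umlauts transliterated); the representation itself is a sketch, not the actual GermaNet database format:

```python
# A minimal sketch of the Stuhl subtree from Figure 1.
# Hyponymy links each concept to its superordinate term; meronymy
# links the generic parts to the superordinate concept Sitzmoebel.
hyperonym_of = {
    "Stuhl": "Sitzmoebel",
    "Drehstuhl": "Stuhl",
    "Klappstuhl": "Stuhl",
    "Kinderstuhl": "Stuhl",
}
meronyms_of = {
    "Sitzmoebel": ["Lehne", "Sitzflaeche", "Bein"],
}

def hyponyms(concept):
    """All direct subordinate terms of a concept."""
    return sorted(c for c, h in hyperonym_of.items() if h == concept)

print(hyponyms("Stuhl"))          # the three kinds of chair
print(meronyms_of["Sitzmoebel"])  # generic parts of seating furniture
```

The sketch makes the two kinds of relation explicit: hyponymy is an "is-a-kind-of" link, while meronymy is a "has-part" link attached to the superordinate concept rather than to the individual chairs.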
2.1
GermaNet – a German wordnet
With GermaNet, a digital semantic lexicon has been built as an important contribution to a German language resource infrastructure. The German wordnet has adopted the database format and the main structural principles of the Princeton WordNet 1.5, which pioneered numerous language-specific initiatives in relation to building wordnets. GermaNet is not merely a translation of WordNet 1.5, but pursues its own core themes in conceptual modelling, by including artificial concepts (see below) and assuming a taxonomic approach for representing not just nouns but all different parts of speech. GermaNet was built from scratch, taking into account different lexicographical resources such as Deutscher Wortschatz (Wehrle and Eggers 1989) and Brockhaus-Wahrig (Wahrig et al. 1984), as well as the frequency lists of various German corpora. GermaNet models lexical categories such as nouns, verbs, and adjectives. The synset as central unit of representation supplies the set of synonyms for a given concept, as for example, {Streichholz, Zündholz} (match), {fleißig, eifrig, emsig, tüchtig} (busy) and {vergeben, verzeihen} (to forgive, to pardon). GermaNet encodes semantic relations either between concepts (synsets) or lexical units (single synonyms in the synsets). Currently, GermaNet contains some 58,000 synsets with nearly 82,000 lexical units, most of them nouns (43,000), followed by verb concepts (9,500) and adjectives (5,500). The coverage of the German wordnet is still being extended with text corpora being the basis of these extensions. GermaNet contains only a few multi-word items such as gesprochene Sprache (spoken language) or Neues Testament (New Testament). Proper nouns are primarily taken from the field of geography, e.g. city names, and are specifically labelled.
2.1.1 Lexical relations in GermaNet
The richness of lexical-semantic wordnets derives from the high number of semantic links between the lexical objects. A principal distinction is made between lexical relations and conceptual relations:
– Lexical relations are bi-directional links between lexical units (word meanings), such as synonymy (equivalent meanings for synset partners such as Ruf
[Footnotes: Cf. http://www.sfs.uni-tuebingen.de/GermaNet. – However, the first computational large-scale semantic network based upon content words of dictionary definitions was developed by Quillian (1966) in order to model semantic memory in Artificial Intelligence. – Cf. Hamp and Feldweg (1997); the differences between GermaNet and Princeton WordNet are listed in Lemnitzer and Kunze (2002).]
and Leumund (reputation)), and antonymy, holding, for instance, between Geburt (birth) and Tod (death), glauben (to believe) and zweifeln (to doubt), or schön (beautiful) and hässlich (ugly).
– Conceptual relations such as hyponymy, hyperonymy, meronymy, implication and causation hold between concepts, and thus apply to all synset variants. Hyponymy and hyperonymy form converse pairs: while Gebäude (building) constitutes the hyperonym of Haus (house), Haus constitutes the hyponym of Gebäude.
The most important structural principle in semantic networks is the hyponymy relation, as between Rotkehlchen (robin) and Vogel (bird), which yields the taxonomic structure of the linguistic ontology. For nouns, deep hierarchies are possible: the concept Kieferchirurg (oral surgeon), for example, has a taxonomic chain comprising 15 elements (known as its "path length"). The GermaNet data model also applies the taxonomic approach to verbs and adjectives. With regard to adjectives, the Princeton WordNet and several of its successors prefer a model based upon antonymy, in which central pairs of antonym representatives such as good and bad build clusters, with similar concepts grouped around them like satellites (the "satellite approach"). GermaNet avoids this unsatisfactory treatment of adjectives and instead follows the Hundsnurscher/Splett classification of adjectives. The meronymy relation ("part-whole relation") is assumed only for nouns: Dach (roof) cannot be appropriately classified as a kind of Gebäude (building), but is part of it. Part-whole relations also pertain to abstract structures, for instance the membership of a certain group, such as a Vorsitzender (chairman) of a Partei (party), or the substance in a composition, such as a Fensterscheibe (window pane) which is made of Glas (glass). Typically, the link between lexical resultatives such as töten (to kill) and sterben (to die) or öffnen (to open) and offen (open) is specified as a causal relation.
The causal relation can be encoded between all parts of speech. Little use is currently made of the implication relation, the so-called entailment, which applies, for example, between gelingen (to succeed) and versuchen (to attempt). The meaning of a lexical unit is characterised by the sum of the relations it holds to other lexical units and concepts. Furthermore, GermaNet encodes pertainymy as a kind of semantic derivational relation (as between finanziell (financial) and Finanzen (finance)), and associative links ("see also") between concepts which cannot be captured by a
[Footnote: For further details cf. http://www.sfs.uni-tuebingen.de/GermaNet and Lemnitzer and Kunze (2002).]
Figure 2. Partial tree from the GermaNet hierarchy for öffnen (to open)
standard semantic relation (such as Weltrangliste (world ranking list) and Tennis (tennis), or Talmud (Talmud) and Judentum (Judaism)). Figure 2 depicts the causative verb öffnen (to open) and its semantically related concepts. Synsets and lexical units are presented with their respective reading numbers from the GermaNet database. The connection of the synset öffnen_3, aufmachen_2 (to open) with its superordinate term wandeln_4, verändern_2 (to change) is represented by the upwards arrow, and the correlation with its three hyponyms – aufstoßen_2 (to push open), aufbrechen_1 (to break open) and aufsperren_1 (to unlock) – by downwards arrows. The causal relation to the intransitive concept öffnen_1, aufgehen_1 (to open) is represented by the arrow with the dashed line. The lexical units are related to different antonyms: öffnen_3 (to open) to its antonym schließen_7 (to close), and aufmachen_2 (to open) to its antonym zumachen_2 (to shut). The bi-directionality of the antonymy relation is represented by a left-right arrow.
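The relational web around öffnen in Figure 2 can likewise be sketched as data. The reading numbers and relation types follow the figure (umlauts transliterated), while the dictionary-like access function is an invented illustration, not a GermaNet API:

```python
# Relations around the synset {oeffnen_3, aufmachen_2} as shown in Figure 2.
relations = {
    # conceptual relations hold between synsets
    ("oeffnen_3/aufmachen_2", "hyperonym"): ["wandeln_4/veraendern_2"],
    ("oeffnen_3/aufmachen_2", "hyponym"): ["aufstossen_2", "aufbrechen_1",
                                           "aufsperren_1"],
    # causation links the transitive to the intransitive reading
    ("oeffnen_3/aufmachen_2", "causes"): ["oeffnen_1/aufgehen_1"],
    # lexical relations: antonymy holds between individual lexical units
    ("oeffnen_3", "antonym"): ["schliessen_7"],
    ("aufmachen_2", "antonym"): ["zumachen_2"],
}

def related(unit, relation):
    """Look up the targets of a given relation for a synset or lexical unit."""
    return relations.get((unit, relation), [])

print(related("oeffnen_3", "antonym"))            # antonym of one lexical unit
print(related("oeffnen_3/aufmachen_2", "causes"))  # synset-level relation
```

Note how the sketch reproduces the distinction drawn in Section 2.1.1: the causal and hyponymy relations attach to the whole synset, whereas each antonym attaches to a single lexical unit.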
2.1.2 Cross-classification of concepts and artificial concepts
A concept like Banane (banana), as well as a number of other fruits, can be classified both as Pflanze (plant) and as Nahrungsmittel (food), and can thus be assigned to different semantic fields. In order to make this kind of information accessible, cross-classification of such concepts in different hierarchies is applied (see Figure 3).
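Cross-classification means that a concept may have more than one superordinate term, so the hierarchy is a directed acyclic graph rather than a tree. The following sketch collects all superordinate concepts reachable from Banane; the top node "Objekt" is invented for illustration and is not claimed to be the actual GermaNet root:

```python
# Cross-classification: Banane has two superordinate concepts (Figure 3),
# so upward traversal must follow every parent, not a single one.
hyperonyms_of = {
    "Banane": ["Pflanze", "Nahrungsmittel"],  # plant AND food
    "Pflanze": ["Objekt"],                    # "Objekt" is a hypothetical root
    "Nahrungsmittel": ["Objekt"],
}

def all_ancestors(concept):
    """Collect every superordinate concept reachable from `concept`."""
    seen = set()
    stack = [concept]
    while stack:
        for parent in hyperonyms_of.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

print(sorted(all_ancestors("Banane")))  # both semantic fields are reachable
```

A query for either semantic field (plants or food) thus retrieves Banane, which is exactly what cross-classification is meant to achieve.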
[Footnote: In terms of the major distinction being made, pertainymy is a lexical relation, whereas the associative relation is classed as a conceptual relation.]
Figure 3. Cross-classification in GermaNet
Figure 4. Use of artificial concepts
Wordnets are expected to represent only lexical units and concepts of a given language. In contrast to this, GermaNet also makes use of artificial concepts in order to improve the hierarchical structure and to avoid unmotivated co-hyponymy. Following Cruse’s approach (cf. Cruse 1986), co-hyponyms (concepts which share a common mother node) should be incompatible with one another. For example, the concepts Säugling (baby), Kleinkind (toddler), Vorschulkind (preschooler) and Schulkind (schoolchild) embody subordinate terms of Kind (child), which are mutually exclusive. The word field lehrer (teacher) contains hyponymic terms such as Fachlehrer (specialist subject teacher), Grundschullehrer (primary school teacher) and Konrektor (deputy head teacher), which cannot be represented meaningfully on a common hierarchy level. In order to model the partial network more symmetrically, and therefore more adequately, two artificial nodes have been created and introduced into the natural language hierarchy, namely ?Schullehrer (teacher in a certain type of school) and ?hierarchischer_Lehrer (teacher with a hierarchical position), as can be seen in Figure 4. Furthermore, GermaNet encodes subcategorisation frames for describing the syntactic complementation patterns of verbal predicates. As we are focusing in
this paper on the description of semantic relations, we refer the reader to the GermaNet homepage for details on verbal frames and example sentences for these frames (see http://www.sfs.uni-tuebingen.de/GermaNet/).
2.2
EuroWordNet – a polylingual wordnet
The GermaNet base vocabulary, comprising some 15,000 synsets, has been integrated into the polylingual EuroWordNet (http://www.hum.uva.nl/˜ewn/), which connects wordnets for eight European languages in a common architecture (see Vossen 1999). EuroWordNet models the most frequent and important concepts of English, Spanish, Dutch, Italian, French, German, Czech and Estonian. The Interlingual Index (ILI) serves as the core component of the database architecture, to which monolingual wordnets are linked. As a language-independent module, the ILI contains an unstructured list of ILI-records (taken from WordNet and therefore biased towards American English) which are labelled with a unique identifier. Concepts from different languages are related to suitable equivalents from the ILI via equivalence links. Matching of specific language pairs applies via the ILI, for example guidare – conducir (Italian – Spanish) regarding the concept to drive (fahren) in Figure 5. Language-independent EuroWordNet modules also include the so-called Top Ontology with 63 semantic features and the Domain Ontology which supplies semantic fields. All language-specific wordnets contain a common set of so-called Base Concepts, which consists of 1,000 nouns and 300 verbs. The base concepts function as the central vocabulary of polylingual wordnet construction, ensuring compatibility between single language-specific wordnets. Base Concepts
Figure 5. EuroWordNet architecture
are characterised by semantic features or feature bundles of the Top Ontology. For example, Werkzeug (tool) is characterised by the features artefact, instrument and object. Base Concepts dominate a number of nodes and/or a hierarchical multi-level chain of subordinates, or they constitute frequently occurring concepts in at least two languages. Base Concepts are on the one hand more concrete than the semantic features of the Top Ontology, such as dynamic, function and property, but on the other hand more abstract than Rosch's basic level concepts, such as Tisch (table) and Hammer (hammer) (see Rosch 1978). The adequate level of abstraction for Base Concepts is realised by superordinate concepts of Rosch's Basic Level Concepts, for example Möbel (furniture) for Tisch (table) and Werkzeug (tool) for Hammer (hammer). After mapping the inventory of Base Concepts to the ILI, the Top Concepts and first-order hyponyms were linked, thus yielding a first subset of some 7,500 concepts. The construction of language-specific wordnets could then be carried out independently, particularly as the semantic features of the Top Ontology are inherited, allowing for a balanced coverage of different semantic fields across wordnets. Due to diverging lexicalisation patterns across languages arising from linguistic and cultural variation, and as a result of gaps in Princeton WordNet (as the basic resource feeding the ILI), it is not always possible to find matching equivalents for all language-specific concepts. Therefore, the EuroWordNet data model also supplies non-synonymous equivalence links, as well as combinations of them, for the assignment of appropriate transfer terms. For the German concept Sportbekleidung (sportswear), no synonymous target concept sports garment is provided by the ILI. Alternatively, two equivalence links can be established: one with the hyperonym garment (Kleidung), the other with the holonym sports equipment (Sportausrüstung).
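The ILI-mediated matching described above can be sketched as follows. The link type names (eq_synonym, eq_has_hyperonym, eq_has_holonym) are modelled on EuroWordNet's equivalence relations but should be treated as illustrative here, as should the ILI record identifiers and the matching function:

```python
# Sketch of ILI-mediated matching: each language-specific concept carries
# equivalence links to ILI records; matching a language pair goes via the ILI.
equivalence_links = {
    ("it", "guidare"): [("eq_synonym", "ILI-drive")],
    ("es", "conducir"): [("eq_synonym", "ILI-drive")],
    ("de", "fahren"): [("eq_synonym", "ILI-drive")],
    # No exact ILI record for Sportbekleidung: two non-synonymous links instead.
    ("de", "Sportbekleidung"): [("eq_has_hyperonym", "ILI-garment"),
                                ("eq_has_holonym", "ILI-sports_equipment")],
}

def match(lang_a, word_a, lang_b):
    """Find words of lang_b sharing an ILI record with (lang_a, word_a)."""
    ili_ids = {ili for _, ili in equivalence_links[(lang_a, word_a)]}
    return sorted(word for (lang, word), links in equivalence_links.items()
                  if lang == lang_b and any(ili in ili_ids for _, ili in links))

print(match("it", "guidare", "es"))  # Italian-Spanish matching via the ILI
```

Because the two wordnets never reference each other directly, any new language can be matched against all existing ones simply by linking its concepts to the ILI.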
The international cooperation involved in constructing a polylingual wordnet architecture has brought about a de facto standard for wordnet development, and it also serves as a model for adding further languages. In 2000, the Global WordNet Association was founded to foster joint research on wordnets. Several polylingual architectures go back to the EuroWordNet ILI, such as BalkaNet for some South Eastern European languages (cf. Tufiş et al. 2004) or CoreNet for Chinese, Korean and Japanese (see http://bola.or.kr/CoreNet_Project/).
Lexical-semantic and conceptual relations in GermaNet 171
3. Adding syntagmatic relations
3.1 Motivation
It has recently been acknowledged that one of the shortcomings of wordnets is their relatively small number of syntagmatic relations, in terms of both types and individual instances. Some relations which cross the borders between the parts of speech are encoded, but they are few and far between, and in most cases they encode morphosemantic information. The prevalence of paradigmatic relations might be due to the origin of wordnets in the cognitive sciences, where paradigmatic relations are central and syntagmatic relations are only marginal. However, wordnets have grown out of their cognitive science context and are widely used in the field of natural language processing. It has been noted that in such application contexts, wordnets suffer from the relatively small number of relation instances between their lexical objects. It is assumed that applications in Natural Language Processing (NLP) and Information Retrieval (IR), in particular those relying on word sense disambiguation, can be boosted by a lexical-semantic resource with a higher relational density and, consequently, shorter average paths between the lexical objects.

This situation also applies to the German wordnet. The lexical objects in GermaNet are connected by only 3,678 paradigmatic lexical relations between lexical units and 64,000 paradigmatic conceptual relations between synsets. Even if we count lexical objects which are related indirectly through an intermediate node, the network is not very dense, and most of the paths between the synsets and lexical units are very long. We have therefore decided, in the context of a semantic information retrieval project in which GermaNet plays a crucial role as a lexical resource, to extend the German wordnet with two types of syntagmatic relation. The first relation holds between verbs and the nominal heads of their subject noun phrases, and the second between verbs and the nominal heads of their direct object noun phrases.
We decided to use terms for these relations which do not express or imply a commitment to any syntactic theory. We therefore call the former relation "Arg1" and the latter relation "Arg2". In the following, we report on our work on the acquisition of these two new types of (syntagmatic) relation. The sources which we have been using for this task are two large, syntactically annotated German corpora: a newspaper corpus and the German Wikipedia. Special attention was directed to the question of how the insertion of instances of this relation into GermaNet affects the neighbourhood of the nodes which are connected by an instance of the new relation. In particular, we observed whether there was a significant decrease in the sum total of path lengths which connect the newly related nodes and the nodes which are in the neighbourhood of these nodes.

In the following section we will: (a) give an overview of research relating to syntagmatic relations in connection with wordnets; (b) describe in detail the acquisition and filtering process which leads to the extraction of relevant word pairs; (c) present the results of our experiments with path lengths over the GermaNet graph, and (d) draw some conclusions and outline future research.

. Morato et al. (2003) give an overview of applications of wordnets in general.
. See, for instance, Boyd-Graber et al. (2006) and Fellbaum (2007).
. "Semantic Information Retrieval" (SIR), a project funded by the German Science Foundation. Gurevych et al. (2007) and Gurevych (2005) describe the use of GermaNet in this project.
3.2 Related work
Research into the (semi-)automatic detection and integration of relations between synsets has proliferated in recent years. This activity can be seen as a response to what Boyd-Graber et al. (2006: 29) identify as a weakness of the Princeton WordNet:

WordNet, a ubiquitous tool for natural language processing, suffers from sparsity of connections between its component concepts (synsets).
Research into the (semi-)automatic acquisition and integration of new synsets aims to reduce the amount of time-consuming and error-prone manual work that would otherwise be required. Snow et al. (2006) and Tjong Kim Sang (2007) present highly efficient means of carrying out this task. They exploit the fact that taxonomic relations between lexical objects are reflected in the distributional patterns of these lexical objects in texts. These efforts are, however, directed towards paradigmatic relations, the hyperonymy/hyponymy relation in particular. They are therefore less relevant for our acquisition task, though we might use them in the future to extend GermaNet with more instances of these paradigmatic relations.

Some effort has already been made to introduce non-classical, cross-category relations into wordnets. Boyd-Graber et al. (2006) introduce a type of relation
which they call "evocation". This relation expresses the fact that the source concept as a stimulus evokes the target concept as a consistent human response. In other words, this is a mental relation which cuts across parts of speech. This approach nevertheless differs from ours, as we use corpus data instead of human response data and we acquire what is in the texts rather than what is supposed to be in the human mind. The relation we introduce is syntactically motivated, which is not the case in the experiment on which Boyd-Graber et al. report.

Amaro et al. (2006) attempt to enrich wordnets with predicate-argument structures, where the arguments are not real lexical units or synsets but rather abstract categories such as instrument. Their aim is a lexical-semantic resource which supports the semantic component of a deep parser. This motivates their introduction of a highly abstracted categorisation of these arguments. This is not what we intend to do. Our data might lend themselves to all kinds of abstraction, as we will explain later on, but our primary intention is to capture phenomena which are on the surface of texts.

Yamamoto and Isahara (2007) extract non-taxonomic, in particular thematic, relations between predicates and their arguments. They extract these related pairs from corpora by using syntactic relations as clues. In this respect, their work is comparable to ours. Their aim, namely to improve the performance of information retrieval systems with this kind of relation, is also comparable to ours. However, they do not intend to include them in a lexical-semantic resource.

Closest to our approach is the work of Bentivogli and Pianta (2003). Their research is embedded in the context of machine translation. Seen from this perspective, the almost exclusive representation of single lexical units and their semantic properties is not sufficient.
They therefore propose modelling the combinatory idiosyncrasies of lexical units by two means: (a) the phraset as a type of synset which contains multi-word lexical units, and (b) syntagmatic relations between verbs and their arguments as an extension of the traditional paradigmatic relations. Their work, however, focuses on the identification and integration of phrasets. They only resort to syntagmatic relations where the introduction of a phraset would not otherwise be justified. We take the opposite approach, in that we focus on the introduction of instances of the verb-argument relation and resort to the introduction of phrases only in those cases where it is not possible to ascribe an independent meaning to one of the lexical units (see below).
3.3 The corpora
The acquisition of the new word pairs and their relation is based on two large German corpora: (a) the Tübingen Partially Parsed Corpus of Written German (TüPP-D/Z), and (b) the German Wikipedia. The first corpus contains approximately 11.5 million sentences and 204 million lexical tokens, and the second corpus contains 730 million lexical tokens. Both corpora have been linguistically annotated using the cascaded finite-state parser KaRoPars, which was developed at the University of Tübingen (cf. Müller 2004), and a modified version of the BitPar PCFG parser (cf. Schmid 2004 and Versley 2005). The results of the automatic linguistic analysis, however, have not been corrected manually, due to the sheer size of the corpora.

To illustrate the linguistic annotation, we present two example sentences. The first one, displayed in Figure 6, is from the newspaper corpus, and the second, in Figure 7, is from Wikipedia. The first sentence translates as "We need to sell the villas in order to pay the young scientists", where the accusative object Villenverkauf of the verb brauchen (to need) means "sale of the villas". It is a complex noun and not very frequent in either corpus. The sentence in Figure 7 translates as "He gets the most inspiration from his father", where the accusative object Inspiration occurs in front of the subject er (he). The parsers analyse and mark four levels of constituency.

Figure 6. Parse tree of a TüPP-D/Z sentence
. We are aware of the vagueness of the term German Wikipedia. The Wikipedia is a moving target. To be more precise, we are using a downloadable snapshot of the Wikipedia which was available on the pages of the Wikimedia Foundation in September 2008. Since we are interested in recurrent lexical patterns only, our experiments are not sensitive to the subtle changes in the database caused by the ongoing editing processes.
Figure 7. Parse tree of a Wikipedia sentence
The lexical level. For each word in both examples, the part of speech is specified using the Stuttgart-Tübingen-Tagset (STTS, cf. Schiller et al. 1999), which is a de facto standard for German. In the examples, some words are marked as heads of their respective chunks ("HD"), e.g. Villenverkauf and Inspiration.

The chunk level. Chunks are non-recursive constituents and are therefore simpler than phrases. The use of chunks makes the overall syntactic structure less complex in comparison to deep parsing. In Figure 7 we have two noun chunks (labelled NCX) and one verb chunk (labelled VXVF). These are the categories which we need for our acquisition experiments. The functional specification of the noun chunks is of the utmost importance. NPs are, with regard to the predicate of the sentence, marked as subject NPs (in the nominative case, ON) or direct object NPs (in the accusative case, OA).

The level of topological fields. The German clause can structurally be divided by the verbal bracket, i.e. one part of the verb in second position and the other part at the end of the clause. The arguments and adjuncts are distributed more or less freely over the three fields into which the verbal bracket divides the clause: Vorfeld (VF), in front of the left verb bracket, Mittelfeld (MF), between the two brackets, and Nachfeld (NF), following the right bracket.

The clausal level. The clause in Figure 7 is labelled as a simplex clause (SIMPX). In the example in Figure 6, "S" is the label at the root of the sentence.

The parse tree which is generated by KaRoPars (i.e. the example in Figure 6) has a flat structure. Due to limitations of the finite-state parsing model, syntactic relations between the chunks remain unspecified. Major constituents, however, are annotated with grammatical functions in both cases. This information, together with the chunk and head information, is sufficient to extract the word pairs which we need for our experiments.
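The extraction step can be sketched as follows. The input format is a simplified, hypothetical stand-in for the actual KaRoPars/BitPar output; only the chunk labels (VXVF, NCX), the grammatical functions (ON, OA) and the head marking described above are assumed.

```python
# Sketch of verb-argument pair extraction from chunk-annotated sentences.
# The dictionary-based input is an illustrative stand-in for the real
# parser output; chunk labels and functions follow the text above.

def extract_pairs(sentence):
    """Return (verb_head, noun_head, function) triples for subject (ON)
    and direct object (OA) noun chunks of the clause's verb chunk."""
    verb = None
    args = []
    for chunk in sentence:
        if chunk["label"].startswith("VX"):        # verb chunk, e.g. VXVF
            verb = chunk["head"]
        elif chunk["label"] == "NCX" and chunk.get("func") in ("ON", "OA"):
            args.append((chunk["head"], chunk["func"]))
    return [(verb, noun, func) for noun, func in args] if verb else []

# "Die meiste Inspiration erhält er von seinem Vater." (cf. Figure 7):
# the accusative object precedes the subject in the Vorfeld.
sentence = [
    {"label": "NCX", "head": "Inspiration", "func": "OA"},
    {"label": "VXVF", "head": "erhalten"},
    {"label": "NCX", "head": "er", "func": "ON"},
]
pairs = extract_pairs(sentence)
assert ("erhalten", "Inspiration", "OA") in pairs   # an Arg2 instance
assert ("erhalten", "er", "ON") in pairs            # an Arg1 instance
```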
3.4 The acquisition experiment
From the linguistically analysed and annotated corpora described above, we extracted two types of syntactically related word pairs: (a) verb-subject (e.g. untersuchen, Arzt (to examine, doctor)) and (b) verb-direct object (e.g. finden, Weg (to find, way)). While the spectrum of possible subjects of a verb turned out to be very broad and heterogeneous, verb-object pairs were more readily identifiable and recurrent. Even in scenarios in which associations are derived on the basis of evocation, humans arrive at a higher number of associations between verbs and their direct objects than between verbs and their transitive or intransitive subjects.10 We therefore focused our work on the analysis of verb-object pairs, taking only a few verb-subject pairs into account.

In order to rank the word pairs, we measured their collocational strength, which we consider to be a good indicator of their semantic relatedness. Two common measures – mutual information (MI), cf. Church et al. (1991), and log-likelihood ratio, cf. Dunning (1993) – are used and compared in our experiments. Mutual information can be regarded as a measure of how strongly the occurrence of one word determines the occurrence of another; it compares the probability of, for example, two words occurring together with the probability of observing them independently of one another. Log-likelihood ratio compares expected and observed frequencies as might be expressed in a contingency table, a two-by-two table where the four cell values represent the frequency of word x occurring with word y, x and not y, not x and y, and finally not x and not y, i.e. the number of observations where neither word appears. Mutual information seems to be the better choice for the extraction of complex terminological units, due to the fact that it assigns a higher weighting to word pairs whose partners occur infrequently throughout the corpus.
Log-likelihood ratio, in contrast, is not sensitive to low occurrence rates of individual words and is therefore more appropriate for finding recurrent word combinations. This coincides with the findings which are reported by Kilgarriff (1996) and, for German, by Lemnitzer and Kunze (2007).
10. Such observations are reported by Sabine Schulte im Walde, cf. Schulte im Walde (2006).
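The two association measures can be sketched from the 2×2 contingency table described above. The cell naming (o11 … o22) is ours, and the toy counts are invented for illustration; this is a minimal sketch, not the exact computation used in the experiments.

```python
# Sketch of the two association measures compared in the text, computed
# from a 2x2 contingency table: o11 = x with y, o12 = x without y,
# o21 = y without x, o22 = neither word. Counts below are invented.

import math

def mutual_information(o11, o12, o21, o22):
    n = o11 + o12 + o21 + o22
    fx, fy = o11 + o12, o11 + o21          # marginal frequencies of x and y
    return math.log2(n * o11 / (fx * fy))

def log_likelihood(o11, o12, o21, o22):
    # Dunning's G2: twice the sum of O * ln(O/E) over the four cells.
    n = o11 + o12 + o21 + o22
    fx, fy = o11 + o12, o11 + o21
    observed = [o11, o12, o21, o22]
    expected = [fx * fy / n, fx * (n - fy) / n,
                (n - fx) * fy / n, (n - fx) * (n - fy) / n]
    return 2 * sum(o * math.log(o / e)
                   for o, e in zip(observed, expected) if o > 0)

# A pair of rare words co-occurring a few times gets a high MI score,
# while a recurrent pair of frequent words does not.
rare = mutual_information(3, 2, 2, 999_993)
frequent = mutual_information(300, 20_000, 20_000, 959_700)
assert rare > frequent
```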
Before encoding the new relations in GermaNet, we had to remove, for various reasons, word pairs which we did not want to be inserted into the wordnet:

– Pairs with wrongly assigned words due to linguistic annotation errors were removed. As stated above, the linguistic annotation was performed automatically and therefore produced errors. Some of these errors are recurrent and lead to word pairs which are highly ranked. For example, the auxiliary verb werden is very often classified as a full verb.11 So, from a sentence like Er wird eine Aussage machen (He will make a statement), the pair werden – Aussage is erroneously extracted, instead of the correct pair machen – Aussage.

– Word pairs which are fixed expressions, or parts of them, were also removed. It is well known that many idioms exhibit syntactic as well as semantic irregularities. In particular, it is impossible to assign a standard or literal meaning to the individual words. The idiom den Löffel abgeben (to hand in the spoon), for example, has nothing to do with cutlery, but is a colloquial expression for sterben (to die). Some of the word pairs are ambiguous. For example, the word pair spielen – Rolle can refer literally to somebody acting on a stage, or idiomatically to something which is important. The literal meaning can and should be captured by a relation connecting the word pair, whereas in the idiomatic meaning, this pair has to be treated as a single, complex lexical unit.

– Support verb constructions were also discarded. It has been argued convincingly by Storrer (2006) that some types of support verb constructions and their verbal counterparts, e.g. eine Absage erteilen (literally to give a rejection) and absagen (to reject), show subtle differences in their use, and support verb constructions therefore merit an independent status as complex lexical units. Besides, it is often very difficult to assign a meaning to the verbal part of the construction, as a result of the mere supporting function of this element, which is chosen for this role with some arbitrariness.12

For these reasons, we consider it inappropriate to represent (semi-)fixed expressions by relating their elements. They are very close to what Bentivogli and Pianta call "phrasets" (cf. Bentivogli and Pianta 2003). We see the need to also encode these complex multi-word expressions in GermaNet, but a discussion of a good strategy for doing this is beyond the scope of this article.
11. Which, in rare cases, it is.

12. For some constructions, it might be possible to assign a meaning to the supporting verb, e.g. Aufnahme finden (literally to find acceptance), but these cases are rare and border on free construction.
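A minimal sketch of this filtering step, assuming tiny illustrative stoplists in place of the manually compiled exclusion lists:

```python
# Sketch of the pair-filtering step described above. The stoplists are
# small illustrative stand-ins, not the actual lists used in the project.

AUXILIARIES = {"werden", "haben", "sein"}       # often mis-tagged as full verbs
IDIOMS = {("abgeben", "Löffel")}                # den Löffel abgeben = to die
SUPPORT_VERBS = {("erteilen", "Absage")}        # eine Absage erteilen

def keep_pair(verb, noun):
    """Discard pairs caused by tagging errors, idioms, or support verb
    constructions; keep the rest for insertion into the wordnet."""
    if verb in AUXILIARIES:
        return False
    if (verb, noun) in IDIOMS or (verb, noun) in SUPPORT_VERBS:
        return False
    return True

candidates = [("werden", "Aussage"), ("machen", "Aussage"),
              ("abgeben", "Löffel"), ("finden", "Weg")]
kept = [p for p in candidates if keep_pair(*p)]
assert kept == [("machen", "Aussage"), ("finden", "Weg")]
```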
Table 1. Cumulative path length reduction, average of 100 word pairs for both MI and G2

Method    Average PR value
MI        2762.04
G2        15867.38
In order to design a manageable experimental setting by which the impact of the new relations can be measured, we decided to insert the 100 top-ranked of the remaining word pairs manually. Inserting the word pairs involved a manual disambiguation where all words were mapped to the correct synsets. Semi-automatic insertion of the new relation instances would require reliable word sense disambiguation which is not yet available for German. In the following, we report on experiments in which we calculated the local impact of the new relation instances.13
3.5 Measurable local effects of the new relations
In our experiments,14 we wanted to measure the impact of newly introduced relations between verbs and nouns on the path lengths between the related words and the words in their neighbourhood. First, we introduced the relation instances which connect the lexical objects which we had acquired through our collocational analyses. We inserted these relations one by one. The settings for measuring the impact of each new relation are as follows: let s1 and s2 be two synsets and R(s1, s2) the new relation instance connecting the two synsets. Further, let SPb be the shortest path between s1 and s2 before the insertion of R(s1, s2), and let SPa be the shortest path between s1 and s2 after the insertion of R(s1, s2). By definition, the length of the shortest path after the insertion, SPa, is 1 (see Figure 8). We calculate the path reduction PRs1,s2 as the result of SPb – SPa. We now take S1 and S2, the sets of all synsets which are in the two sub-trees rooted by s1 and s2 respectively; in other words, we take all the hyponyms, the hyponyms of these hyponyms, and so forth.

13. We also measured the global impact of the new relations, i.e. the impact of these links on the overall reduction of path length between any two nodes. There are, however, no visible effects, and the selected word pairs do not have any impact which is different from the impact of the same number of relations inserted between randomly chosen lexical objects. For details, cf. Lemnitzer et al. (2008).

14. We are grateful to Holger Wunsch and Piklu Gupta, who have run some of the experiments which we report here.
Figure 8. Local path reduction between two synsets s1 and s2. The dashed path is the old path. The new relation R (s1, s2) is depicted by the thick line between s1 and s2
We calculate the path reduction PRsm,sn for each pair sm ∈ S1, sn ∈ S2. The sum of all path reduction values is the local impact caused by the new relation instance. We calculated the sum total of the path reduction values for the 100 most highly ranked pairs according to the mutual information and the log-likelihood statistics. Table 1 shows the average cumulative path reduction value for both statistics. From these figures we can infer that: (a) there is a considerable local impact of the new relation instances, which is what we wanted to achieve, and (b) the impact of the word pairs extracted by log-likelihood ratio is much higher than that of the pairs extracted by mutual information. This confirms our assumption about the superiority of log-likelihood ratio for our acquisition task.

With the new relations, we connect pairs of words in the wordnet which exhibit a certain kind of closeness through their frequent co-occurrence in texts. This is in contrast to the traditional paradigmatic relations, which connect words that are supposed to be related in the mental lexicon but which seldom co-occur in texts. We assume that the syntagmatically related word pairs also have an organising function in the mental lexicon, but this is not our primary concern. The important point is that a verb like verschreiben (to prescribe), when occurring in a text, triggers a whole set of words which denote medical objects, but only when used in one particular reading. This is important for NLP tasks such as word sense disambiguation and anaphora resolution.
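The local path-reduction measure can be sketched on a toy graph; the hierarchy below is invented for illustration, not GermaNet data.

```python
# Sketch of the local path-reduction measure: sum of shortest-path
# reductions over all pairs from the sub-trees rooted at s1 and s2,
# before vs. after inserting the new relation R(s1, s2).

from collections import deque

def shortest_path(adj, a, b):
    """Breadth-first search over an undirected adjacency dict."""
    seen, queue = {a}, deque([(a, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return float("inf")

def add_edge(adj, a, b):
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

def subtree(hypo, root):
    """All hyponyms of root, hyponyms of hyponyms, and so forth."""
    nodes, stack = {root}, [root]
    while stack:
        for child in hypo.get(stack.pop(), ()):
            nodes.add(child)
            stack.append(child)
    return nodes

# Toy hierarchy: s1 and s2 are linked only via a long hyperonym chain.
hypo = {"top": ["a", "b"], "a": ["s1"], "b": ["s2"],
        "s1": ["s1a"], "s2": ["s2a"]}
adj = {}
for parent, children in hypo.items():
    for child in children:
        add_edge(adj, parent, child)

S1, S2 = subtree(hypo, "s1"), subtree(hypo, "s2")
before = {(m, n): shortest_path(adj, m, n) for m in S1 for n in S2}
add_edge(adj, "s1", "s2")                     # insert R(s1, s2): SPa = 1
after = {(m, n): shortest_path(adj, m, n) for m in S1 for n in S2}
local_impact = sum(before[p] - after[p] for p in before)
```

On this toy graph the shortest path between s1 and s2 drops from 4 to 1, and summing the reductions over all four subtree pairs gives a local impact of 12.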
3.6 From collocation to semantic preference
It is collocations, or, in the terms of British contextualism, colligations, which have so far been encoded. We have linked verbal predicates with the nominal heads of their arguments. Collocations as well as colligations are also part of the description of the meaning of a lexical item as proposed by John Sinclair (1996). Another important aspect of the meaning of a lexical item is, according to Sinclair and his followers' theory of meaning, the semantic preference of the lexical item:

the term semantic preference refers to the (probabilistic) tendency of certain units of meaning to co-occur with items from the same semantic subset, items which share a semantic feature. (Bednarek and Bublitz 2007: 122)

Figure 9. Collocates of the verbal predicate beseitigen
It is possible, from the collocations we have encoded in GermaNet, to arrive at the more abstract level of semantic preference. The example presented in Figure 9 illustrates this point. In the centre is the lexical unit beseitigen (to remove). We have grouped four of the many direct objects around this verbal predicate: Fehler (mistake, error), Mangel (fault), Verschmutzung (pollution) and Gefahrenquelle (source of danger). These few examples can be grouped along the semantic feature abstract (Fehler, Mangel) vs. concrete (Verschmutzung, Gefahrenquelle). With these few examples we have identified one of the semantic dimensions which distinguish the collocating objects of beseitigen, i.e. unpleasant (abstract) states and (concrete) objects.

Currently, we have encoded 9,400 instances of the new relations: 1,800 Arg1 relations and 7,600 Arg2 relations. We have not yet performed this abstraction step on a broader scale, but we are optimistic that such a step can be based on the syntagmatic relation data we have acquired and encoded in GermaNet.
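The abstraction step from encoded collocates to a semantic preference can be sketched as a simple grouping by semantic feature; the feature assignments below are illustrative, following the beseitigen example.

```python
# Sketch of the abstraction step from encoded Arg2 collocates to a
# semantic preference: group the collocates of a verb by a shared
# semantic feature. The feature dictionary is illustrative.

FEATURES = {"Fehler": "abstract", "Mangel": "abstract",
            "Verschmutzung": "concrete", "Gefahrenquelle": "concrete"}

def semantic_preference(collocates):
    """Group collocates of a predicate by their semantic feature."""
    groups = {}
    for noun in collocates:
        groups.setdefault(FEATURES.get(noun, "unknown"), []).append(noun)
    return groups

# The four direct objects of beseitigen from Figure 9:
prefs = semantic_preference(
    ["Fehler", "Mangel", "Verschmutzung", "Gefahrenquelle"])
assert prefs["abstract"] == ["Fehler", "Mangel"]
assert prefs["concrete"] == ["Verschmutzung", "Gefahrenquelle"]
```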
References

Amaro, Raquel, Chaves, Rui Pedro, Marrafa, Palmira and Mendes, Sara. 2006. "Enriching wordnets with new relations and with event and argument structures." In Computational Linguistics and Intelligent Text Processing – 7th International Conference, CICLing-2006, LNCS 3378, Alexander Gelbukh (ed.), 28–40. Berlin/Heidelberg: Springer.
Bednarek, Monika and Bublitz, Wolfram. 2007. "Enjoy! – The (phraseological) culture of having fun." In Phraseology and Culture in English, Paul Skandera (ed.), 109–135. Berlin/New York: de Gruyter.
Bentivogli, Luisa and Pianta, Emanuele. 2003. "Extending WordNet with Syntagmatic Information." In Proceedings of the Second International WordNet Conference – GWC 2004, Masaryk University Brno, Czech Republic, Petr Sojka, Karel Pala, Pavel Smrz, Christiane Fellbaum and Piek Vossen (eds), 47–53.
Boyd-Graber, Jordan, Fellbaum, Christiane, Osherson, Daniel and Schapire, Robert. 2006. "Adding Dense, Weighted Connections to WordNet." In Proceedings of the Third International WordNet Conference, Jeju Island, Korea, Petr Sojka, Key Sun-Choi, Christiane Fellbaum and Piek Vossen (eds), 29–36.
Church, Kenneth, Gale, William, Hanks, Patrick and Hindle, Donald. 1991. "Using Statistics in Lexical Analysis." In Lexical acquisition: exploiting on-line resources to build a lexicon, Uri Zernik (ed.), 115–164. Hillsdale, NJ: Lawrence Erlbaum.
Cruse, Alan. 1986. Lexical Semantics. Cambridge: Cambridge University Press.
Dunning, Ted. 1993. "Accurate methods for the statistics of surprise and coincidence." Computational Linguistics 19(1): 61–74.
Fellbaum, Christiane. 1998. WordNet: An Electronic Lexical Database. Cambridge, MA: MIT Press.
Fellbaum, Christiane. 2007. Wordnets: Design, Contents, Limitations. http://dydan.rutgers.edu/Workshops/Semantics/slides/fellbaum.pdf.
Gurevych, Iryna. 2005. "Using the structure of a conceptual network in computing semantic relatedness." In Proceedings of the 2nd International Joint Conference on Natural Language Processing (IJCNLP'2005), Jeju Island, Republic of Korea, 767–778.
Gurevych, Iryna, Müller, Cristof and Zesch, Thorsten. 2007. "What to be? – electronic career guidance based on semantic relatedness." In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, 1032–1039. Association for Computational Linguistics.
Hamp, Birgit and Feldweg, Helmut. 1997. "GermaNet – a Lexical-Semantic Net for German." In Proceedings of the ACL/EACL-97 workshop on Automatic Information Extraction and Building of Lexical-Semantic Resources for NLP Applications, Madrid, Piek Vossen, Nicoletta Calzolari, Geert Adriaens, Antonio Sanfilippo and Yorick Wilks (eds), 9–15.
Kilgarriff, Adam. 1996. "Which words are particularly characteristic of a text? A survey of statistical approaches." In Proceedings of AISB Workshop on Language Engineering for Document Analysis and Recognition, 33–40. Falmer, Sussex.
Kunze, Claudia. 2004. "Lexikalisch-semantische Wortnetze." In Computerlinguistik und Sprachtechnologie: eine Einführung, Kai-Uwe Carstensen, Christian Ebert, Cornelia Endriss, Susanne Jekat and Ralf Klabunde (eds), 386–393. Heidelberg/Berlin: Spektrum Verlag.
Lemnitzer, Lothar and Kunze, Claudia. 2002. "Adapting GermaNet for the Web." In Proceedings of the First Global WordNet Conference, Central Institute of Indian Languages, 174–181. Mysore, India.
Lemnitzer, Lothar and Kunze, Claudia. 2007. Computerlexikographie. Tübingen: Gunter Narr Verlag.
Lemnitzer, Lothar, Wunsch, Holger and Gupta, Piklu. 2008. "Enriching GermaNet with verb-noun relations – a case study of lexical acquisition." In Proceedings of LREC 2008, Marrakech, Morocco.
Miller, George. 1990. "Special Issue: WordNet – An on-line lexical database." International Journal of Lexicography 3(4).
Morato, Jorge, Marzal, Miguel, Llorens, Juan and Moreiro, José. 2003. "WordNet Applications." In Proceedings of the Second International WordNet Conference – GWC 2004, Masaryk University Brno, Czech Republic, Petr Sojka, Karel Pala, Pavel Smrz, Christiane Fellbaum and Piek Vossen (eds), 270–278.
Müller, Frank H. 2004. Stylebook for the Tübingen Partially Parsed Corpus of Written German (TüPP-D/Z). Sonderforschungsbereich 441, Seminar für Sprachwissenschaft, Universität Tübingen.
Quillian, M. Ross. 1966. Semantic Memory. Unpublished Ph.D. thesis, Carnegie Institute of Technology.
Rosch, Eleanor. 1978. "Principles of Categorization." In Cognition and Categorization, Eleanor Rosch and Barbara B. Lloyd (eds), 27–48. Hillsdale, NJ: Lawrence Erlbaum.
Schiller, Anne, Teufel, Simone, Thielen, Christine and Stöckert, Christine. 1999. Guidelines für das Taggen deutscher Textcorpora mit STTS (Kleines und großes Tagset). IMS, Universität Stuttgart. (http://www.ims.uni-stuttgart.de/projekte/corplex/TagSets/stts-1999.pdf)
Schmid, Helmut. 2004. "Efficient Parsing of Highly Ambiguous Context-Free Grammars with Bit Vectors." In Proceedings of the 20th International Conference on Computational Linguistics (COLING 2004), Geneva, Switzerland.
Schulte im Walde, Sabine. 2006. "Can Human Verb Associations Help Identify Salient Features for Semantic Verb Classification?" In Proceedings of the 10th Conference on Computational Natural Language Learning, New York City, NY.
Sinclair, John. 1996. "The search for units of meaning." Textus 59(IX), 75–106.
Snow, Rion, Jurafsky, Dan and Ng, Andrew. 2006. "Semantic taxonomy induction from heterogenous evidence." In Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the ACL, Morristown, NJ, 801–808. Association for Computational Linguistics.
Storrer, Angelika. 2006. "Funktionen von Nominalisierungsverbgefügen im Text. Eine korpusbasierte Fallstudie." In Von Intentionalität zur Bedeutung konventionalisierter Zeichen. Festschrift für Gisela Harras zum 65. Geburtstag, Kristel Proost and Edeltraud Winkler (eds), 147–178. Tübingen: Gunter Narr.
Tjong Kim Sang, Erik. 2007. "Extracting hypernym pairs from the web." In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics. Companion Volume: Proceedings of the Demo and Poster Sessions, Prague, Czech Republic, 165–168. Association for Computational Linguistics.
Tufiş, Dan, Cristea, Dan and Stamou, Sofia. 2004. "BalkaNet: Aims, Methods, Results and Perspectives." Romanian Journal of Information Science and Technology 7(1–2): 9–45.
Versley, Yannick. 2005. "Parser Evaluation across Text Types." In Proceedings of the Fourth Workshop on Treebanks and Linguistic Theories (TLT 2005), Barcelona, Spain.
Vossen, Piek. 1999. EuroWordNet: A multilingual database with lexical-semantic networks. Dordrecht: Kluwer Academic Publishers.
Wahrig, Gerhard, Krämer, Hildegard and Zimmermann, Harald. 1980–1984. Brockhaus-Wahrig deutsches Wörterbuch. 6 vols. Wiesbaden/Stuttgart: Deutsche Verlags-Anstalt.
Wehrle, Hugo and Eggers, Hans. 1989. Deutscher Wortschatz. Ein Wegweiser zum treffenden Ausdruck. Stuttgart: Ernst Klett.
Yamamoto, Eiko and Isahara, Hitoshi. 2007. "Extracting word sets with non-taxonomical relation." In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics. Companion Volume: Proceedings of the Demo and Poster Sessions, Prague, Czech Republic, 141–144. Association for Computational Linguistics.
Index
A abduction 117 agent classificatory assessment agent 121f. spontaneous-associative agent 121f. analysis of variance 31 anchor word 23 antonymy ancillary antonymy 50f., 60 antonym / antonymic pair 15ff., 49ff. antonymous groups 103, 112 comparative antonymy 52 coordinated antonymy 51, 55, 60f. direct antonyms 20 distinguished antonymy 52 extreme antonymy 52 goodness of antonymy 15ff., 21, 38f., 41f. idiomatic antonymy 52 indirect antonyms 20 interrogative antonymy 52 negated antonymy 52 transitional antonymy 52 association 24, 81f., 120f. associative link 166 B base concept see basic level concept basic level concept 169f. Belica, Cyril 124, 127, 130, 133, 138, 142 bidirectionality 26, 153 bidirectional evidence 36 bidirectional link 165 bidirectional relation 26, 146, 155 unidirectional evidence 36
British National Corpus see corpus C canonicity 21, 24ff., 33ff., 54 canonical antonyms 16ff., 20ff., 31ff., 44, 54f. canonical status 35 non-canonical antonyms 16f., 38, 40, 42, 54, 96 categories of being 118 causality 75ff. causal / causative relation 75ff., 166f. causation 75ff., 166 cause-effect 75ff. CCDB see corpus chunk 175 Church, Kenneth G. 11, 120, 176 cluster analysis 26ff. coarse-grained structure see structure Coco see corpus tool cognitive approach 7 language structures 119 mechanism / principle 70, 92f. processing 122 co-hyponymy 168 collocation 6, 115ff., 120ff., 127ff., 179f. see also co-occurrence collocational context 120 collocation algorithm 124 collocational profile 10, 72, 126ff. higher-order collocation 124f., 127, 130 statistical collocation 120, 124
communicative attitude 96f.
comparative function 53
complementarity 95, 102ff.
computational linguistics 10, 163
concept 73ff., 101, 164ff.
conceptual
  closeness 79, 82
  domain 16, 102
  implication 74ff., 84, 90f.
  modelling 165
  operation 75
  relation 72, 76, 165f., 171
conceptualisation 73ff.
concordance 56f., 61, 64
concordancer see corpus tool
concgram 124
condition 78ff., 103
conditionality 78ff.
  conditional relation 79
connotation 118ff., 133
consistency 145ff.
construal 7, 16, 71
construction 7, 49ff., 60ff., 69f., 85ff.
  ancillary construction 64
  construction grammar 61, 65
  dynamic contextual construction 73
  lexico-syntactic construction see frame
context 120ff.
  global context 120f., 124, 127, 129f., 130ff., 137ff., 141f.
  local context 121, 124, 127
  situational context 120
contrariety 112
control group 17
co-occurrence 7, 19ff., 49ff., 59f., 115ff., 124ff., 179 see also collocation
  sentential co-occurrence 19, 21, 27, 35, 44
coordinated framework / structure see frame
corpus 8ff., 15ff., 49ff., 71, 115ff., 124ff.
  British National Corpus 60
  CCDB 127ff.
  corpus-based 17f.
  corpus-driven 17f., 71f., 118ff.
  German Reference Corpus / DeReKo 127f.
corpus tool 9, 19ff., 56, 64, 127ff.
  concordancer 56f., 64
  Coco 19ff.
cross-reference 152ff.
Cruse, Alan D. 6f., 15f., 69ff., 73, 75, 77, 82, 89, 93, 95, 102, 168

D
data
  data-driven 20
  data model 152ff., 169
  data structure 152ff.
database 50, 56, 71, 155f., 165, 167, 169
denotation 118, 120
DeReKo see corpus
deviation 33f., 39
dichotomy 16f., 39
  dichotomous view 39
dictionary 9f., 21, 96, 145ff.
  corpus-based dictionary 147, 153
  editing system 159
  electronic dictionary 152ff.
  elexiko 145, 147ff., 160ff.
  inconsistency 144ff.
  lexicographic database / resource 155ff., 165ff.
  lexicographic practice 9f., 145ff.
  machine-tractable dictionary 163
  printed dictionary 147f., 151
  usage 146ff.
discourse 49ff., 85ff., 119ff., 127
  function 50ff., 85ff.
  situation 110f.
domain ontology 169
DTD 153f.

E
elexiko see dictionary
elicitation test 17, 23ff.
Elman, Jeffrey L. 119, 123
emergence 118f., 123
  emergentist perspective 118f.
emotive attitude 98
empiricism
  empiricist perspective 116
entailment see implication
entrenchment 118
epistemic attitude 96, 98f., 105
EuroWordNet see wordnet
evaluation 101, 106ff., 110f.
  evaluative attitude 98
event type see speech act
evidence-based theory 117
extended link see text-technology

F
factivity 99
falsification 117
Fellbaum, Christiane 15, 85, 163f.
field
  lexical-semantic field 5f., 96, 101ff., 150, 167, 169f.
  topological fields 175
fine-grained structure see structure
Firth, John R. 119f.
frame 50ff., 85ff.
  co-ordinated frame 51ff., 86ff.
  lexico-semantic frame 51, 85ff.
  lexico-syntactic frame 54, 65, 85ff.
  subordinated frame 89ff.
  syntactic frame / sequence 85ff.
fuzzy similarity 133

G
GermaNet see wordnet
general resource situation type see speech act
generalisation 122
gradability 95ff.
  gradable antonyms 95ff., 102ff.
Gross, Derek 16f., 54, 65
Gruber, Tom 116

H
Harras, Gisela 96f., 111
Herrmann, Douglas J. 17, 21ff., 35, 38f., 41, 54, 65
hierarchical structure 27, 168
high-dimensional space 128f.
Hoey, Michael 119f.
Hofmann, Thomas R. 95, 102
homonymy 164
Hopper, Paul J. 118f.
hypo-/hyperonymy 75, 82f., 166, 172
  hyperonym 75, 82f., 166, 170
  hyponym 167f., 170

I
implication 71ff., 166
  entailment 72, 75ff., 95, 102, 166
  unilateral entailment 77
inconsistency see dictionary
indeterminacy 77, 80
induction 117
information retrieval 163, 171ff.
institutional setting 101
interrelation 80, 89
interrogative function 53

J
Jones, Steven 8, 17, 49ff., 65, 69, 85, 87f.
judgement experiment 15ff., 28ff.
Justeson, John S. 17, 19, 49, 54, 85

K
Katz, Slava M. 17, 19, 49, 54, 85
Keibel, Holger 118, 121, 124, 127, 130, 133
knowledge
  conceptual knowledge 16, 73ff.
  encyclopaedic knowledge 7
  implicit knowledge 73ff., 120
  metalinguistic knowledge 41, 73
Kupietz, Marc 118, 121, 124

L
Lang, Ewald 95, 102, 112
language
  competence 117, 123
  convention 119, 122
  experience 15ff., 71ff., 118ff., 133, 137, 140
  use 8, 18, 49ff., 69ff., 115ff.
Lehrer, Adrienne 37, 95, 102, 168
lexical
  association 24
  field see field
lexicalisation 72f., 96f., 101, 103, 106f., 132
lexico-semantic frames see frame
linkbase (link database) 155ff.
Löbner, Sebastian 95, 102
logical relation 70, 78f.
Lutzeier, Peter R. 6, 70, 111
Lyons, John 6, 15, 70, 77, 102

M
markedness theory 37f.
meaning
  connotational meaning 118ff., 133
  construed meaning 119
  context-dependent meaning 118
  denotational meaning 118, 120
  representation of meaning 115f., 123
  theory of meaning 115f., 120, 180
meaning equivalent see synonymy
mental lexicon 7, 10, 179
mental representation 71 see also representation
meronymy 166
metaphorisation 16
metonymisation 16
Mettinger, Arthur 54, 96
Miller, George A. 54, 163f.
Miller, Katherine J. 16f.
modelling 145ff., 155, 159, 165, 173
  concept 155, 159
  parsing model 175
de Mönnink, Inge 18, 39f.
Murphy, M. Lynne 8, 15ff., 49ff., 61f., 65f., 69f., 72, 83, 85, 93
N
natural language processing 10, 163f., 171f.
near synonyms see synonymy
negation 107ff.

O
online resource 148, 163
  CCDB see corpus
  DeReKo see corpus
  elexiko see dictionary
  lexical knowledge base 163
  lexical-semantic resource 163ff., 171ff.
ontology 115ff., 166, 169f.
  domain ontology 169f.
  top ontology 169f.
opposition see also antonymy
  binary opposition 51
  good opposites 16, 38
  word-internal oppositeness 110f.
overlapping semantic range 20

P
Paradis, Carita 5, 8, 10, 16, 18, 35, 42, 49f., 53, 55f., 65, 119
parse tree 174f.
Partington, Alan 10, 69, 82, 91, 120
pattern see frame
perception 71, 73ff., 121
plesionymy see synonymy
polysemy 133, 136
pragmatic function 50
predicate 173, 175, 180
  re-reactive predicate 100
presupposition 96ff., 108ff.
priming 30, 55, 62ff., 141
Proost, Kristel 95f., 103
proposition 83, 96, 111, 118
propositional
  attitude 96ff., 105ff.
  content 96ff., 101, 104ff.
prototypicality effect 16
psycholinguistic 23ff., 34ff.
  approach 69
  experimental data 18
  technique 34
purpose-orientation 75, 81f., 92

Q
quantitative linguistics 130

R
reference 74, 80, 83
  corpus see DeReKo / corpus
  database see GermaNet
  link reference 151ff.
  structure see text-technology
  temporal reference 97f., 101, 104, 106, 108ff.
relation
  syntagmatic relation 6, 171ff., 180
  taxonomic relation 172
  thematic relation 173
relational coercion 61
repeated measures 31, 33
representation 7, 71, 85, 92f., 101, 164f.
response word 24ff., 35, 41 see also stimulus word

S
scale 17, 24ff., 37f., 51f., 54f., 59, 70, 95, 102
self-organising lexical feature maps / SOMs 133
semantic
  categorisation 42, 165ff.
  dimension 35
  element / feature 8, 70, 96ff., 118, 180
  field see also field
  inclusion 75, 77
  scale 51f., 54
sense disambiguation 123, 163f., 171, 178f.
sequencing 30
significance level 19
similarity 69ff., 121ff., 129ff. see also synonymy
Sinclair, John 119f., 180
situational roles 110f.
situation type see speech act
speaker's intention 96f., 99, 104ff., 109
speech act
  general resource situation type 96f.
  event type 97f., 101, 104, 106, 108ff.
  situation type 96f., 104, 106ff.
  speech act verb 95ff., 99ff., 106ff.
stimulus word 23ff., 35ff., 45, 55f.
Storjohann, Petra 10, 49, 65, 69, 85, 137, 149
structure
  coarse-grained structure 116f., 122f.
  fine-grained structure 116f., 122f., 133
structuralism 5f., 70
superordination see hyperonymy
synonymy 68ff., 130ff.
  coordinated synonymy 85f.
  meaning equivalents 69ff., 84ff., 136ff., 147f.
  near-synonyms 91, 136ff.
  part-of-whole synonymy 75, 84
  plesionymy 137 see also near-synonyms
  subordinated synonymy 85, 89
  synonym cluster 85, 89
synset 164ff.
systematisation, systematicity 121, 123

T
taxonomic structure 133
Taylor, John 91, 137
template see frame
text-technology
  extended link 157f.
  reference structure 159
  target resource 155ff., 159
  traversal rule 157f.
  XLink 154ff.
  XML-structure 152ff.
theory of meaning see meaning
U
usage-based 7, 54, 71, 115, 118f.

V
Vachková, Marie 121, 133, 138, 142
value
  descriptive value 116
  explanatory value 117
  functional value 116

W
web-as-corpus approach 54
Willners, Caroline 10, 16f., 19f., 49f., 53, 55f.
Winkler, Edeltraud 96, 182
wordnet
  EuroWordNet 169f.
  GermaNet 10, 161ff., 171f.
  language-specific wordnets 169f.
  WordNet 10, 163ff., 169f., 172

X
XLink see text-technology
XML see text-technology
In the series Lingvisticæ Investigationes Supplementa the following titles have been published thus far or are scheduled for publication:
28 Storjohann, Petra (ed.): Lexical-Semantic Relations. Theoretical and practical perspectives. 2010. viii, 188 pp.
27 Fradin, Bernard: La raison morphologique. Hommage à la mémoire de Danielle Corbin. 2008. xiii, 242 pp.
26 Floricic, Franck (dir.): La négation dans les langues romanes. 2007. xii, 229 pp.
25 Bat-Zeev Shyldkrot, Hava et Nicole Le Querler (dir.): Les Périphrases Verbales. 2005. viii, 521 pp.
24 Leclère, Christian, Éric Laporte, Mireille Piot and Max Silberztein (eds.): Lexique, Syntaxe et Lexique-Grammaire / Syntax, Lexis & Lexicon-Grammar. Papers in honour of Maurice Gross. 2004. xxii, 659 pp.
23 Blanco, Xavier, Pierre-André Buvet et Zoé Gavriilidou (dir.): Détermination et Formalisation. 2001. xii, 345 pp.
22 Salkoff, Morris: A French-English Grammar. A contrastive grammar on translational principles. 1999. xvi, 342 pp.
21 Nam, Jee-Sun: Classification Syntaxique des Constructions Adjectivales en Coréen. 1996. xxvi, 560 pp.
20 Bat-Zeev Shyldkrot, Hava et Lucien Kupferman (dir.): Tendances Récentes en Linguistique Française et Générale. Volume dédié à David Gaatone. 1995. xvi, 409 pp.
19 Fuchs, Catherine and Bernard Victorri (eds.): Continuity in Linguistic Semantics. 1994. iv, 255 pp.
18 Picone, Michael D.: Anglicisms, Neologisms and Dynamic French. 1996. xii, 462 pp.
17 Labelle, Jacques et Christian Leclère (dir.): Lexiques-Grammaires comparés en français. Actes du colloque international de Montréal (3–5 juin 1992). 1995. 217 pp.
16 Verluyten, S. Paul (dir.): La phonologie du schwa français. 1988. vi, 202 pp.
15 Lehrberger, John and Laurent Bourbeau: Machine Translation. Linguistic characteristics of MT systems and general methodology of evaluation. 1988. viii, 240 pp.
14 Subirats-Rüggeberg, Carlos: Sentential Complementation in Spanish. A lexico-grammatical study of three classes of verbs. 1987. xii, 290 pp.
13 Vergnaud, Jean-Roger: Dépendance et niveaux de représentation en syntaxe. 1985. xvi, 372 pp.
12 Hong, Chai-Song: Syntaxe des verbes de mouvement en coréen contemporain. 1985. xv, 309 pp.
11 Lamiroy, Béatrice: Les verbes de mouvement en français et en espagnol. Etude comparée de leurs infinitives. 1983. xiv, 323 pp.
10 Zwanenburg, Wiecher: Productivité morphologique et emprunt. 1983. x, 199 pp.
9 Guillet, Alain et Nunzio La Fauci (dir.): Lexique-Grammaire des langues romanes. Actes du 1er colloque européen sur la grammaire et le lexique comparés des langues romanes, Palerme, 1981. 1984. xiii, 319 + 58 pp. Tables.
8 Attal, Pierre et Claude Muller (dir.): De la Syntaxe à la Pragmatique. Actes du Colloque de Rennes, Université de Haute-Bretagne. 1984. 389 pp.
7 Taken from program.
6 Lightner, Ted: Introduction to English Derivational Morphology. 1983. xxxviii, 533 pp.
5 Paillet, Jean-Pierre and André Dugas: Approaches to Syntax. (English translation from the French original edition 'Principes d'analyse syntaxique', Québec, 1973). 1982. viii, 282 pp.
4 Love, Nigel: Generative Phonology: A Case Study from French. 1981. viii, 241 pp.
3 Parret, Herman, Leo Apostel, Paul Gochet, Maurice Van Overbeke, Oswald Ducrot, Liliane Tasmowski-De Ryck, Norbert Dittmar et Wolfgang Wildgen: Le Langage en Contexte. Etudes philosophiques et linguistiques de pragmatique. Sous la direction de Herman Parret. 1980. iv, 790 pp.
2 Salkoff, Morris: Analyse syntaxique du Français-Grammaire en chaîne. 1980. xvi, 334 pp.
1 Foley, James: Theoretical Morphology of the French Verb. 1979. iv, 292 pp.