Language Learning & Language Teaching (LL&LT)

The LL&LT monograph series publishes monographs, edited volumes and text books on applied and methodological issues in the field of language pedagogy. The focus of the series is on subjects such as classroom discourse and interaction; language diversity in educational settings; bilingual education; language testing and language assessment; teaching methods and teaching performance; learning trajectories in second language acquisition; and written language learning in educational settings.
Editors
Nina Spada, Ontario Institute for Studies in Education, University of Toronto
Nelleke Van Deusen-Scholl, Center for Language Study, Yale University
Volume 24
Connected Words. Word associations and second language vocabulary acquisition
by Paul Meara
Connected Words
Word associations and second language vocabulary acquisition

Paul Meara
Swansea University
John Benjamins Publishing Company Amsterdam / Philadelphia
The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ANSI Z39.48-1984.
Library of Congress Cataloging-in-Publication Data

Meara, P. M. (Paul M.)
Connected words : word associations and second language vocabulary acquisition / Paul Meara.
p. cm. (Language Learning & Language Teaching, ISSN 1569-9471 ; v. 24)
Includes bibliographical references and index.
1. Second language acquisition--Study and teaching. 2. Vocabulary--Study and teaching. 3. Language and languages--Study and teaching. I. Title.
P118.2.M423 2009
418.0071--dc22
2009019814
ISBN 978 90 272 1986 2 (hb; alk. paper) / ISBN 978 90 272 1987 9 (pb; alk. paper)
ISBN 978 90 272 8907 0 (eb)
chapter 3 Lex30: An improved method of assessing productive vocabulary in an L2
chapter 4 Exploring the validity of a test of productive vocabulary
section 3. Word association networks
chapter 5 Network structures and vocabulary acquisition in a foreign language
chapter 6 V_Links: Beyond vocabulary depth
chapter 7 A further note on simulating word association behaviour in an L2
section 4. Bibliographical resources for word associations in an L2
chapter 8 Word associations in a second language: An annotated bibliography
section 5. Software applications
chapter 9 Lex30 v3.00: The manual
chapter 10 V_Six v1.00: The manual
chapter 11 WA_Sorter: The manual
References
Index
Acknowledgements

Chapter 1 first appeared in Interlanguage Studies Bulletin – Utrecht 3(2): 192–211, 1978. Chapter 2 first appeared in Nottingham Linguistics Circular 11: 28–38, 1983. Chapter 3 and Chapter 4 were co-authored with Tess Fitzpatrick, and appeared respectively in System 28(1): 19–30, 2000 and in Vigo International Journal of Applied Linguistics 1: 55–74, 2004. Chapter 5 first appeared in: P.J.L. Arnaud & H. Béjoint (Eds). 1992. Vocabulary and Applied Linguistics. London: Macmillan. Chapter 6 was co-authored by Brent Wolter, and appeared in Angles on the English Speaking World 4: 85–97, 2005. Chapter 8 was co-authored with Brent Wolter and Clarissa Wilks and first appeared in Second Language Research 21(4): 359–372, 2005.
Introduction

Connecting words

Applied linguistics is a curious discipline, one that seems particularly badly afflicted with band-wagon research. Every now and then someone produces an especially insightful paper, and within a couple of years almost everyone else has abandoned the research they were previously working on to follow up these new ideas. The problem then is that once the original band-wagon stalls, and another band-wagon appears, people are only too ready to move on to the new topic. One consequence of this is that there is a huge quantity of "research" which does very little to move the field forward in any real sense. Hardly anyone looks at the fundamental assumptions underlying the current band-wagon, hardly anyone asks critical questions about the methodologies that the current band-wagon depends on, and few people are willing to invest much time and effort into developing better methodologies that support these critical questions.

This volume contains a set of papers that deal with word associations in a foreign language. I first started working in this area back in the 1970s, when the psycholinguistics of foreign language speakers was a severely under-researched field. At the time, most applied linguists still looked towards theoretical linguistics as their source discipline, and most of the interesting ideas that were being discussed still relied very heavily on exciting theories about the nature of language that were being developed by linguists. This meant that there was a strong emphasis on formal aspects of second language production, but only a handful of people were interested in the processes which underpinned these productions. This emphasis can be seen very clearly in the 1984 volume edited by Davies, Criper & Howatt, which marked Pit Corder's retirement as head of the Department of Applied Linguistics in Edinburgh.
Selinker’s summary paper in that volume (Selinker 1984) identified nine issues which had emerged in the conference, and which he felt were central to the enterprise of interlanguage studies: methodology (by which he meant the use of intuitions and judgements about grammaticality), language transfer (the influence of L1 structural features on L2 output), fossilization (the way some L2 structures persist even when they are grammatically incorrect by Native Speaker standards), the Universal hypothesis, Universal Grammar (the assumption that learners have a grammar in their heads), Interlanguage Strategies (specifically the distinction between communication strategies and learning strategies), Interlanguage Discourse (specifically the impact of classroom discourse on interlanguage
development), and Context in Interlanguage Studies (by which he seems to mean the impact of special purposes environments on language acquisition). My own contribution to that volume was unusual in that it was the only piece that identified second language vocabulary as an area of importance – and it got some criticism for doing so. At the time, this didn’t surprise me – after all, it was not long since the whole question of L2 lexical competence had been dismissed as uninteresting by the major figures in the field. Hockett’s assertion that “there is no point in learning large numbers of (words) until one knows what to do with them ... The acquisition of new vocabulary hardly requires formal instruction” was widely accepted as a non-negotiable premise (Hockett 1958:266). More recently, Canale and Swain’s seminal paper on communicative competence, which was to define the dominant paradigm in SLA for many years, had reduced vocabulary knowledge to a very minor role in grammar competence (Canale & Swain 1980). With hindsight, however, it is perhaps more of a surprise that so few people were taking vocabulary acquisition seriously. This, after all, was the heyday of Verbal Learning – a vast area of psychological research, which dealt entirely with words, how we learn them and how we use them, and what we can learn about memory and cognition by studying the way people handle words. I suppose that verbal learning had a bad press with linguists because it was linked in many people’s minds to behaviourism. Everyone knew that behaviourism had been definitively rubbished by Chomsky in his review of B.F. Skinner’s Verbal Behavior (Chomsky 1959) – a review which was often quoted but rarely read. A few die-hard psychologists still clung stubbornly to behaviorist views, but most linguists believed that these ideas had little to offer to linguistics. This view of the verbal behaviour movement was, of course, a travesty of what the work really involved. 
True, a lot of it did involve people learning lists of words, and a lot of it even involved people learning long lists of nonsense words. But this work was actually a lot more interesting than linguists made it out to be. Some of it was extremely sophisticated, making use of complex mathematical models that were rarely used by linguists. Some of it was not so sophisticated, but the sheer volume of work that was carried out, the large number of variables that were studied, and the huge range of problems that the results were applied to meant that verbal learning and verbal behaviour formed a compulsory core element in the training of academic psychologists in a way that it never did for linguists. A good example of this divergence can be found in Crothers and Suppes' (1967) book, which presented a coherent and mathematically sophisticated model of the way L2 learners might acquire vocabulary from word lists. As far as I know, this work was not reviewed in any of the major journals in Applied Linguistics at the time of its publication, and even today, it is only rarely mentioned by the main researchers in vocabulary acquisition. Nation (2001), for example, summarises this text in three short sentences, while Singleton (1999) and Schmitt (2000) do not mention it at all.
The study of word associations was one small part of this vast enterprise, but even so, it managed to attract significant figures. James Deese's work was particularly important (e.g., Deese 1965), and David Palermo and James Jenkins published standard lists of word association norms which had an enormous influence on the kinds of research people did in psycholinguistics (e.g., Palermo & Jenkins 1964). Word associations turned out to have implications for the study of memory (Bousfield 1953), child language acquisition (Brown & Berko 1960), cognitive and behavioural disorders (Rapaport, Gill & Schafer 1968), language loss (Lesser 1974), cross-cultural psychology (Szalay & Deese 1978) and bilingualism (Lambert 1955, 1956), to name but a few of the many applications which appeared at this time. Wallace Lambert's work was particularly interesting, because it hinted that word associations might be used as a way of assessing the overall language ability of L2 speakers. Lambert was mainly working with bilinguals in Canada – that is, with speakers of French and English, whose competence in both languages appeared to be relatively high. It was obvious, though, that the methodologies Lambert was using could be used with lower level subjects too, and might throw some light on the way vocabularies developed in more traditional L2 learners. This was not a new idea, of course. Klaus Riegel had already carried out some large-scale pioneering work with learners of German and Spanish as an L2, and by 1972, he had already established the main features of L2 word associations – that they were much less stable than L1 associations, that groups of L2 learners exhibited relatively low levels of associational stereotypy, and that associations broadly became more native-speaker-like as learners became more proficient in the L2. However, this line of research seemed to stop following Riegel's early death.
The only other influential piece of work which dates from this time was a paper by Politzer (1978), which suggested that vocabulary development in L2 learners parallels what happens in children learning an L1. Ervin (1961) had shown that young L1 speakers have a tendency to produce syntagmatic associations, but when they get to seven or eight years of age, this tendency drops off, and older children are more likely to produce paradigmatic associations – at least to high-frequency stimulus words. Politzer argued (incorrectly in my view) that a similar shift occurs in L2 speakers. Surprisingly, this claim came to dominate L2 word association work (e.g., Söderman 1989). My own early work on word associations was very much in the style of Palermo and Jenkins. Later on, I managed to persuade some of my Master's students to work on word associations too, and this enabled me to look critically at some of the methodological issues in word association research with L2 learners. The question of stability began to emerge as a crucial issue in these projects. When we ran repeated tests with L2 speakers, we found that the data they produced varied enormously from one administration to another, in a way which was not at all characteristic of L1 speakers. This feature of L2 word associations made it difficult
to interpret the raw data we were getting from our experimental work, and made it difficult to take some of the standard reports at face value. It also became clear that the "standard" word list that everybody used for word association research, the Kent-Rosanoff list – reproduced at the end of this chapter – was perhaps not the best stimulus list to use with non-native speakers. This list, which dated from 1910, had been widely used by psychologists, translated into many languages (e.g., Rosenzweig 1961), and underpinned the many lists of word association norms which had become available in the 1970s (e.g., Postman & Keppel 1970). I toyed briefly with the idea of developing an alternative list, but the difficulty of getting large enough numbers of subjects to take part in the necessary evaluations soon put paid to this plan. Instead, we developed two different approaches which looked as though they might make good use of small subject numbers. The first of these was to collect multiple associations from learners, rather than single associations. We thought that this might get round the problem of instability in L2 word associations, and we also thought that we might be able to use clever weighting systems based on the published norms to show that learners' associations developed in systematic ways over time. Neither of these plans came to very much, mainly because the learners we had access to at the time had relatively small vocabularies, and this meant that they had difficulty producing more than two or three plausible associations for a given stimulus word. We also found that it was difficult to provide a plausible rationale for the weighting systems that we considered – a position which I later discovered to have been strenuously argued by Palermo & Jenkins (1964). Different weighting systems could give wildly different results, and the choice of one system over another seemed to be essentially unmotivated and arbitrary.
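The arbitrariness is easy to demonstrate. The sketch below is purely illustrative – the norm frequencies and the two scoring functions are invented for this example, not the systems we actually tried – but it shows how two equally defensible weighting schemes can rank the same pair of learners in opposite orders:

```python
# Hypothetical norms data: association responses to DARK with invented
# frequencies (these are not taken from any published norms list).
norms = {"dark": {"light": 430, "night": 220, "black": 120, "room": 15}}

def score_linear(stimulus, responses):
    """Weight each response by its raw frequency in the norms."""
    return sum(norms[stimulus].get(r, 0) for r in responses)

def score_top3(stimulus, responses):
    """Score one point per response among the three most common norms."""
    top3 = sorted(norms[stimulus], key=norms[stimulus].get, reverse=True)[:3]
    return sum(1 for r in responses if r in top3)

a = ["light", "room"]   # one very common response plus one rare one
b = ["night", "black"]  # two moderately common responses

# The two systems disagree about which learner looks more native-like:
print(score_linear("dark", a), score_linear("dark", b))  # a wins: 445 vs 340
print(score_top3("dark", a), score_top3("dark", b))      # b wins: 1 vs 2
```

With nothing to motivate one scheme over the other, the choice of weighting system effectively decides the result.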
Eventually, we were heavily influenced by another critical article from Sharwood Smith’s team in Utrecht (Kruse, Pankhurst & Sharwood Smith 1987), which seemed to show that there was no relationship at all between scores on a word association test and proficiency level in English as a second language, and questioned the very idea of using word associations as a proficiency measure. This paper appeared at a time when my own research was going particularly badly, and it convinced me that I didn’t have much of a future as a researcher. With hindsight, perhaps, I should have stuck to my guns and been a bit less overawed by colleagues who managed to get their work published in major journals, while my own rejected papers languished in drawers. Kruse et al. used only a tiny number of subjects, a handful of target words, a weighting system which was difficult to justify, and a multiple association task which required their subjects to produce up to twelve associations for each target word. These conditions were very different from the ones which my own group had been using, and we should not have been surprised that our own work was producing data which looked very different from what Kruse et al. were
reporting. In practice, however, we pretty much abandoned mainstream association work, in favour of some rather different methodologies. One of these methodologies was a backwards association task (Vives Boix 1995), in which we gave subjects a set of association responses and asked them to identify the original stimulus word that they were all associated to. This turned out to be a very difficult task, which often stumped even advanced learners of the L2, and since most of the subjects available to us at the time were relatively low level learners, we did not take this idea very far. A second method turned out to be more interesting. It involved asking people to generate association responses to sets of semantically related words, and recording how often a word within the set occurred as an association to one of the other words in the set. For native speakers, small sets of words of this sort would often turn out to be densely inter-related. For example, the set SLEEP DREAM PILLOW SNOOZE WAKE BED NIGHT tended to produce very dense networks of associations. The same set of words presented to non-native speakers generated much sparser response networks, in extreme cases with hardly any evidence of the kind of semantic clustering that we were finding with native speakers. The effect was very easy to reproduce. This made me wonder whether it might be possible to move away from the individual responses generated by L2 speakers as raw data, and look instead at the structural properties of their L2 networks. The beauty of this idea was that it would allow us to ignore the ephemeral and unstable features of L2 speakers' word associations and focus instead on deeper underlying structural properties of these lexical networks. At the time this seemed like an obvious and straightforward way to go, but it turned out to be much more difficult to implement than I had expected.
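The basic density count is simple to compute. The following sketch is illustrative only – the response data and the function are invented for this example, not our actual materials – but it captures the contrast between dense native-speaker networks and sparse learner networks:

```python
# The stimulus set from the example above.
SET_WORDS = {"sleep", "dream", "pillow", "snooze", "wake", "bed", "night"}

def network_density(responses):
    """Proportion of stimulus->response links that stay inside the set.

    responses: dict mapping each stimulus word to a list of association
    responses (pooled over one or more subjects).
    """
    in_set = 0
    total = 0
    for stimulus, words in responses.items():
        for w in words:
            total += 1
            if w in SET_WORDS and w != stimulus:
                in_set += 1
    return in_set / total if total else 0.0

# Invented native-speaker-like data: most responses stay inside the set.
l1_like = {
    "sleep": ["dream", "bed"], "dream": ["sleep", "night"],
    "pillow": ["bed", "sleep"], "snooze": ["sleep", "alarm"],
    "wake": ["sleep", "morning"], "bed": ["pillow", "night"],
    "night": ["sleep", "dark"],
}
# Invented learner-like data: responses scatter outside the set.
l2_like = {
    "sleep": ["tired", "rest"], "dream": ["nightmare", "think"],
    "pillow": ["soft", "head"], "snooze": ["alarm", "clock"],
    "wake": ["morning", "early"], "bed": ["room", "sleep"],
    "night": ["dark", "day"],
}

print(network_density(l1_like))  # dense: most links fall inside the set
print(network_density(l2_like))  # sparse: hardly any links inside the set
```

The attraction of a measure like this is that it depends on the overall shape of the response network, not on which particular word a subject happens to produce on a given day.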
The basic idea was that non-native speakers' lexical networks would be less well-developed than the networks found in native speakers, a claim which few researchers would disagree with. In practice, however, it has been very difficult to pin this idea down, and finding a robust experimental methodology which could exploit this way of thinking about vocabularies has turned out to be a surprisingly elusive goal. The papers in this collection are very far from the last word in word association research. They do, however, illustrate two important themes which I hope might be of value to young scholars and beginning researchers. The first theme is that doing research is not straightforward. People who study bibliometrics – the relationships between published works and their authors – have established a number of laws which describe the structure of a research field. One of the most important of these laws is Lotka's Law, which describes the number of publications generated by people working in a particular research field. The Law states that "the number of authors making N contributions is about 1/N² of those making one contribution" (Lotka 1926). What this means in practice is that more
than half of the papers in a particular field are generated by people who publish only one paper in that field, and the number of people who publish more than a handful of papers is relatively small. The L2 word association field is probably too limited for Lotka’s Law to apply accurately, but the Law gives you an idea of how much “research” is actually produced by people who have not grappled with the ideas for very long. One of the problems with research that gets done in this way is that it only rarely asks the right questions. Most one-off research tends to ask questions which are obvious and unsubtle. The subtle questions only become apparent when you work with a problem over a long period of time, and eventually realise that the questions you started off asking were not the ones you should have been asking at all. It is at this point that research gets really interesting, but much more difficult. Almost by definition, you find yourself working at the edge of your methodological and conceptual competence, and inevitably some of the things you find yourself saying turn out to be misconceived, or just plain wrong. Once you step away from the obvious research questions, it becomes much harder to explain to your colleagues what you are doing, and it becomes much more difficult to persuade them that what you are doing is useful and relevant. Every applied linguist knows about syntagmatic and paradigmatic word associations, for example, and people seem to find it reassuring and comforting to sit through conference papers which go over this old ground, perhaps providing a couple of useful illustrations that can be used in first year lectures on psycholinguistics. Complex analyses of these familiar problems are unsettling, and generate surprisingly stiff resistance. Nonetheless, the right questions to ask almost always turn out to be questions that other people simply can’t see the point of. 
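Lotka's Law is easy to check numerically. The sketch below is illustrative only: the field size is invented, and real fields only approximate the inverse-square constant.

```python
# Lotka's Law: the number of authors making n contributions to a field
# is roughly 1/n**2 of the number making exactly one contribution.
def author_counts(single_authors, max_n=10):
    """Expected number of authors, keyed by publication count."""
    return {n: single_authors / n ** 2 for n in range(1, max_n + 1)}

counts = author_counts(600)   # hypothetical field: 600 one-paper authors
total = sum(counts.values())

# Well over half of all authors contribute exactly one paper...
print(round(counts[1] / total, 2))
# ...and only a small minority contribute more than a handful (n > 5).
prolific = sum(v for n, v in counts.items() if n > 5)
print(round(prolific / total, 2))
```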
Finding these questions takes time and effort, and you may not recognise them when they appear. However, the longer you work in an area, and the more you worry its basic assumptions, the more likely you are to find the critical questions that are really worth answering. The second theme that arises in these papers is the importance of importing research methods from outside your own discipline. This shows up in three ways. Firstly, all the papers in this volume are heavily influenced by the work of psycholinguists. They use experimental and statistical methods which even now are infrequently used by applied linguists, who often seem to be uncomfortable with quantitative approaches to research. I think that this may be one reason why this work has not been as influential as it might have been, at least in the UK. (The way that “hard” psycholinguistic research is disappearing from undergraduate linguistics worries me a lot – I suspect that it will be seriously detrimental to research in Applied Linguistics in the future.) Secondly, some of the papers in this volume rely heavily on an abstract mathematical approach to the analysis of network structures called graph theory
(cf. Harary 1969). This methodology is widely used in the Social Sciences, and has had enormous influence on our understanding of the way networks function. Surprisingly, however, few linguists have tried to apply these methods to vocabulary networks. Most of the running here is again made by psycholinguists, e.g., Miller's WordNet project (Fellbaum 1998), or by computer scientists (Landauer & Dumais 1997; Ferrer-i-Cancho & Solé 2001), and again, these works are rarely cited by the main figures working on L2 vocabulary acquisition. The fact that Applied Linguists typically make little of developments of this sort is again a worrying feature of the discipline. The third interdisciplinary feature of the work reported in this volume is that a lot of it relies on computing. Over the many years that I have been working on word associations, a lot of my research time has been taken up by writing computer programs, and I am now a fairly proficient programmer in half a dozen languages. This investment has influenced my work in a number of ways. At the simplest level, most of the empirical data reported here was collected using computer programs which I wrote to standardise the data collection, and to facilitate the analysis of the data. A lot of research is basically drudgery and slog, but word association research is particularly bad in this respect. I do not think I could have done this work had I not been able to write simple computer programs that reduced what would have been weeks of hand-coding to a few minutes of processing time. More importantly, perhaps, some of the later papers in this volume make use of modelling and simulation techniques, which also rely on programming skills. Modelling of this kind is still in its infancy as far as Applied Linguistics is concerned, and it has not always been welcomed by the research community.
In my view, it is pretty obvious, even from the simplistic models developed here, that modelling is an enormously powerful tool which has huge potential for Applied Linguistics research, and I feel strongly that elementary computer programming should form an essential part of the training that young researchers in Applied Linguistics are provided with. The papers in this book illustrate the value of working on a problem for a long time, and attacking it from a number of different perspectives. Most research is fundamentally metaphorical, in the sense that it develops an analysis which describes some aspects of a problem in terms of other more familiar objects. In our case, the core metaphor seems to be the idea that vocabulary is a network, an idea that seems to be so blindingly obvious that it is easy to forget its metaphorical nature. What these papers show is that we are far from understanding precisely what assumptions the network metaphor brings with it, and that our understanding of the way second language vocabulary networks function is very far from complete. The rest of this book explores these ideas in more depth. The book is divided into five sections. Section 1 consists of two papers that first appeared in 1978
and 1983 respectively. They represent what you might call "classic" word association studies: in one paper I report a small collection of data from L2 learners of French, while in the second paper, I comment on some of the problems this type of approach presents for L2 researchers. Section 2 is a slightly different take on word associations. The papers in this section illustrate how word association data can be used to ask interesting questions about aspects of vocabulary knowledge which are not usually addressed in this way. The focus here is on L2 learners' productive vocabulary. This is a notoriously difficult area to work in, and the word association techniques described offer a methodologically innovative solution to this problem. The papers in this section were first published in 2000 and 2004 respectively. Section 3 contains three papers which explicitly explore the idea of a vocabulary as a network, and they involve a much more sophisticated approach to word association data than anything in Section 1 or Section 2. Chapter 5 is an early paper (1992) which introduces the basic idea of applying graph theory concepts to word association data. Chapter 6, which first appeared in 2004, describes some attempts to implement these ideas in a computerised test format. Chapter 7, a paper from 2005, illustrates a more ambitious approach in which I tried to model word association behaviour using simulation studies. Section 4 consists of a single, previously unpublished paper, which lists and summarises all the available work on word associations in a second language. Section 5 is in many ways the most innovative part of this book. It contains instruction manuals for a set of computer programs which I have developed as part of my ongoing work on word associations. These programs will allow readers to explore for themselves some of the many ideas discussed in this book. It has become something of a cliché to describe research as a journey.
When you set out it is by no means obvious how you are going to get to your journey’s end. When you get to the end of your journey, the place you arrive at is usually not the place that you expected to be in when you set out. Most often, the interesting happenings are the serendipitous and unexpected discoveries that you make on the way, the byways you explore en route, rather than what you expected to see. This book has explored lots of byways. I hope that the reader will find them as interesting as I have done.
Acknowledgements

A number of people have influenced the way I have thought about word associations, and their ideas will be found throughout the papers in this volume. I owe a particular debt of gratitude to Clarissa Wilks, Brent Wolter and Tess Fitzpatrick, all of whom taught me far more than they realise. Good colleagues all.
Downloads

Up-to-date versions of the computer programs described in Section 5 can all be downloaded from my website: http://www.lognostics.co.uk/
Appendix

The Kent-Rosanoff Word Association List

The Kent-Rosanoff list is a set of 100 words commonly used in studies of word associations. They first appeared in: G.H. Kent & A.J. Rosanoff. 1910. A study of association in insanity. American Journal of Insanity 67: 37–96 & 317–390.

  1 table        2 dark         3 music        4 sickness
  5 man          6 deep         7 soft         8 eating
  9 mountain    10 house       11 black       12 mutton
 13 comfort     14 hand        15 short       16 fruit
 17 butterfly   18 smooth      19 command     20 chair
 21 sweet       22 whistle     23 woman       24 cold
 25 slow        26 wish        27 river       28 white
 29 beautiful   30 window      31 rough       32 citizen
 33 foot        34 spider      35 needle      36 red
 37 sleep       38 anger       39 carpet      40 girl
 41 high        42 working     43 sour        44 earth
 45 trouble     46 soldier     47 cabbage     48 hard
 49 eagle       50 stomach     51 stem        52 lamp
 53 dream       54 yellow      55 bread       56 justice
 57 boy         58 light       59 health      60 bible
 61 memory      62 sheep       63 bath        64 cottage
 65 swift       66 blue        67 hungry      68 priest
 69 ocean       70 head        71 stove       72 long
 73 religion    74 whiskey     75 child       76 bitter
 77 hammer      78 thirsty     79 city        80 square
 81 butter      82 doctor      83 loud        84 thief
 85 lion        86 joy         87 bed         88 heavy
 89 tobacco     90 baby        91 moon        92 scissors
 93 quiet       94 green       95 salt        96 street
 97 king        98 cheese      99 blossom    100 afraid
section 1
Early work

The two papers in this section represent what we might broadly call "classic research" in word associations. Chapter 1 is a report of a set of word association data collected from a group of L1 English speakers learning French in school. Each subject contributed a single response to each of 100 stimulus words. The paper lists the primary, secondary and tertiary responses to each of these stimulus words, together with a complete set of responses to a small number of the stimulus words, and comments on some of the characteristics of this data which make it different from L1 data. Surprisingly, perhaps, this paper remains one of only a handful of studies which have looked at the development of lexical skills in low-level learners of French. With hindsight, the approach it adopted wasn't a particularly innovative or insightful way of going about things. It merely followed the standard methodology, transporting it into a new setting. The paper was very much influenced by Riegel and Zivian's pioneering work on learners of German (Riegel & Zivian 1972). It soon became clear, however, that there were a number of problems with the type of approach illustrated in this chapter, and some of these problems are discussed in Chapter 2. Again, with hindsight, it is easy to see that the main problem was a general lack of a theoretical framework which would make sense of the masses of data that word association studies generate. Listing the responses was easy, and followed the standard approach used by psycholinguists at the time (e.g., Postman & Keppel 1970). What proved more difficult was to tease out what made L2 word association responses different from L1 associations, and how you could use these findings to develop interesting claims about the way L2 vocabularies grow.
The main difference I identified was a surprisingly large number of responses that made absolutely no sense to native speakers, and a similarly large number of responses which could best be described as arising from misreadings of the stimulus words. This should have made me ask questions about the way word forms are coded in L2 lexicons, but I did not appreciate its significance at the time, and only picked up on this idea much later (Meara & Ingle 1986). The prevailing framework of analysis at the time was to classify responses as syntagmatic, paradigmatic or clang responses, following work done with young L1 speakers, and though it soon became obvious that this set of classifications was not the most
illuminating way of analysing L2 word association responses, it was not obvious what other classifications would have been better. There were also some serious issues with the standard stimulus word list, which seemed not to be responsive to the most interesting features of L2 lexicons. These niggling problems made me wonder about alternative ways of analysing word association data, which would be less dependent on the specific stimuli and specific responses, and more responsive to the overall structural properties of the learners' vocabularies. These ideas will re-emerge in Section 3. Meanwhile, the data generated by this study, and other studies that we were running alongside it, was extensive. 76 subjects each producing responses to 100 stimulus words amounted to 7600 data points. At this time, desk-top computers were still a far-off dream, so the data had to be stored in hard copy, and sorted and analysed by hand. It was a massive and mind-numbing task. Even carrying the data around was no mean feat. Everyone is familiar with the old adage about research being 1% inspiration and 99% perspiration, but the amount of perspiration involved in processing this data seemed to be excessive, and I started to look around for alternative ways of handling this amount of data. About this time, London University was beginning to offer training in elementary computer programming for staff, and I learned to write simple programs to process word association data. The programs were not difficult to write. There was a language called SNOBOL4 (Griswold, Poage & Polonsky 1971), which had very powerful string processing commands, and allowed you to write short programs that sorted out piles of data in no time at all. There were, however, two logistical problems. The first problem was that data needed to be stored on punch cards. These were thin pieces of card, which stored data in the form of holes punched in columns.
The cards had to be punched using an enormous machine – about the size of a small dinner table – and these machines were operated by a team of data inputters. The data I had to deal with was fairly interesting compared to the numerical data that these operators punched all day, but the level of errors generated by the inputters was still high, and the data needed to be checked rigorously. Sometimes it took several days to get a data set properly recorded. Each card could store 80 columns of data – about one line of text. In practical terms, that meant that you could just about store 10 words on a card, so if your subjects were generating responses to the standard 100 word list, then you needed 10 cards per subject. If you had 76 subjects, then you needed nearly a thousand punch cards to encode the entire data set. You could just about fit this much data into a small suit-case. The second problem was that there were only two data input points in London University and the nearest one to my office was a twenty-minute walk away. The drill was that you took your program and data cards to the input point, and left them to be processed by an operator. The operator ran them through a
Section 1. Early work
machine that read the punch cards. If you were lucky, and your program worked correctly, you could come back the next day and pick up your data. The chances of a 1000-card job running without a mis-feed were fairly remote, so you generally had to break big jobs down into smaller ones, and then recombine the results into a single report later. It was a lot of work. The arrival of desk-top computing has made it much easier to process large amounts of data. The suit-cases of data that I used to carry across London would now fit comfortably on a single floppy disk, and a job that would have taken me several days to complete can now be accomplished in a matter of minutes. Unfortunately, processing word association data is still a fiddly and frustrating job unless you can write your own programs. I have therefore developed a small suite of utility programs that researchers interested in working with word association data can use to collect and analyse their data. These programs, which take some of the effort out of sorting and analysing large word association data sets, are described in more detail in Section 5 of this book. In fact, the main spin-off from this early experience of writing programs was a long love affair with computers, which turned out to be a significant part of my development as a researcher. The skills I learned while I was processing word association data turned out to be immensely useful as desk-top computers became standard research equipment. They made me relatively independent of professional programmers (who were expensive to employ), and also independent of the fairly limited research packages which were available at the time (which severely constrained the types of research questions you could follow up). Eventually, I got to the stage where I was able to write applications for delivering tests and processing the data they generated. Some examples of this type of application are reported in Section 2. 
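The sorting that those early SNOBOL programs did is easy to reproduce today. The sketch below (modern Python rather than SNOBOL4, with an invented toy data set standing in for the 7,600 real data points) tallies the responses to each stimulus and ranks them by frequency, which is all that is needed to extract the most frequent responses from a raw data file.

```python
from collections import Counter

def tally_responses(records):
    """Rank the responses to each stimulus word by frequency.

    `records` is a list of (stimulus, response) pairs, one per subject
    per stimulus word. In each ranked list, index 0 is the most frequent
    (primary) response, index 1 the secondary, index 2 the tertiary.
    """
    by_stimulus = {}
    for stimulus, response in records:
        by_stimulus.setdefault(stimulus, Counter())[response] += 1
    return {s: [r for r, _ in c.most_common()]
            for s, c in by_stimulus.items()}

# Invented six-item data set, purely for illustration.
data = [("TABLE", "chair"), ("TABLE", "chair"), ("TABLE", "desk"),
        ("HOMME", "femme"), ("HOMME", "femme"), ("HOMME", "chien")]
ranked = tally_responses(data)
print(ranked["TABLE"])  # ['chair', 'desk']
```

A job of this kind runs over a full data set in a fraction of a second, where the punch-card version took days of turnaround.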
More importantly, perhaps, my programming skills allowed me to develop simulation programs with which it was possible to explore some of the ideas about global lexical competence which were beginning to emerge from these early studies on word associations. These ideas are discussed in more detail in Section 3.
chapter 1
Learners’ word associations in French

Introduction

A Word Association Test consists of a list of words which are presented one at a time. For each word in the list you have to write down or say aloud the first word that comes to your mind. For many people, tests of this sort are closely associated with psychoanalysis, and a popular image of them is that they are a key to our subconscious and innermost selves. Word associations are indeed used in psychoanalysis, and in a number of other clinical situations, but there is also a long and respectable history attached to the study of the word associations produced by people who are not disturbed in any way. In contrast with the popular image, the word associations of normal adults are very unrevealing about their subconscious selves, and they show a surprisingly high degree of unoriginality. Table 1 below contains a list of ten common words taken from one of the standard word association tests, the Kent-Rosanoff list (Kent & Rosanoff 1910). Read through the list quickly, and write down the first word that comes to mind for each word in the list. When you have done this, check your answers against Table 2.

Table 1. Ten stimulus words from the Kent-Rosanoff list

1: TABLE   ______________
2: MAN     ______________
3: SOFT    ______________
4: BLACK   ______________
5: HAND    ______________
6: SHORT   ______________
7: SLOW    ______________
8: NEEDLE  ______________
9: BREAD   ______________
10: BITTER ______________
Table 2 lists the most common responses to the words in Table 1, and you should find that most of your responses are to be found there. For common stimulus words, such as these, the associations that normal people make are in fact very predictable. Given TABLE, for example, 78% of respondents reply with chair; given MAN, 78% respond with woman; BLACK produces white 70% of the time; BREAD gives butter 56% of the time, and so on.
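The predictability of these responses can be given a simple numerical form. The sketch below (Python) scores a set of answers against a small norms table. The primary-response frequencies are the figures quoted above (chair 78%, woman 78%, white 70%, butter 56%); the minor-response figures, and the function and variable names, are invented for illustration.

```python
# Primary frequencies are those quoted in the text; the minor responses
# carry invented frequencies, purely for illustration.
NORMS = {
    "TABLE": {"chair": 0.78, "cloth": 0.03, "desk": 0.02},
    "MAN":   {"woman": 0.78, "boy": 0.04},
    "BLACK": {"white": 0.70, "night": 0.05},
    "BREAD": {"butter": 0.56, "jam": 0.06},
}

def stereotypy(answers):
    """Mean norm-frequency of a respondent's answers: the higher the
    score, the more 'unoriginal' the respondent's associations are."""
    scores = [NORMS[s].get(r, 0.0) for s, r in answers.items()]
    return sum(scores) / len(scores)

typical = {"TABLE": "chair", "MAN": "woman",
           "BLACK": "white", "BREAD": "butter"}
print(round(stereotypy(typical), 3))  # 0.705
```

A respondent who produces the four primary responses scores around 0.7; a respondent whose answers never appear in the norms scores 0.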
Table 2. Commonest responses to the stimulus words in Table 1

1: TABLE   chair   cloth    talk    desk
2: MAN     woman   dog      boy     child
3: SOFT    hard    cushion  light   bed
4: BLACK   white   night    cat     dark
5: HAND    foot    finger   glove   arm
6: SHORT   long    tall     fat     small
7: SLOW    fast    quick    train   snail
8: NEEDLE  thread  cotton   pin     eye
9: BREAD   butter  jam      cheese  food
10: BITTER sweet   lemon    beer    sour
Normal adults produce two main types of association, called syntagmatic and paradigmatic associations. Syntagmatic associations are associations that complete a phrase (syntagm) and some typical responses of this sort are shown below:
BRUSH ~ teeth
HOLD ~ hands
BLACK ~ mark
BANK ~ robber
Paradigmatic associations are ones in which the stimulus word and the response that it evokes both belong to the same part of speech, nouns evoking nouns, verbs evoking verbs, and so on. In these cases, the two words usually share a large part of their meaning, and both stimulus and response can usually occur in the majority of contexts where the other appears. Typical paradigmatic responses are:
MAN ~ woman (meaning identical except for sex)
BOY ~ girl (meaning identical except for sex)
FATHER ~ son (different views of the same relationship)
HOT ~ cold (polar opposite adjectives)
TREE ~ bush (both plants of a woody kind)
An association such as MAN ~ snail would technically be classed as a paradigmatic association, but responses of this sort, where the two words are not closely related semantically, are rather uncommon. Normal adults tend to produce more paradigmatic responses than syntagmatic ones, provided the stimulus words are reasonably common. Less frequent words, which tend to occur in more constrained contexts, are more likely to produce syntagmatic responses. Children under seven years of age have a strong tendency to produce syntagmatic responses as a first preference to any word. They also tend to produce a large number of so called “clang associates” – associations where the
response is heavily influenced by the form of the stimulus word rather than its meaning: rhyming responses, for example, or responses that share the stimulus word’s initial sounds. Responses of this type are rare in normal adults, though they frequently occur in some types of mental illness, and under the influence of drugs.
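The three-way scheme just described can be caricatured in code. The sketch below (Python) is a deliberately crude illustration, not a serious classifier: the tiny part-of-speech lexicon and the two-letter onset/ending test for clang responses are invented for the example, and studies of this kind classified responses by hand.

```python
# Invented mini-lexicon: part-of-speech tags for a handful of words.
POS = {"man": "n", "woman": "n", "black": "adj", "white": "adj",
       "mark": "n", "teeth": "n", "brush": "n"}

def classify(stimulus, response):
    s, r = stimulus.lower(), response.lower()
    # Paradigmatic: same form class (and, typically, related meaning).
    if POS.get(s) is not None and POS.get(s) == POS.get(r):
        return "paradigmatic"
    # Clang: a purely form-based link (crude shared onset or ending).
    if s[:2] == r[:2] or s[-2:] == r[-2:]:
        return "clang"
    # Everything else is treated as phrase-completing, i.e. syntagmatic.
    return "syntagmatic"

print(classify("MAN", "woman"))   # paradigmatic
print(classify("BLACK", "back"))  # clang
print(classify("BLACK", "mark"))  # syntagmatic
```

Even this toy version shows why the distinction is hard to operationalise: a form-based match and a semantic match can easily coincide, and only the respondent knows which link was actually active.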
The study

The associations reported in this paper are those of 76 girls learning French in two London comprehensive schools. All the girls were preparing for the O-Level examination in French, and were tested at the beginning of their final year of study. The girls were each given a list of 100 French words and were asked to write down beside each one the first French word that it made them think of. The words were Rosenzweig’s (1957) translation of the standard Kent-Rosanoff list. This list is made up of high frequency words which students at this level would be expected to know. All but seven of the words are contained in either the premier or the deuxième degré of the français fondamental list (Gougenheim et al. 1956). The complete list will be found in the tables that follow.

There are a number of reasons why it is interesting to look at the word association patterns of a group of students who are moderately proficient in a foreign language, but who have not yet achieved any real degree of fluency. Firstly, most of the work on the psychology of foreign language learning has concentrated on syntactic aspects of acquiring a new language. Hardly anyone has looked at what happens to foreign language words in the early stages of their acquisition, although learners themselves often identify vocabulary as a major problem area. It seems important that this neglect should not be allowed to continue. Secondly, the work on syntactic aspects of foreign language acquisition has suggested that there are a number of interesting parallels between learners and children acquiring their first language.
It would be interesting to know whether these parallels also extend to vocabulary, and in particular, it would be interesting to know whether there is any tendency for learners to produce the syntagmatic responses and clang associates that are characteristic of young children, or whether they produce typically adult responses from very early on in the learning process. Thirdly, there is the problem of how foreign words are stored in the learner’s mental lexicon. Are they organised into semantic networks that are quite separate from the native language lexicon, or
do learners merely tag their French words onto their native language equivalents? If the latter were the case, one would expect to find that a large proportion of the associations produced by learners were merely translations of the normal English responses to the equivalent English stimulus word. If the learners were building independent lexicons for the two languages, then one would expect to find systematic differences between learners’ responses in English and French. The word associations produced by native French speakers are broadly comparable with those of native English speakers. Both groups produce a high proportion of paradigmatic responses, and in many cases the most common responses are very similar in both languages. In other cases, either for cultural reasons, or because there is a mismatch between the French and English lexicons, the principal responses in the two languages are quite different. Some examples are given in Table 3 below.

Table 3. The most common responses in English and French to twelve words from the Kent-Rosanoff list

DEEP: shallow, sea, water               PROFOND: creux, mer, puits
MOUNTAIN: hill, valley, snow            MONTAGNE: neige, plaine, mer
HOUSE: home, garden, door               MAISON: toit, foyer, porte
BUTTERFLY: moth, wing, net              PAPILLON: fleur, aile, couleur
SWEET: sour, sugar, bitter              DOUX: dur, mou, agréable
EARTH: soil, sky, ground                TERRE: mer, ciel, ronde
SOLDIER: sailor, army, uniform          SOLDAT: guerre, plomb, armée
STOMACH: food, ache, pain               ESTOMAC: digestion, ventre, faim
YELLOW: blue, red, green                JAUNE: vert, citron, serin
BREAD: butter, jam, cheese              PAIN: vin, blanc, manger
HEALTH: sickness, wealth, happiness     SANTE: maladie, fragile, bonne
MEMORY: mind, thought, forgetfulness    MEMOIRE: souvenir, intelligence, leçon
The results of this study will be found in Table 4. This table lists each of the 100 stimulus words, the primary response of native French speakers, and the three most frequent responses produced by the learner group (known respectively as the primary, secondary and tertiary responses), together with the number of students contributing to each response. The final column gives the number of different responses produced by the learner group. The symbols preceding the learner responses are explained in the text.

Table 4. Data elicited from 76 female learners of French
[The body of Table 4, covering all 100 stimulus words, is not reproduced here.]
Notes: The French norms are taken from Rosenzweig (1970) and are the primary responses produced by 184 female students. Rosenzweig also reports two other sets of data, responses from 104 male students and responses from 136 workmen, but the female norms seemed most appropriate for comparison with the learner group which was also composed of females. Rosenzweig’s male and female students differ only rarely in their primary responses, though there are a number of differences between the student responses and those produced by the workmen.
Consider first the learner’s primary responses. These fall into three main categories: Category A (marked = in Table 4) comprises primary responses which are the same as the primary responses reported for the native French speakers; Category B (marked : in Table 4) is made up of words which are not the normal primary response of French speakers, but which do nonetheless occur in the list
of normal responses for native francophones; Category C (marked / in Table 4) comprises responses that are not normally made by French speakers. The number of responses in each category will be found in Table 5.

Table 5. Distribution of the learners’ primary, secondary and tertiary responses

Category    Primary    Secondary    Tertiary
=           23         –            –
:           40         46           40
/           37         54           60

= learners’ primary response is the same as the French primary
: learners’ response appears in the native speaker norms
/ learners’ response is never made by native speakers
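The classification behind Table 5 amounts to a simple lookup against the native-speaker norms. In the sketch below (Python) the norms dictionary is an invented stand-in for Rosenzweig’s published data; only ESTOMAC ~ manger echoes an example discussed in the text, and the real analysis of course used the full norms lists.

```python
# Invented stand-in for the French norms: responses listed most
# frequent first, so index 0 is the native-speaker primary response.
FRENCH_NORMS = {
    "TABLE":   ["chaise", "manger", "bois"],
    "ESTOMAC": ["digestion", "ventre", "manger"],
}

def categorise(stimulus, learner_response):
    norms = FRENCH_NORMS.get(stimulus, [])
    if norms and learner_response == norms[0]:
        return "="   # same as the native-speaker primary (category A)
    if learner_response in norms:
        return ":"   # attested in the norms, but not primary (category B)
    return "/"       # never produced by native speakers (category C)

print(categorise("TABLE", "chaise"))    # =
print(categorise("ESTOMAC", "manger"))  # :
print(categorise("TABLE", "vache"))     # /
```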
Category A, 23 cases, is basically uninteresting: although the learners produce the same primary response as the native speakers, these primaries are, in all but four cases, translation equivalents of the corresponding English primary. In the four cases where this is not so, the primary response is a translation equivalent of a corresponding high frequency response in English. There is no way of deciding whether the learners are producing genuine French-like responses here, or whether they are merely translating their normal English responses.

Category B, 40 cases, also appears to be largely made up of translations of English responses. Twenty-five of the learner primaries are translations of the corresponding English primaries or other very frequent responses. Of the remaining cases, six are marginal in that they are made very infrequently by native French speakers (not more than once in a sample of 150 speakers). This leaves us with only six primary responses which are genuinely French and un-English: ESTOMAC ~ manger, CLAIR ~ lune, EVANGILE ~ église, TETE ~ yeux, DOCTEUR ~ hôpital and EFFRAYER ~ enfant.

The third category, totally unFrench associations, is surprisingly large. Eighteen of the thirty-seven cases can be classified as clang associates, relying heavily on the form of the stimulus word and ignoring its meaning completely. The second largest sub-category consists of associations which are quite reasonable, but just do not figure in the French norms. There is also a third set which arises as a result of the stimulus word being misunderstood. JAUNE ~ vieux and CITOYEN ~ auto are fairly simple cases of this, but SANTE ~ noël, SEL ~ acheter and MOU ~ vache are rather more serious. What seems to be happening here is that the learners are interpreting the stimulus words in terms of a similar-sounding English word rather than reacting to the French
stimulus. The final type of unFrench association is where the stimulus word is used as a base to generate a morphologically related response word. There were three examples of this type: CONFORT ~ confortable, BEAU ~ belle and MALADIE ~ malade. For the secondary and tertiary responses, the number of unFrench responses is considerably higher, 54% and 60% respectively. Here again there are a number of clang responses, and several examples of misunderstandings of the SEL ~ vendre type.

The fact that this discussion has been limited to the three most frequent responses may make these typically unFrench responses seem less important than they really are. These three responses account for only 33% of the total number of responses made by the learners, and unFrench responses are much more common among the less frequent responses. To illustrate this point, Table 6 contains the whole range of responses produced to three of the stimulus words. In this table, “French response” includes any word that appears in Rosenzweig’s norms, even words occurring only once in the list of responses generated by a group of 378 subjects. Even with a criterion as lenient as this, it is clear that only a fraction of the responses produced by the learners can be classified as French-like associations. Table 6 also contains the complete set of native-speaker responses for comparison. These three sets of responses are fairly typical of the complete responses to the 100 stimulus words. With less frequent words such as LISSE and RUGUEUX, the number of non-responders and those who claim not to know the stimulus word rises. In the case of very frequent words such as HOMME or BLANC, the number of individual responses is lower, and the number of respondents contributing to the most frequent responses is rather higher than in these examples. The data in Table 6 is untypical in that there are few examples of clang associates.
This is probably due to the fact that two of the stimulus words are close cognates of English words. Clang associates are particularly common with less frequent French stimuli. Other points worth noting are the complete absence from the learners’ data of some responses that are very frequent among native speakers, and the very small number of syntagmatic responses. MEMOIRE gives rise to no syntagmatic responses, although there are a number of examples of this type in the native speaker data. LONG produces mainly paradigmatic responses. PAIN produces a number of syntagmatic responses – beurre, eau, fromage – which are phrases, but only two examples of genuine syntagmas – manger and grillé. There is no evidence in the data as a whole that the learners produce syntagmatic responses in any systematic way.
Table 6. Three complete response sets. The table shows the complete response set generated for three stimulus words (MEMOIRE, LONG and PAIN) by the learner group (N=76) and by Rosenzweig’s student group (N=378). For the learner group, “French responses” are marked with *. Responses marked $ were generated by subjects who claimed not to know the meaning of the stimulus word.

[The response sets themselves are not reproduced here.]
Discussion

There are two possible approaches that we can take towards the data presented above. The first is to take the very obvious discrepancies between the association patterns of learners and native speakers as indicative of serious inadequacies in the learners’ grasp of French. Ideally, it might be argued, learners ought to aim at performing like native speakers on every language task, and this ideal could be applied not only to primary language activities such as listening and speaking, but also to secondary activities such as the word association task. These secondary activities are not just academic curiosities: they are a useful way of investigating the way speakers’ knowledge of their language is structured and stored. Word associations
clearly tell us something about the way our mental dictionaries are organised. The data suggests that the native speaker’s mental dictionary is organised mainly on semantic lines, rather more like a thesaurus than a conventional dictionary. Words of similar meaning, or words that have the same range of convenience are stored in such a way that they readily evoke each other. In the learners’ case, however, this semantic organisation seems to be much less well established. The learners studied here do show some evidence of semantic organisation, but this is mainly dependent on translation between French and English. There also appears to be a conflicting principle of organisation, which makes use of the forms of words rather than their meaning. Even among respondents who claimed to have understood the meaning of the stimulus word, there is a strong tendency for totally extraneous words to emerge as associates. These responses are not related to the form or the meaning of the stimulus word. This lack of a proper semantic organisation for foreign language words may explain a large part of the difficulty that learners experience in processing both written and spoken foreign language material. Receptive skills rely heavily on a predictive process whereby the reader/listener anticipates what is about to appear, and checks these predictions against what does actually appear in the speech stream or text. A semantically based lexicon would obviously be effective here. It is usually possible to predict at least part of the meaning that your interlocutors are trying to convey, even though it is not always possible to predict the exact words that they will use. 
If we imagine that predicting the occurrence of a particular word brings to mind not only that word, but a whole cluster of other words that are closely related to it, then in a semantically organised lexicon, all the words brought to mind would be relevant to the matter in hand, and it is highly likely that one of the words in this cluster will match what appears in the utterance or text. A dictionary that was organised along non-semantic lines would be less efficient, since the cluster of words would contain a large number of items that were totally irrelevant to the message in hand. A dictionary based on formal criteria, for example, would bring to mind a whole cluster of similar sounding words, and this would be confusing even when the predicted word was correctly anticipated. In effect, learners with such a mental dictionary would be bombarded with irrelevant messages, which would make it very difficult for them to extract the true meaning of what they are trying to understand. If this characterisation of the learner’s mental dictionary as lacking a proper semantic organisation is a true description, then one implication would be that we ought to put a considerable research effort into developing learning methods which could lead learners to develop mental lexicons that are properly structured, and as closely as possible like those of native speakers. On the other hand, there is a second equally plausible, but quite contradictory approach which could be taken: to claim that though there are large and obvious
discrepancies between the learners and the native speakers, they are not really of any importance. It might be the case that all learners go through a phase when their foreign language lexicon is organised on non-semantic criteria, or indeed even randomly. If the lexicon was relatively small, this might not really matter, and it might be the case that given enough exposure the lexicon reorganises itself on semantic grounds when the number of words it contains becomes large enough to make efficient organisation important. Our knowledge of how learners acquire foreign language vocabulary, and how this part of their competence is elaborated, is so slight that there is not really any evidence available which could indicate which of these two approaches is more likely to be the correct one. This rather unhappy state of affairs has three main causes. Firstly, most of the major developments in applied linguistics in the last decade have been chiefly concerned with aspects of syntactic development. This is due to the existence of well-developed and useful models which have been worked out in the course of studies of first language acquisition in children. This work is obviously important, but it is also important to remember that syntactic problems are only part of a whole set of problems faced by learners of foreign languages. Syntax is not a serious source of difficulty for more advanced learners, and vocabulary problems are probably much more serious once the early phases of learning are past. Secondly, where vocabulary problems have been studied, this has almost always been from the point of view of the teacher, the tester or the course writer, rather than that of the learner. West’s work on the frequency of English words as a criterion for inclusion in text books (West 1953) and the work on français fondamental (Gougenheim et al. 1965) are good examples of this. 
Such work is clearly of great value, but it leaves unasked a number of questions of a fundamental kind about the psychological aspects of acquiring foreign language vocabulary. Thirdly, the small amount of work that has looked at learners acquiring vocabulary has usually assumed that learning a foreign word is merely a matter of being able to recognise that HOMME means MAN. The model that underlies this kind of thinking is an adaptation of the paired-associate idea found in psychological work on verbal learning, and implies that native language words and foreign words are linked together in simple stimulus response relationships. This is an impoverished view of the complexities involved, though. It assumes that vocabulary items are discrete, and ignores the networks of semantic relations that exist between words, and the fact that sets of related words in one language rarely map in any simple way onto the equivalent set in another language. More importantly, by defining the problem in terms of inter-language pairs, any comparison between what a learner
does with a foreign word and what a native speaker does is explicitly ruled out of consideration. This last point is an important one. ‘Knowing a word’ for a native speaker is a complex and multi-faceted skill, perhaps best described in behavioural terms as the ability to react to a word in ways which are considered appropriate by the speech community. Many learners are incapable of reacting appropriately to a word, even though technically they know its meaning and might be able to use it in a sentence. Two examples will suffice. Native speakers have little difficulty recognising words spoken against a background of noise, but even fluent learners are very much less tolerant of noise, and can fail to recognise words at noise levels which have no effect at all on the performance of native speakers. Native speakers can read single words exposed on a screen for as little as 30 milliseconds, but learners require much longer exposure times, even when the words tested are very common ones. Being able to perceive words in noise, or read words quickly, are both examples of the type of skill all native speakers are expected to have by their speech community. Both are important subcomponents of the ability to communicate. It is clearly important that learners should be trained to share these appropriate reactions, so that they can perform these tasks, and others like them, with something like the facility found in native speakers. The case of word associations is not so clearly important as the activities mentioned above, since there is a very wide range of tolerance among native speakers, and since the production of word associations is not so clearly related to ordinary language activities.
My own feeling, however, is that all the various types of language activity are reflections of the same underlying basic skills, and that if we could develop learning methods that, as a side effect, produced learners with native-like association patterns, we would also be producing learners who were better able to communicate in their foreign language.
chapter 2
Word associations in a foreign language

Lexicography for L2 learners is a well-developed and influential part of research in Applied Linguistics. Most of this work deals with the linguistic features of words, and very little of it is concerned with a related, equally interesting, but much more elusive question: what does a learner’s mental lexicon look like, and how is it different from the mental lexicon of a monolingual native speaker? As part of a preliminary skirmish into this area, my students and I have been using word association tests. So far we have produced a small number of interesting, but unsurprising findings, and a large number of methodological puzzles and problems. The main findings have already been published elsewhere, and so in this paper I shall discuss them very briefly before dealing at greater length with the problems and their implications for further research.

As pointed out in the previous chapter, the basic word association game is extremely simple. It requires two players: one whose task is to call out or show single words, and a second whose task is to respond to these words with the first word that comes into his or her head. Despite its popular image as a sure-fire way of probing people’s innermost secrets, the most striking thing about associations is that they are actually extremely boring and predictable. Even relatively unpredictable stimulus words like MEMORY or MUSIC still produce a very limited range of responses. With a hundred people, you would be likely to get about 25 to 30 different responses, but most of these will occur more than twice, and only a relatively small number will be unique responses. Using bigger groups of subjects does not make very much difference to this pattern; responses tend to stabilize with groups of fifty or more, and using a group very much larger than this makes little difference to the range or pattern of responses.
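Claims about the range and stability of responses are easy to check mechanically. The sketch below (Python, with invented data) computes the two figures such comparisons turn on: the number of different response types a stimulus attracts, and how many of them are unique, i.e. produced by only a single respondent.

```python
from collections import Counter

def response_profile(responses):
    """Return (number of distinct response types, number of responses
    produced by only a single respondent) for one stimulus word."""
    counts = Counter(responses)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Invented responses from ten subjects to the stimulus MEMORY.
sample = ["mind", "mind", "mind", "mind", "thought", "thought",
          "thought", "forgetfulness", "elephant", "loss"]
print(response_profile(sample))  # (5, 3)
```

Run over learner and native-speaker data sets of the same size, figures of this kind make the heterogeneity claim in the next paragraph directly testable.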
It is customary to claim that word association responses generally fall into two main classes called syntagmatic associations and paradigmatic associations. These terms have much the same meaning as they do in Saussure. Syntagmatic associations are responses which form an obvious sequential link with the stimulus word. Given DOG, for example, bark, spotted, naughty, or bite would generally be classified as syntagmatic responses. Responses which are from the same grammatical form class as the stimulus word are classed as paradigmatic. Thus, given DOG,
cat, wolf or animal would all be classified as paradigmatic responses. Personally, I have found that this distinction is very difficult to apply in practice, especially when you cannot refer back to the testee for elucidation, but this difficulty is not generally commented on in the literature. The distinction is important because it is generally held that most normal native speaking adults have a tendency to produce paradigmatic responses in preference to syntagmatic ones. Children, on the other hand, tend to prefer syntagmatic responses, at least until they reach the age of seven or so. Children also tend to produce large numbers of clang associates – i.e., responses which are clearly related to certain phonological features of the stimulus word, but bear no obvious semantic relationship to it. Rhyming responses, assonance, responses with the same initial sounds as the stimulus, or a similar prominent consonant cluster are common types of clang associate.

The word associations produced by non-native speakers differ fairly systematically from those produced by native speakers. Surprisingly, learners’ responses tend to be more varied and less homogeneous than the responses of a comparable group of native speakers. This is an odd finding because learners must have a smaller, more limited vocabulary than native speakers, and this might lead one to expect a more limited range of possible responses. Learner responses are not generally restricted to a subset of the more common responses made by native speakers, however. On the contrary, learners consistently produce responses which never appear among those made by native speakers, and in extreme cases, it is possible to find instances of stimulus words for which the list of native speaker and learner responses share practically no words. The reasons for this are not wholly clear, but one contributory factor is the fact that learners have a tendency to produce clang associations like young children.
A second contributory factor is that learners very frequently misunderstand a stimulus word, mistaking it for a word that has a vague phonological resemblance to the stimulus. This clearly leads to maverick responses, but these cannot be dismissed out of hand. The frequency of the phenomenon suggests that actually identifying foreign language words reliably is a major problem for many learners, and this seems to be the case even when the words are simple, and when the learners themselves claim to know them. Some examples of learner responses of this type are shown in Table 1, along with a set of plausible interpretations. This sort of data, taken together with the fact that learner responses tend to be relatively heterogeneous anyway, suggests that the semantic links between words in the learner's mental lexicon are fairly tenuous ones, easily overridden by phonological similarities, in a way that is very uncharacteristic of native speakers. So much, then, for the basic findings. What about further research based on these foundations? The word association test is so simple to use, and produces such a wealth of data with a minimum of effort, that one would expect to find a
Chapter 2. Word associations in a foreign language
Table 1. Associations to French stimulus words which seem to be based on misinterpretations of some sort. [Table body not reproduced in this extraction; only the first column header, “Stimulus”, survives.]
large amount of research using this paradigm. Surprisingly, this is not the case. A number of studies do exist (see Meara 1981 for a survey of this work), but they all seem to cover much the same ground, producing little in the way of new findings, and rarely even trying to break new ground. There are no theoretical models which account satisfactorily for word association behaviour in a second language, and consequently almost all the work published so far (including my own study reported in Chapter 1, alas) has been content merely to describe the sorts of responses that learners produce, together with a minimal statistical analysis. It seems to me that one of the prime reasons for this lack of development is that far too little consideration has been given to what words should be used as stimuli. Some of the published work makes use of idiosyncratic lists from which it is difficult to make generalizations. (An extreme case of this is Ruke-Dravina (1971), who used only four stimulus words in her study of Latvian-Swedish bilinguals.) Generally, where idiosyncratic lists of stimuli are used, there is no discussion of why these words were chosen, or why they might be considered especially worthy of note. This is
unfortunate because it means that discrepant results can always be explained away in terms of the stimuli used, and there is no incentive to incorporate these discrepancies into a coherent overall framework. The alternative to idiosyncratic lists is to use one of the many standard lists of stimuli – generally the Kent-Rosanoff list. This list of words was first used by Kent and Rosanoff in 1910 as the basis for a study of the word associations made by mentally ill subjects. Since then, it has been widely used in word association research, both in English and – in translation – in a range of other major languages. The list consists of 100 relatively frequent words, all of which produce fairly stable response patterns in normal native-speaker adults. The extensive use of this list means that a very large number of sets of association norms are available: i.e., collections of responses based on large groups of similar subjects, (cf. for example, Postman & Keppel 1970). In theory, this ought to make it possible to do useful and illuminating comparisons between the responses of learners and native speakers, and, indeed, a number of studies have attempted to do this. Unfortunately, the Kent-Rosanoff list is not a particularly useful one for research on second language learners. The most important reason for this is that the high frequency words used tend to produce very similar responses in both the TL and the NL. Adjectives, for instance, tend to produce their polar opposites, so one finds BLACK ~ white; NOIR ~ blanc; MOU ~ dur; SOFT ~ hard. This makes it difficult to decide whether a Subject’s response to a stimulus word is really a direct L2-L2 response, or whether it is produced via translation into the mother tongue and back again. The same argument applies in the case of nouns which are marked for sex: these tend to produce the opposite sex form as a response; so, KING ~ queen; ROI ~ reine; and BOY ~ girl; GARÇON ~ fille. 
As far as English and French are concerned, about 60% of the items in the Kent-Rosanoff list are of this sort. I do not know the figures for any other pair of languages, but it seems probable that most European languages at least are likely to fall in the same general range. This means that the list as a whole is not a very sensitive tool when it is used with non-native speakers: fewer than half the words are really effective items. A second problem with the Kent-Rosanoff list is again one that derives from its one apparent advantage: the use of frequent words. Almost all the words in the list lie in the highest frequency band – in the French version, for instance, only four words do not appear in either the first or second steps of the Français Fondamental. This means that all the words tested are among the first words that learners acquire in their second language – often at a stage where learning new words is an unfamiliar and strange experience. This has two drawbacks. Firstly, we know very little about how second language vocabulary is acquired, but it seems a reasonable supposition that the early stages of learning a language might produce acquisition patterns that differ quite radically from what goes on when more advanced, fairly fluent speakers learn words. It is possible that the resulting
word association behaviour with basic L2 words might be quite different from what happens with more “advanced” vocabulary, and it might be quite wrong to generalize on the basis of what happens with a hundred highly frequent words learned in peculiar circumstances. Secondly, the use of the Kent-Rosanoff lists has had the effect of concentrating attention on a small number of words which form the hard core of the learners’ L2 vocabulary, and this has distracted attention away from what is potentially a much more interesting problem: what is happening at the periphery of a learner’s vocabulary – how new words are acquired and integrated into the existing word stock. The third problem with the Kent-Rosanoff list is that the apparent bonus of being able to compare learners’ responses with the published norms for native speakers turns out on closer inspection to be of doubtful value. In Chapter 1, I suggested that it was reasonable to expect learners to aim towards producing nativelike responses on a word association test, for the simple reason that one wants learners to behave like native speakers in all types of language behaviour. Several people have pointed out to me, however, that this argument is not a good one. Teaching a language aims to produce people who are bilingual, not mere replicas of monolingual speakers. It would, therefore, be more appropriate to compare the associations of learners with those of successful bilingual speakers, and not with native speakers. Unfortunately, of course, the necessary background work needed to make such comparisons has not yet been carried out. These three reasons, and particularly the first two, seem to me to be strong arguments for abandoning the use of the Kent-Rosanoff list with non-native speakers. It would be nice to be able to suggest a concrete alternative at this stage, but this is obviously very difficult to do. 
What would count as an appropriate set of stimuli depends very much on what questions you are trying to answer. Perhaps the general point to be made is that experimenters do need to think about their choice of words more carefully. Tried and trusted tools which work for L1 situations are rarely wholly appropriate for L2 situations, and word association research is clearly one of these cases. The problem of what words to use as stimuli in word association research with non-native speakers is one that requires thought, but not a topic that raises any really important questions. Now that we have got it out of the way, we can pass on to three topics which seem to me to be of rather more interest, both theoretical and practical. These are the stability of learners’ associations, what happens to new words as they are acquired, and on a slightly different tack, what we can deduce from obvious errors in word association tests about the way words are stored and handled by learners. The stability of learners’ responses in word association tasks is an important methodological question that has not been generally considered in the literature.
We know that native speakers’ associations are relatively stable: subjects tend to give the same responses to stimulus words, and this tendency is even more marked if we consider the responses of whole groups of subjects. This means that one can be reasonably confident that a single test is a reliable tool to use with native speakers, and that it is unlikely that a second test would produce wildly different response patterns. It is much less clear that this assumption can safely be made about learners, however. Learners’ vocabularies are by definition in a state of flux, and not fixed; learners often tend to give idiosyncratic responses; the indications are that semantic links between words in the learner’s mental lexicon are somewhat tenuous – all these considerations would lead one to suspect that learners’ responses could be considerably less stable than the response patterns of native speakers. If this turned out to be so, it would severely reduce the value of one-off studies of learners, and it would be impossible to ascribe to studies of learners the same sort of status we usually ascribe to one-off studies of native speakers. It would also mean that considerable caution would be needed in the interpretation of studies such as that of Randall (1981). Randall attempted to relate changes in association responses to measurable changes in the proficiency of a group of EFL learners. However, if learners’ responses are generally unstable, then there is no way of deciding whether observed changes are really permanent ones, and thus represent real progress, or whether they are just part of the random flux of the whole system. We have carried out two studies on stability so far, with a third study planned. These studies show rather mixed results. Morrison (1981) looked at Finnish-English bilingual children and found that they were equally stable, or rather equally unstable, in both languages. 
This is not very surprising, however, since children tend to be fairly unstable anyway. Hughes (1981), in a bigger and better-controlled study of several groups of ESL learners, found that responses on the whole were very unstable, but the general level of stability differed considerably from group to group and from word to word. There were, however, no obvious reasons for these discrepancies, and all we can say at the moment is that it seems safest to assume that learners’ word associations are not very stable. This is obviously an unsatisfactory state of affairs, as it effectively inhibits any other research in this area. It is equally obvious, however, that learners’ responses are not totally unstable, and our immediate aim is to work out what conditions lead to reasonably stable patterns and what are the causes of the instability. The second question that has interested us is what happens to new words which are acquired by learners, and how do they become integrated into the learners’ mental lexicon? It is often implicitly assumed that learning vocabulary is an immediate all-or-nothing affair – when words are studied, they are either acquired or not. This is a position which seems inherently implausible to me. Most
learners have the experience of knowing that they know a word, but being quite unable to say what it means, even though looking the word up in the dictionary produces an instant ‘of course!’ reaction. This experience, and others like it, suggests that learning vocabulary is not just a question of pairing L2 stimuli and L1 meanings often enough for them to be ‘learned’. Some sort of complex absorption processes are likely to be involved, which allow words which have just been met to gradually find their proper place in the learner’s L2 lexicon. Perhaps it would be possible to tap this process by recording the associations made to new words and observing how these associations change over a period of time? So far we have carried out one experiment on these lines (see Beck 1981 for details). A group of English-speaking students learning French at ‘A-level’ were given a list of forty French words that they were unlikely to know, and asked to produce chains of responses to each one. Not surprisingly, this produced few responses overall, a large number of clang-type responses and only a handful of native-speaker-like responses. Subsequently, twenty of the words were introduced into the students’ class-work in a non-obtrusive fashion, and two further tests were given over a twelve-week period. The results of the first re-test showed that there was no real change in the responses to the words that had not been used in class teaching. They still produced a low level of total responses, lots of clang associations and few native-speaker-like responses. In contrast, the taught words changed markedly, producing a greater number of total responses, fewer clang associates, and a greater proportion of native-like responses. The second re-test again showed no change in the untaught words. The taught words showed a slight decline in the total number of responses they evoked, but an increase in the proportion of native-like responses. 
This data clearly confirms the view that learning vocabulary is not an instantaneous process. Changes are still taking place twelve weeks after the initial presentation of the taught words. Indeed, given that the total number of responses was far short of what one would expect of a fluent speaker, and given that the number of native-like responses was less than 20% of the total, it seems plausible to suggest that the integration of these words was far from complete, and that these changes are likely to continue for quite long periods of time. The questions to be asked at this stage, then, are: how long does this stabilizing period last? is it the same for all words and for all learners? what environmental factors reduce or extend it? It should be possible to get answers, at least of a preliminary sort, to all these questions by means of word association tests, and further work along these lines is projected. The third question which is currently interesting us concerns the large proportion of responses made by learners which are clearly ascribable to errors – either errors in the identification of the stimulus word or errors in the choice of a response. These errors bear some resemblance to the sorts of errors native speakers of English
make when they produce malapropisms and slips of the tongue. The errors listed in Table 1, for example, show that certain features of the target tend to be preserved – initial consonants and salient consonant clusters seem to be fairly robust, while vowels and medial syllables seem to be particularly vulnerable, and these are the same features that crop up consistently in work on errors in English as an L1. This suggests that the mechanisms which underlie vocabulary errors in an L2 might be closely related to the sources of errors of vocabulary in an L1. Given that such errors typically occur with infrequent words, and that L2 words are by definition relatively infrequent items in the learner’s total word stock, this is perhaps not very surprising. Nevertheless, it does suggest that the traditional emphasis on L2 as a self-contained, independent system may be an unhelpful one, at least as far as vocabulary is concerned, and that a lot might be gained if we began to consider the learners’ total vocabulary, in all the languages they know, as an integrated whole, and not just as a set of small discrete components.
Conclusion

This paper has discussed some of the findings and some of the interesting problems that have arisen out of our work on word associations – itself part of a wider project on Vocabulary Acquisition in a Second Language. Vocabulary Acquisition is generally considered to be a topic of little inherent interest and of slight theoretical importance, and even on the practical level it is very often ignored or treated in a cavalier fashion. I hope that this paper will help to convince sceptics that these attitudes are unjustified, and that vocabulary acquisition is not just an interesting area to work on, but potentially quite an exciting one too.
section 2
Associations as productive vocabulary

Introduction

In the previous section, I introduced the idea of word association tests, and discussed some of the problems that we encountered when we tried to use these tests with L2 speakers. This section takes a rather different tack, and illustrates how word association data can be used to examine issues which are usually thought of in different terms. Zareva (2005:560) has noted that “the analysis of the quantitative features of ... word associations was found to be of little practical usefulness... [but] we do believe that associative measures hold a potential as valid measures of L2 learners’ lexical knowledge that need to be re-examined in an assessment context”. And this was pretty much the position I had reached after my early work in this area. I came away from this early work with a feeling that the data that word association studies generated was enormously rich, but it was difficult to know how to exploit this richness. By the mid-1980s, I had got involved in a series of studies aimed at measuring vocabulary size in L2 speakers. This project started out with some very exploratory work on YES/NO tests (e.g., Meara & Buxton 1987) – tests in which we simply asked learners to indicate whether they knew the meaning of the target words or not. It soon turned out that the YES/NO format was much more powerful than it looked at first sight. Specifically, it opened up the possibility that we might be able to make reasonably accurate statements about the size of a learner’s vocabulary – as long as we were interested in making claims about passive, receptive vocabulary, at least. The project soon mushroomed into a sizeable programme, which involved the development of tests in a whole range of languages, and the development of computer platforms to deliver them. 
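The arithmetic behind a YES/NO size estimate can be sketched as a small scoring function. This is a hypothetical illustration, not the actual procedure of Meara & Buxton (1987): in particular, the use of pseudoword foils to discount guessing, and the specific correction formula, are assumptions introduced here for the sake of the example.

```python
# Hypothetical sketch of YES/NO vocabulary-size scoring.
# Assumption: the test mixes real words sampled from a frequency band with
# invented pseudowords, and "yes" responses to pseudowords are used to
# discount guessing before scaling up to the whole band.

def yesno_estimate(hits, real_items, false_alarms, pseudo_items, band_size):
    """Estimate how many words in a frequency band a learner knows.

    hits         -- 'yes' responses to real words
    real_items   -- number of real words presented
    false_alarms -- 'yes' responses to pseudowords
    pseudo_items -- number of pseudowords presented
    band_size    -- number of words in the sampled frequency band
    """
    hit_rate = hits / real_items
    fa_rate = false_alarms / pseudo_items
    if fa_rate >= 1:
        return 0  # degenerate case: testee said yes to everything
    # One simple correction-for-guessing; other adjustments are possible.
    adjusted = max(0.0, (hit_rate - fa_rate) / (1 - fa_rate))
    return round(adjusted * band_size)
```

For example, a learner who says "yes" to 45 of 60 real words but also to 5 of 40 pseudowords would be credited with roughly 714 words of a 1,000-word band, rather than the raw 750 the uncorrected hit rate would suggest.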
Eventually, I felt pretty confident that we had a reasonable testing procedure, and started to think about how it could be used to develop the theory of vocabulary acquisition in an L2. I was thinking in terms of developing a two or three dimensional model of vocabulary acquisition which would allow us to track the relationship between vocabulary size, vocabulary organisation, and vocabulary accessibility – ideas we will return to in Section 3.
In the meantime, however, as interest in the YES/NO tests developed, we were getting a lot of criticism that our tests “merely” addressed receptive vocabulary skills, and had nothing to say about the more interesting facets of productive vocabulary. My first reaction to this criticism was that it didn’t matter very much. True, the YES/NO tests were a very minimal vocabulary test, testing little more than students’ ability to recognise that a word form existed, but I felt that there were strong philosophical reasons for adopting this approach. Simple as they were, the YES/NO tests threw up an enormous range of problems, particularly when we started to develop them in languages other than English (Meara 2005), but at least it was obvious that these problems existed. With more complex test formats, the problems might have been much less obvious. I also felt that the relationship between productive and receptive vocabulary was probably a fairly straightforward one: receptive vocabulary would generally be larger than productive vocabulary, and it was simply an empirical question to determine how big this difference was in a typical case. Discussions with colleagues made me realise that this position wasn’t as straightforward as I thought. There are two obvious claims that we might want to make about the relationship between productive and receptive vocabulary. One is that productive vocabulary is typically a constant percentage of receptive vocabulary, so, for example, we might want to argue that Subjects typically use about 50% or 75% of the vocabulary they know, while the remainder of their vocabulary remains in a passive state. This position implies that there is a substantial gap between receptive and productive vocabulary, and that this gap gets bigger in real terms as the Subjects’ vocabularies get bigger. 
The second position is that productive vocabulary lags behind receptive vocabulary by a constant amount, typically, say, a few hundred recently acquired words. The idea here would be that over time words gradually move along some sort of passive/active continuum, and passive words would therefore generally be words which had been acquired only recently by the learner. (This idea of an active/passive continuum is one which is widely accepted by the vocabulary research community, e.g., Melka Teichroew 1982, and has strongly influenced the way people think about vocabulary teaching. As we shall see in Section 3, however, I think there are reasons for rejecting this way of looking at things.) Both these ideas imply that there is some sort of linear relationship between receptive vocabulary size and productive vocabulary size, and if this were the case then we would be justified in using a receptive vocabulary size test like the YES/NO test as a substitute for a more comprehensive productive vocabulary test. A more interesting, but less obvious, claim is that the relationship between productive and receptive vocabulary size is not linear, but varies. For example, it is possible that receptive vocabulary grows in spurts, and that productive vocabulary grows in the consolidation periods between these spurts. Once you start thinking
in these terms, it becomes obvious that there are some really interesting theoretical questions to be asked about the way vocabularies grow, and how the relationship between receptive and productive vocabularies changes as a result of this growth and development. The obvious way to approach these issues was to take a reliable test of receptive vocabulary, and a reliable test of productive vocabulary, and compare the results. The YES/NO tests looked as though they could provide one half of the necessary tools, but it was much harder to identify a good test of productive vocabulary. Some work in this area had been undertaken by Laufer (1995) and by Laufer & Nation (1995, 1999), but for reasons which are explained in more detail in Chapter 3, neither of these approaches was without problems, and we felt there was perhaps some merit in looking at productive vocabulary from a different perspective. The basic problem was that most approaches to productive vocabulary relied on Subjects producing texts for analysis, but these texts were highly topic-dependent, which biased the types of words they elicited, and more importantly, they failed to elicit uncommon words in large numbers. Eighty percent or so of most texts come from the first 1000 words of the target language, and this makes it difficult to estimate the productive vocabulary of an author without resorting to some sophisticated mathematics, which probably didn’t apply to short texts anyway. Word association data was not restricted in this way, however. We therefore began to wonder whether it might be possible to develop a standardised word association task that would tap into the productive vocabulary of L2 learners, a task which would have good measurement characteristics, and allow us to make a preliminary foray into the areas outlined earlier in this section. Lex30, the instrument described in Chapters 3 and 4, was the outcome of this work. 
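The underlying intuition – that association responses falling outside the highest frequency band are evidence of productive knowledge of less common words – can be sketched roughly as follows. The baseline word set below is a tiny illustrative stand-in, and the scoring details are simplified assumptions on my part; the actual instrument, with its stimulus list and scoring conventions, is described in Chapters 3 and 4.

```python
# Simplified illustration of frequency-based scoring of association
# responses. Assumption: each distinct response outside a high-frequency
# baseline list (standing in for the first 1000 words of English) scores
# one point. A real implementation would use a full frequency list and
# lemmatise the responses first.

HIGH_FREQUENCY = {"dog", "cat", "house", "water", "eat", "small", "animal"}

def association_score(responses):
    """Count distinct responses that are not in the high-frequency baseline."""
    seen = set()
    score = 0
    for word in responses:
        w = word.lower().strip()
        if w and w not in HIGH_FREQUENCY and w not in seen:
            seen.add(w)
            score += 1
    return score
```

So a learner who responds to DOG with "kennel, leash, bark, cat" would score three points: the three infrequent responses count, while "cat" falls inside the baseline and counts for nothing.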
Unlike some of our other experimental tools, Lex30 was widely and rapidly taken up by other researchers. It was favourably reviewed by Baba (2003), and found particular favour with researchers in Spain (e.g., Naves & Miralpeix 2002; Jiménez Catalán & Moreno Espinosa 2005). For a time we were worried that what we viewed as an exploratory tool was being adopted prematurely as a standard by over-eager researchers, though these fears receded somewhat as our confidence in the Lex30 approach grew. Our current view is that Lex30 is still in need of further validation, but it provides an interesting alternative to more traditional approaches (Morgan & Oberdeck 1930) and also more radical approaches (e.g., Meara & Bell 2001) to the question of how we assess productive vocabulary. The whole notion of productive vocabulary has turned out to be much less tractable than we expected it to be, and we suspect that it may in the long run turn out to be an idea which is best approached through intensive single-subject studies rather than through studies of groups of learners in experimental situations. In the meantime,
the Lex30 approach remains an imaginative application of word associations in a difficult research context. Section 5 contains a detailed users’ manual for the Lex30 programs, and we do encourage readers to try them out for themselves. The current version can be downloaded from: http://www.lognostics.co.uk/
chapter 3
Lex30

An improved method of assessing productive vocabulary in an L2

Introduction

This paper describes a tool which we believe can be used to make straightforward assessments of the productive vocabulary of non-native speakers of English. The data reported here are preliminary in the sense that we are not putting forward a well-developed and properly validated testing instrument. Rather, we are trying to address a complex ‘chicken and egg’ situation which is causing something of a blockage in the field of vocabulary research. This blockage arises from the fact that there are no well-established and easy-to-use tests of productive lexical skills. The nearest things we have to useful tools in this area are Laufer and Nation’s tests (Laufer & Nation 1995, 1999). We think these are problematical for reasons which will be explained in more detail below. However, until we have some kind of test which might be interpreted, however loosely, as an index of productive vocabulary, it is unlikely that we will be able to make very much headway in this area. The tools described here, then, are intended as a first step in this direction. Our aim has been to develop a methodology which we think might be honed into something more formal. This paper first describes the methodology that we have developed, and then shows how the methodology could be used to make interesting comparisons between productive and receptive vocabulary in L2 learners. Successful L2 learners are avid collectors of words, and tend to measure their own success by the number of words that they know. Current teaching materials and methodologies exploit and encourage this. The New Cambridge English Course, for example, proudly claims that “students will learn 900 or more common words and expressions during level 1 of the course” (Swan & Walter 1990:5). 
The communicative language teaching techniques and comprehension-based teaching methodologies of the last two decades also attached more importance to vocabulary acquisition than did, for example, the grammar translation and audio-lingual approaches which dominated pre-1970 language teaching (Nunan 1995; Lightbown et al. 1998).
In most practical contexts, it is clear that communicative effectiveness is achieved more successfully by learners with a larger vocabulary than by learners with a more detailed command of a smaller one. It is not surprising then, that measurements of vocabulary size have been shown to correlate positively with proficiency levels in reading (Anderson & Freebody, 1981) and writing (Engber, 1995), and in general language proficiency (Meara & Jones, 1988). In practice, however, most claims of this sort have relied on measures of passive, receptive vocabulary knowledge, since it has been difficult to measure control of productive vocabulary effectively. The implicit assumption here is that active vocabulary knowledge can be reasonably extrapolated from measures of receptive knowledge. This assumption is not an implausible one. Few researchers would dispute that receptive vocabulary is probably larger than productive vocabulary, and that some level of receptive knowledge of a word must exist in order for the word to be produced. Nonetheless, one could imagine situations where the relationship between active and passive knowledge might not be straightforward, and for this reason, it would be useful to have an independent measure of active vocabulary. Unfortunately, it is much more difficult to assess productive vocabulary knowledge than it is to assess receptive vocabulary knowledge. The main reason for this is that the vocabulary produced by a learner, whether in written or spoken form, tends to be so context-specific that it is difficult to calculate from any small sample the true size or range of the learner’s productive vocabulary. It is also difficult to devise simple tasks which produce the large quantities of vocabulary that are necessary to make reasonable estimates. There are two principal methods of estimating productive vocabulary currently in use, but neither of these has fully resolved these problems. 
Controlled productive vocabulary tests prompt subjects to produce predetermined target words. Testees are given a sentence context, a definition, and/or the beginning of the target word, e.g.:
The book covers a series of isolated epi_________ from history.
and are required to complete the missing word – in this case episodes (Nation, 1983; Laufer & Nation, 1999). Free productive vocabulary tests, such as Laufer & Nation’s (1995) lexical frequency profiling tests, analyse a written or spoken text generated by the subject, and categorise the vocabulary used in terms of frequent, less frequent and infrequent words. The higher the percentage count of infrequent words, the larger the subject’s productive vocabulary is estimated to be. There are problems inherent in these two types of test. The controlled productive vocabulary tests are effective mainly at low levels; when, for example, testees are expected to have a limited vocabulary size, the method allows a high proportion of these words to be tested. The controlled productive test used in
Laufer & Nation (1999) attempts to elicit 18 target words for each of five frequency bands: two thousand, three thousand, five thousand, the University Word List, and ten thousand. Although this approach seems to be effective at lower levels, it must be difficult to extrapolate about the size of a testee’s productive lexicon beyond a relatively small vocabulary. At the 10,000 word level, we are in effect testing 18 words from a pool of several thousand words, and using this to draw conclusions about the testee’s knowledge of all the other words in this pool. Suppose, for example, that we test the word fragrance with an item like:
The fra_________ of the flowers filled the room.
This item treads a fine line between receptive and productive skills: the production of the target word is dependent on receptive understanding of the surrounding context words. Additionally, it is possible that our subjects do not know fragrance, but do know scent, aroma, perfume, ..., similarly infrequent words that could easily have fitted the slot, but for the (helpful?) hint. The point is that this kind of test item can easily identify what testees do not know, but it is rather less successful at identifying the full extent of what they do know. In any case, if we are testing a vocabulary of any size, say, three or four thousand words, it would be impossibly difficult in practice to devise a comprehensive set of items large enough to provide the sort of coverage that we would need to get reliable estimates of productive vocabulary. The free productive vocabulary tests are problematical too. They are context-limited, although in many cases the effects of this are minimised by using a broad subject base (e.g., essays discussing a moral dilemma; cf. Laufer & Nation, 1995). In most cases, however, it is unclear that the material these tasks elicit genuinely encourages testees to 'display' their vocabulary in the way that a test of productive vocabulary would require. In addition to this, free productive vocabulary tests are not a cost-effective way of eliciting vocabulary: most text – even text generated by fluent native speakers – is predominantly made up of a small set of highly frequent words. A huge amount of text is needed to generate more than a handful of infrequent words, and it is often difficult to elicit texts of this length from non-native speakers. Laufer & Nation (1995), for example, reported that they needed to elicit two 300-word essays from their testees in order to obtain stable vocabulary size estimates. This required two hours of class time, a figure which would be prohibitive, except in special circumstances.
One superficially attractive alternative to continuous text as a source of productive vocabulary is the spew test (Palmberg, 1987; Waring, 1999). In spew tests, subjects are simply asked to produce words which share a common feature, e.g., words beginning with the letter B. In our view, research using spew tests has not lived up to its promise, however. There are major problems over standardisation of scoring
Connected Words
which have not been addressed, and although we think there is some potential in the method, spew tests need a lot more development work before they can be used reliably. Clearly, then, there is a need for a cost-effective and efficient way of eliciting data from testees which can give us enough material to make a rough estimate about their productive vocabulary skills. The rest of this paper describes a new productive vocabulary test, which has been designed with these criteria in mind, and addresses the practical problems we have discussed. The test generates rich vocabulary output from testees, but it is easily administered, and can be scored automatically using a computer program. We believe that it therefore has the potential to be developed into a practical and effective research tool.
Lex30

This section describes Lex30, and discusses the sorts of data it generates and the analysis that we apply to these data to make estimates about the productive vocabulary of the testees.
The test format

The Lex30 task is basically a word association task, in which testees are presented with a list of stimulus words, and required to produce responses to these stimuli. There is no predetermined set of response target words for the subject to produce, and in this way, Lex30 resembles a free productive task. However, the stimulus words tend to impose some constraints on the responses, and Lex30 thus shares some of the advantages of context-limited productive tests. Word association tasks typically elicit vocabulary which is more varied and less constrained by context than free production tasks. The test consists of 30 stimulus words, which meet the following criteria:

1. All the stimulus words are highly frequent – in our experiment, the words were taken from Nation's first 1000 word list (Nation, 1984), i.e., they are words which even a fairly low-level learner would be expected to recognise. This is a deliberate choice: it makes it possible to use the test with learners across a wide range of proficiency levels.

2. None of the stimulus words typically elicits a single, dominant primary response – the formal criterion that we adopted here was that the most frequent response to the stimulus words, as reported in the Edinburgh Associative Thesaurus (Kiss et al. 1973) should not exceed 25% of the reported responses. In this way, we avoided stimulus words like BLACK or DOG, which typically
elicit a very narrow range of responses, and selected stimulus words which typically generate a wide variety of different responses.

3. Each of the stimulus words typically generates responses which are not common words – the formal criterion here was that at least half of the most common responses given by native speakers were not included in Nation's 1000 word list (Nation 1984). In this way, the stimulus words give the testee a reasonable opportunity to generate a wide range of response words.
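The selection criteria above are mechanical enough to sketch in code. The sketch below is our own illustration, not the procedure actually used in the study: the function name, the response-proportion format, and the treatment of the 25% and one-half thresholds as hard cut-offs are all assumptions for the sake of the example.

```python
def acceptable_stimulus(response_freqs, first_1000):
    """Check a candidate stimulus word against criteria 2 and 3 above.

    response_freqs maps each common native-speaker response (e.g. from
    the Edinburgh Associative Thesaurus) to its proportion of all
    responses; first_1000 is the set of words in Nation's first 1000 list.
    """
    # Criterion 2: no single dominant primary response (max 25%).
    if max(response_freqs.values()) > 0.25:
        return False
    # Criterion 3: at least half of the common responses fall
    # outside the first 1000 word list.
    uncommon = sum(1 for w in response_freqs if w not in first_1000)
    return 2 * uncommon >= len(response_freqs)
```

A word like BLACK, with a dominant response (white), would fail the first check; a word like CLOSE, whose responses are spread out and often infrequent, would pass both.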
Subjects

A group of 46 adult learners of English as a foreign language were used as test subjects. These people were from a variety of L1 backgrounds ranging from Arabic to Icelandic. Their class teachers rated them from elementary to intermediate proficiency level.
Method

The testees were asked to write a series of response words (at least three if possible) for each stimulus word, using free word association (an example was worked through with each class before the test). Stimulus words were presented one at a time, and testees were allowed 30 seconds to respond to each stimulus word, after which the administrator called the number of the next stimulus word. The entire test therefore took 15 minutes to complete. For an example of a completed test see Appendix A. The testees also completed a standard Yes/No Vocabulary Size test (Meara & Jones, 1990). Both tests were completed within the same week.
Scoring

In order to score the test, each testee's responses (approximately 90 per subject) were typed into a machine readable file. The stimulus words were discarded for the purpose of the analysis. Each of the responses was lemmatised so that inflectional suffixes (plural forms, past tenses, comparatives, etc.) and frequent regular derivational affixes (-able, -ly, etc.) were counted as examples of the base-forms of these words. Words with more unusual affixes were not lemmatised and were treated as separate words. For a full account of the criteria used, see Appendix B. The list of lemmatised suffixes corresponds to levels 2 and 3 of Bauer and Nation's Word Families (Bauer & Nation, 1993). Once the stimulus words have been discarded, we are left with a short text generated by each testee, which typically contains about 90 different words. Each testee's text is then processed using a program similar to Nation's VocabProfile (Heatley & Nation, 1998). The program reports the frequency level
Connected Words
of each word in the text, and produces a report profile for that testee. Table 1 illustrates a typical results profile. Level 0 words (high frequency structure words, proper names and numbers) and Level 1 words (the most frequent 1000 content words in English) score zero points. Any response which falls outside these two categories scores one point up to a maximum of 90. In the example given in Table 1, the score was (10+40)=50 points.

Table 1. A typical profile generated by Lex30

Subject A1    Level 0    Level 1    Level 2    Level 3+
                    4         49         10          40
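The scoring scheme just described is easy to express as a short routine. The sketch below is ours, not the actual VocabProfile-style program used in the study; the parameter names and the representation of the frequency lists as simple sets are illustrative assumptions.

```python
def lex30_profile(responses, level0, level1, level2):
    """Profile a testee's lemmatised responses by frequency level.

    level0: structure words, proper names, numbers (score 0).
    level1: the most frequent 1000 content words (score 0).
    level2: the second 1000 words; anything rarer counts as Level 3+.
    Every response outside levels 0 and 1 scores one point, up to 90.
    """
    profile = {"Level 0": 0, "Level 1": 0, "Level 2": 0, "Level 3+": 0}
    for word in responses:
        if word in level0:
            profile["Level 0"] += 1
        elif word in level1:
            profile["Level 1"] += 1
        elif word in level2:
            profile["Level 2"] += 1
        else:
            profile["Level 3+"] += 1
    score = min(profile["Level 2"] + profile["Level 3+"], 90)
    return profile, score
```

Applied to the profile in Table 1, this scheme yields (10+40)=50 points.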
Results

The results of the productive vocabulary test, Lex30, can be seen in Table 2. Not surprisingly, the number of structure words produced is low. Native speaker word association tests (Postman & Keppel, 1970; Kiss et al. 1973) also produce mostly content words. Most of the words produced by subjects fall into Nation's first thousand category (Nation, 1984). Analysis of the completed tests shows that the first response to a stimulus word was usually a frequent word; the second, third and fourth responses were more likely to be less frequent words. About a third of the responses, on average, fell outside this highly frequent set of words, and some testees produced very large numbers of words outside this category (Figure 1).

Table 2. Mean profile for Lex30

        Level 0   Level 1   Level 2   Level 3+   Total Wds   Lex30 score
Mean        3.7      59.3       7.8       20.8        91.6          28.9
sd          3.6      13.9       3.6       11.4        24.2          13.9
The Lex30 scores were also compared with the results of the receptive Yes/ No vocabulary size test. The maximum score on this test was 10,000: two subjects scored this maximum. Mean scores on the standard Yes/No test were 5089, with a standard deviation of 2803. Figure 2 shows the relationship between testees’ scores on the two tests. The correlation between these two scores was 0.841 (p<.01). This indicates that subjects with a large receptive vocabulary also tended to produce a relatively high number of infrequent words on the Lex30 test, and that scores on one of the tests can largely be predicted from the other. The high correlation suggests that testees’
Figure 1. Distribution of Lex30 scores.

Figure 2. Comparison of yes-no test scores and Lex30 scores.
productive vocabulary is at least partly predictable from their receptive vocabulary as measured by the standard Yes/No vocabulary size test. However, a closer examination of the data suggests that this interpretation may not be the best way of approaching these data. Figure 2 seems to suggest that some testees have scores which lie relatively far from the regression line. Testees whose scores lie below the regression line are those with a relatively higher receptive vocabulary in relation to their productive vocabulary, and those whose scores lie above the regression line have a relatively higher productive vocabulary. The graph suggests that the more proficient subjects become, the larger their receptive vocabulary is in relation to their productive vocabulary. This appears to lend support to Laufer’s (1998:267) observation that “an increase in one’s passive vocabulary will, on the one hand, lead to an increase in one’s controlled active vocabulary, but at the same time lead to a larger gap between the two”. Although Laufer’s comment was mainly concerned with the vocabularies of 10th and 11th grade learners of English, it might be more generally applicable.
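The correlation and regression-line reasoning above can be made concrete in a few lines of code. This is a generic sketch with invented scores, not the study's data or software: Pearson's r, an ordinary least-squares line, and the observation that residuals above the line correspond to cases whose productive scores exceed what their receptive scores predict.

```python
def pearson_r(xs, ys):
    """Pearson product-moment correlation between two score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def regression_line(xs, ys):
    """Least-squares line y = a + b*x fitted to the paired scores.

    Cases with y above a + b*x have a relatively higher productive
    vocabulary; cases below it, a relatively higher receptive one.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b
```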
Discussion

The main purpose of the work described in this paper has been to examine the performance of a simple task that might serve as a practical index of productive vocabulary. The results reported above suggest that Lex30 might be modestly successful in this regard. The fact that the Lex30 scores relate closely to scores on a test of passive recognition vocabulary suggests that Lex30 is sensitive to gross differences in vocabulary knowledge. More importantly, however, where the Lex30 scores deviate from a close fit with the standard Yes/No vocabulary size test scores, these mis-fitting cases seem to fall into plausible patterns. Taken together, these findings suggest that Lex30 might form the basis of a useful index of productive vocabulary. The test results were also surprisingly stable. In order to rule out the possibility that testees were producing infrequent words randomly (i.e., for some stimulus words, but not for others) we used a split-half procedure to check the results for internal consistency. This produced a correlation of 0.84 (p<.01), indicating that the test has a high level of internal consistency. This suggests that testees produced infrequent words if they were able to, regardless of the stimulus word used. The main practical advantage of Lex30 is that it is extremely easy to administer, and requires very little time to complete. The version reported here takes a mere 15 minutes for each testee, which means that Lex30 can easily be administered as
part of a larger test battery. A computerised version of Lex30 is described in more detail in Section 5 of this volume. Earlier, we criticised other attempts to measure productive vocabulary on the grounds that they often constrained the testee’s vocabulary choice very tightly, and on the grounds that they generated large amounts of data that threw very little light on the extent of the testee’s productive vocabulary. Lex30 seems to address both these issues effectively. The use of single word stimuli means that a large number of vocabulary areas are opened up very economically. Moreover, the ‘texts’ that the testees generate tend to be lexically rich compared to the texts generated by more traditional elicitation methods, and this means that almost every word in the text gives us some useful information. The lenient scoring method adopted for Lex30 – basically, any slightly unusual word produced by the testee counts towards their score – means that testees are given credit at every possible opportunity. This contrasts sharply with the scoring practices typically used in more strictly controlled productive tests, where only the ‘correct’ response is counted. In Lex30, the stimulus word POTATO might cause a medical student to respond with carbohydrate, and a waiter to respond with mashed. Both responses are ‘unusual’ in the sense that we are using that term here, and so both are awarded a point. In this way, we do not penalise students whose experience of words is influenced by special circumstances or special expertise. Lex30 also appears to have some potential as a diagnostic tool. Our results suggest that for most testees, the size of their productive vocabulary is broadly proportionate to the size of their receptive vocabulary. However, Figure 2 suggests that some testees do not fit this expected pattern. 
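The split-half check mentioned above can be sketched as follows. This is our reconstruction under stated assumptions, not the authors' code: we split the 30 stimulus items into odd- and even-numbered halves, score each half separately, and then correlate the two half-scores across testees with any correlation routine.

```python
def split_half_scores(all_responses, is_infrequent):
    """Score odd- and even-numbered stimulus items separately.

    all_responses: for each testee, a list of 30 response lists
    (one list of words per stimulus item).
    Returns a (first_half, second_half) score pair per testee; a high
    correlation between the halves indicates internal consistency.
    """
    pairs = []
    for per_stimulus in all_responses:
        odd = sum(1 for i, words in enumerate(per_stimulus) if i % 2 == 0
                  for w in words if is_infrequent(w))
        even = sum(1 for i, words in enumerate(per_stimulus) if i % 2 == 1
                   for w in words if is_infrequent(w))
        pairs.append((odd, even))
    return pairs
```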
Particularly interesting are the four outlying cases in Figure 3 (cases 7, 18, 19 and 31), who appear to have productive vocabulary scores that are considerably larger than we would expect from their receptive scores. Conversely, cases 23, 33, 36, 40 and 45 have a larger receptive vocabulary than might be expected from their productive vocabulary score. Interestingly, case 23 stated during the test that she could recognise many more words than she could use, "because I read a lot of scientific journals, but I don't often get to speak English". Whatever the cause of these imbalances, it ought to be possible to develop specific training programs designed to make up deficiencies of this sort, once they have been identified. Clearly, this line of argument is not straightforward, and the assumptions behind it need to be examined in rather more detail. It is easy to imagine cases where a severe imbalance between receptive and productive vocabulary might be acceptable, or even normal: L2 learners have diverse aims and needs, and it would be wrong to expect all learners to fit into a single, oversimplified model. Nonetheless, the data reported here suggest that there might be some basic patterns
Figure 3. Comparison of yes-no test scores and Lex30 scores: cases deserving special discussion.
in the development of L2 productive vocabulary, and that a tool like Lex30 could help to tease these patterns out.
Conclusion

This study has examined a test of productive vocabulary which has a number of practical administrative advantages over tests currently in use. The data reported suggested that our test has considerable potential as a 'quick and dirty' productive vocabulary test that might be used alongside other tests as part of a vocabulary test battery. It seems to correlate highly with a test of receptive vocabulary, but also to have considerable potential as a diagnostic tool for identifying cases where vocabulary development appears to be abnormal or skewed. We are currently working on a fully computerised version of Lex30. Future versions of Lex30 will be normed against native speaker behaviour, and we also intend to examine whether the performance of the test can be improved by a more careful choice of stimulus words. There are, of course, a number of outstanding issues
concerning the reliability and validity of the Lex30 methodology, but we hope that this preliminary account of our current work will stimulate further debate in this important area of research.
Appendix A
Sample data: a completed Lex30 test

1. attack: war, castle, guns, armour
2. board: plane, wood, airport, boarding pass
3. close: lock, avenue, finish, end
4. cloth: material, table, design
5. dig: bury, spade, garden, soil, earth, digger
6. dirty: disgusting, clean, grubby, soiled
7. disease: infection, hospital, doctor, health
8. experience: adventure, travel, terrible
9. fruit: apple, vegetable, pie
10. furniture: table, chair, bed
11. habit: smoking, singing, nagging
12. hold: grip, hang on, cling
13. hope: expect, optimistic, pessimistic
14. kick: football, ground, goal, footballer
15. map: country, roads, way, location
16. obey: disobey, children, mum and dad, school rules
17. pot: kitchen, vegetables, cook, roast
18. potato: salad, roast, boiled, baked, chips
19. real: true, sincere, really
20. rest: pause, sleep, music
21. rice: pudding, fried, pasta
22. science: technical, physics, chemistry
23. seat: bench, sit, sofa
24. spell: grammar, test, bell
25. substance: material, chemical, poisonous
26. stupid: dumb, silly, brains
27. television: tv, cupboard, video, armchair, relax
28. tooth: ache, dentist, filling, injection
29. trade: commerce, bank, exchange, money
30. window: house, glass, broken, pane
Appendix B
Lemmatisation criteria

Words were lemmatised according to the criteria for level 2 and level 3 affixes described in Bauer & Nation (1993). Words with affixes included in the lists below were treated as instances of their base lemmas, and scored accordingly. Words with affixes that do not appear in the lists were not lemmatised, and were treated as separate words. Thus, UNHAPPINESS contains two level 3 affixes, UN- and -NESS, and is lemmatised as HAPPY. HAPPY is a level 1 word, and therefore UNHAPPINESS scores zero points. In contrast, LAUGHABLE contains an affix -ABLE which is not included in the level 2 or level 3 lists. LAUGHABLE is therefore not lemmatised as LAUGH. Although LAUGH is a level 1 word, LAUGHABLE is not, and it therefore scores one point for the testee.

Level 2: Inflectional suffixes
plural
3rd person singular present tense
past tense
past participle
-ing
comparative
superlative
possessive

Level 3: Most frequent and regular derivational affixes
-able (not when added to nouns)
-er
-ish
-less
-ly
-ness
-th (cardinal to ordinal only)
-y (adjectives from nouns)
non-
un-
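A simple way to operationalise criteria of this kind is sketched below. This is only an illustration, not the procedure used in the study: it handles affixation by string-stripping against a list of known base forms, ignores the spelling changes (e.g. HAPPY/HAPPINESS) that a full lemmatiser must handle, and uses abbreviated affix lists of our own (following the prose example above, -able is deliberately left out).

```python
# Abbreviated, illustrative affix lists; the full criteria are given above.
PREFIXES = ["un", "non"]
SUFFIXES = ["ness", "less", "ing", "ish", "ly", "er", "ed", "es", "s"]

def lemmatise(word, known_bases):
    """Strip listed affixes while the remainder is a known base form.

    Words whose affixes are not in the lists are left unlemmatised
    and treated as separate words, as described above.
    """
    changed = True
    while changed:
        changed = False
        for p in PREFIXES:
            if word.startswith(p) and word[len(p):] in known_bases:
                word = word[len(p):]
                changed = True
        for s in SUFFIXES:
            if word.endswith(s) and word[:-len(s)] in known_bases:
                word = word[:-len(s)]
                changed = True
    return word
```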
Chapter 4

Exploring the validity of a test of productive vocabulary

Introduction

A few years ago we reported on the design of a new test of L2 productive vocabulary, Lex30 (Chapter 3 and Fitzpatrick 2000). The basic premise of this test was that a representative sample of words could be elicited from the productive L2 lexicon, using a word association task. This sample could then be categorised according to word frequency in order to measure the lexical resource of the test-taker. This method has one important advantage over traditional ways of assessing productive vocabulary, in that the 'texts' it generates are lexically very dense. Unlike essays, they contain few function words, and a very high proportion of content words. Our preliminary studies yielded some promising, if inconclusive, results, with test scores correlating significantly with another measure of vocabulary size, and we concluded that the test had "considerable potential as a quick and dirty productive test that might be used alongside other tests as part of a vocabulary test battery" (Meara & Fitzpatrick 2000:28). Since the publication of this report, Lex30 has attracted a certain amount of attention from researchers, both in terms of its potential as a practical testing measure (Baba 2002; Moreno Espinosa & Jiménez Catalán, 2004), and in terms of the contribution it can make to the growing literature concerning the identification and categorisation of vocabulary knowledge (Rimmer, 2000). Baba in particular drew attention to the fact that the Lex30 studies which had been reported failed to draw any meaningful conclusions about the validity and reliability of the test. We feel that this is a legitimate and important criticism, and our recent work with Lex30 has included a number of experiments which aim to redress this.
It is our intention here, then, to provide a brief description of the Lex30 test and a summary of our 2000 study, and then to report on three experiments which address test reliability, concurrent validity using native speaker norms, and concurrent validity using collateral test measures, respectively. We will also explore the construct validity of Lex30. In conclusion we will discuss a number of issues which have arisen from these experiments, looking both more closely at
the design of the test, and more broadly at the validity and usefulness of the concept which it claims to measure. The Lex30 test was described in detail in Chapter 3. Briefly, it comprises a word association task, in which Subjects are presented with a list of 30 carefully selected stimulus words in the L2 (English) and are required to produce up to four L2 responses to each of these stimuli. All of the items produced by a Subject in response to the stimulus words form a corpus which is then processed, and a mark is awarded for every infrequent word a subject has produced. Chapter 3 also reports a preliminary validation study of Lex30 in which we compared Lex30 scores with scores from a test of receptive vocabulary size. The correlations between the two tests were reasonably high, and we took this as an indication that Lex30 had some potential as a test tool. However, as was noted in the concluding remarks of Chapter 3 there were still “a number of outstanding issues concerning the reliability and validity of the Lex30 methodology”. These ideas are further explored in this chapter.
Reliability study

If a test is reliable, it "produces essentially the same results consistently on different occasions when the conditions of the test remain the same" (Madsen 1983:179). A straightforward way to test reliability, then, is to present the same subjects with the Lex30 test on two different occasions, keeping all test conditions consistent, and to evaluate any difference between their scores. One of the crucial features of this test-retest method of reliability assessment is the time lapse between test time 1 and test time 2. To minimise any 'practice effect' (Bachman 1990), sufficient 'forgetting time' must be allowed between test times, but to minimise the effect of improvement (or attrition) in language ability, there should not be too much time between tests. After considering these factors, we decided that a 3-day gap between test times was appropriate. The subjects used for this experiment were 16 L2 users of English, from a range of L1 backgrounds, and varying in language proficiency from lower intermediate to advanced level. They took the Lex30 test twice, with a 3-day gap between test times. The test and retest scores for each subject are illustrated in Figure 1. A comparison of means at the two test times gives a t-value of t=1.58 (p=.135), indicating that there is no significant difference between the two sets of scores. The correlation between the two sets of scores is .866 (p<.01). This demonstrates that subjects taking the Lex30 test more than once at a given point in their L2 development will achieve broadly similar scores each time. From this we can propose that the Lex30 test is indeed giving us information about the current state of that subject's lexicon.
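The test-retest comparison of means uses a paired (repeated-measures) t statistic, which can be computed as below. This is a generic sketch with invented scores, not the study's data; the resulting t would be compared against a critical value with n-1 degrees of freedom.

```python
def paired_t(xs, ys):
    """Paired-samples t statistic for test and retest scores."""
    n = len(xs)
    diffs = [x - y for x, y in zip(xs, ys)]
    mean_d = sum(diffs) / n
    # Sample variance of the differences (n - 1 denominator).
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / (var_d / n) ** 0.5
```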
Figure 1. Test and retest Lex30 scores for each subject.
Clearly the similarity in subjects' scores at test times one and two might be due to them having produced the same words in response to the test task. In order to investigate this, we divided each subject's pair of corpora into three wordlists: words produced at both test times one and two, words produced only at test time one and words produced only at test time two. Figure 2 illustrates the relative sizes of these word lists. Even without examining the statistics for individual cases, we can see that the corpora produced by a subject at test times 1 and 2 were in fact quite different in terms of the actual words they contained. It appears that all subjects demonstrate a tendency to produce new responses on the second test time, regardless of their overall corpus size or their final Lex30 score. In fact, as the statistics in Table 3 demonstrate, only around half of the words produced at test time one were actually produced again at test time two. However, we know that there is a strong correlation between the Lex30 scores from the two test times (.866, p<.01). These two facts, taken together, allow us to draw an important conclusion about the responses stimulated by Lex30. It appears that, although many of the actual words produced at each test time will be different, the profile of these words will be broadly the same. In other words, subjects are likely to produce similar proportions of infrequent words at each test time.
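The three wordlists described above are straightforward set operations. A minimal sketch (ours, with hypothetical response words):

```python
def response_overlap(time1, time2):
    """Partition two response corpora into shared and time-specific words."""
    t1, t2 = set(time1), set(time2)
    return {"time1 only": t1 - t2, "both": t1 & t2, "time2 only": t2 - t1}
```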
Figure 2. Numbers of words produced by each subject at test time one only, test time two only, and both test time one and test time two.
Table 3. Average number of words produced at test time one, test time two, and both test time one and test time two

        Test time 1 only   Test times 1 and 2   Test time 2 only
Mean                  38                   42                 39
Sd                    18                   14                 15
This is clearly an important observation. It seems to support the idea that the profiles which result from the Lex30 test task and analysis are indeed individual to the subject's lexicon; even if subjects produce a different set of response words, their profiles remain essentially the same at multiple iterations of the test. This in turn implies that we have succeeded in eliciting a sample of items from the lexicon which are representative, in terms of their inherent frequency, of the overall content of the lexicon. This experiment, then, has established that the Lex30 test has a high degree of test-retest reliability, and has indicated that the test is successful in eliciting a representative sample of the subject's productive lexicon. However, while 'reliability
is a requirement for validity’ (Bachman 1990:238), it is important to recognise that a reliable test is not necessarily a valid test, and we now turn our attention to two experiments which explore the validity of Lex30.
Validity study 1: Native speaker norms

The study described in the previous chapter indicated that subjects perform in a similar way on Lex30 and on the EVST test of vocabulary recognition. However, as we have mentioned, these two tests are based on different constructs – productive vocabulary in the case of Lex30 and receptive vocabulary in the case of EVST – and we should therefore be cautious about validity claims based on this experiment. In Bachman's words, validity is a quest for "agreement between different measures of the same trait" (1990:240); whether these two tests constitute the 'same trait' is arguable. We are therefore left needing to find other ways of evaluating the validity of Lex30. While one way of doing this is to compare the performance of a subject group on two tests measuring the same trait, a second approach is to look at the performance of two different subject groups on the same test. This approach can assess what Bachman calls "concurrent criterion relatedness" (1990:248), with the criterion in question being "level of ability as defined by group membership". Following this approach requires us to identify a group who we know to have a certain level of ability in the trait being measured. Native speakers of English seem to be a sensible choice here; although they disagree as to the actual vocabulary size of a native speaker, researchers agree that the native speaker lexicon will be much larger than that of a non-native speaker (Aitchison 1987; Meara 1988; Nation 2001). We can argue therefore that by comparing the performance of a group of native speakers on Lex30 with the performance of a group of non-native speakers, we will be able to evaluate the concurrent validity of the test. The subjects used for this experiment were 46 adult L1 speakers of English from Britain and North America.
The native speaker subjects completed the Lex30 test, and their scores were then compared with the scores of the 46 non-native speakers which we had obtained from the pilot study (for this experiment all scores represented the number of infrequent words expressed as a percentage of all words produced). The descriptive statistics for native and non-native speaker groups are shown in Table 4.

Table 4. Descriptive statistics for native and non-native speaker Lex30 scores

                      n   mean     Sd
Native speaker       46     44   7.62
Non-native speaker   46     30   9.34
In general, native speakers' Lex30 scores are higher than those of non-native speakers. In fact, the table shows that the mean scores of the two groups differed considerably, with non-native speakers scoring an average of 30 and native speakers averaging 44. An independent samples t-test also indicates that native speakers score consistently higher than non-native speakers taking the test (t=7.5, p<.001). A closer look at the statistics, though, shows us that the difference between the scores of the two groups is not an absolute one. In fact, as illustrated in Figure 5, which shows the number of native speaker and non-native speaker cases falling within each band of scores, there seems to be a good deal of overlap between the scores of the groups. These results raise two important issues about the way Lex30 measures the productive lexicon. Firstly, there appears to be a broad but distinct difference between the scores achieved by native and non-native speakers. Secondly, though, and somewhat contrarily, there is a considerable degree of overlap between the scores of the two groups. A comparison of the mean scores of the two groups of subjects leaves us in little doubt that native speakers respond to the Lex30 test differently from non-native speakers; they produce a higher percentage of low-frequency words in response to the association prompts. The simple answer to the question of why they do this is, of course, because they can. Our native speakers have a larger lexical resource than our

Figure 5. Number of cases falling within score bands.
non-native speakers. It is likely that both groups’ lexicons will contain most if not all of the 1000 most frequently occurring words in English, as these are the most commonly encountered words and probably the most often used. The lexicons will differ, then, in the number of infrequent words they contain. If we randomly select words from the larger lexicon of the native speaker, we are more likely to retrieve infrequent words than we will from the smaller lexicon of the non-native speaker. In this respect the results of this experiment are encouraging; we designed the Lex30 test in order to obtain as representative a selection of words as possible from the productive lexicon. It makes sense to assume that the sample from the native speaker lexicon will contain more infrequent words than a sample from the non-native speaker lexicon, and indeed this is the case, indicating that our sampling technique is an effective one. This conclusion is tempered somewhat, though, by the fact that there is an overlap between the scores of the two groups, with 18 non-native speakers achieving a higher score than some native speakers, and only 6 of the native speaker group scoring higher than the highest scoring non-native speaker. We should perhaps not be surprised about the variation in native speaker scores; while in theory Bachman believes native speakers should provide us with an effective control group, the complexities of their language use can make this a problematic choice in reality. Bachman warns that: “The language use of native speakers has frequently been suggested as a criterion of absolute language ability, but this is inadequate because native speakers show considerable variation in ability” (1990, p. 39). 
The abilities which he particularly has in mind are "cohesion, discourse organisation and sociolinguistic appropriateness", and while we had hoped that the discrete and context-free nature of the Lex30 task made it less susceptible to variation, this is perhaps not the case. Despite this individual variation, though, the native speaker subjects all scored higher than the average non-native speaker score. This is illustrated in Figure 6, where the average non-native speaker score of 30 is marked. Figure 7 allows us to compare non-native speaker subjects' scores with the native speaker mean score of 44. Five non-native speaker subjects actually scored higher than the native speaker average. It is helpful to look more closely at those five non-native speaker subjects with exceptionally high scores. Four of the five are Icelandic secondary school teachers of English, which in itself marks them out as potentially very proficient language users. In our pilot experiment we obtained vocabulary size scores for these subjects using the EVST test, and these are shown in Table 8. These EVST scores are interesting because the EVST test has a ceiling score of 10000, and native speakers consistently score between 9500 and 10000. This suggests that, at least for subjects 2, 3 and 4, the Lex30 test, like EVST, is simply not sensitive enough to recognise them as non-native speakers. Subjects 1 and 5 have relatively high EVST scores too, though not in the native speaker range. Subject 5 is a teacher
Figure 6. Distribution of native speaker scores (reference line shows the non-native speaker average score of 30).

Figure 7. Distribution of non-native speaker scores (reference line shows the native speaker average score of 44).
of English, and subject 1 is a very proficient German student who, in terms of tests and coursework, consistently scores higher than his peers in the top-level advanced language class. The fact that Lex30 fails to mark these unusually proficient subjects as non-native speakers suggests not that the test is faulty, but that it works well enough to pick out quasi-native speakers.

Table 8. Lex30 scores and EVST scores of the highest scoring non-native speaker subjects

             Lex30 score    EVST score
  nns 1          53            6500
  nns 2 (I)      48            9900
  nns 3 (I)      47            9850
  nns 4 (I)      47           10000
  nns 5 (I)      44            7700

  (I) = Icelandic teacher of English
We can conclude, then, that this study demonstrates that the Lex30 test has some validity: insofar as the design of the test allows, it can distinguish native speakers from non-native speakers. For Lex30 to be of practical use, though, it should also distinguish between non-native speakers of different levels of language proficiency. Our next study addresses this issue.
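The scoring principle at the heart of Lex30 – one point for each response word falling outside the first 1000-word frequency band – can be sketched in a few lines. This is an illustrative sketch, not the official implementation: the real test lemmatises responses and uses specific published frequency lists, and the FIRST_1000 set below is a toy stand-in for such a list.

```python
# A minimal sketch of the Lex30 scoring principle: one point per distinct
# response word that falls outside the 1000 most frequent English words.
FIRST_1000 = {"rain", "cold", "water", "house", "dog"}  # toy stand-in, illustrative only

def lex30_score(responses):
    """Count distinct response words outside the first 1000-word band."""
    return sum(1 for w in set(responses) if w not in FIRST_1000)

print(lex30_score(["rain", "drizzle", "monsoon", "cold", "sleet"]))  # 3
```

In this toy run, "drizzle", "monsoon" and "sleet" score, while the two high-frequency responses do not.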
Validation study 2: Collateral tests

A large part of our motivation for devising Lex30 was the dearth of effective tests of productive vocabulary currently available. We have discussed this issue at more length elsewhere (Chapter 3 and Fitzpatrick, 2003). However, we feel that our investigation into the validity of the Lex30 test would be incomplete without comparing its behaviour with that of other tests which claim to measure the same construct, notwithstanding our reservations about the effectiveness of those tests. The final study we describe here, then, is a test of concurrent validity, "examining correlations among various measures of a given ability" (Bachman, 1990, p. 248). The measures we have selected for use alongside Lex30 are the Controlled Productive Version of the Levels Test (Laufer & Nation, 1999) and a straightforward translation task from L1 to L2. The Productive Levels Test, like Lex30, evaluates vocabulary knowledge with reference to word frequency bands. Eighteen target words are selected from each frequency band and embedded in contextually unambiguous sentences. The first few letters of each target word are given in order to eliminate other conceptually
possible answers, and subjects are required to complete the target word. The vocabulary knowledge displayed in the completion of this test is productive in that the subject has to be able to write the word rather than merely recognise it. The test is "controlled" in the sense that the subject is prompted to produce a predetermined target word, whereas in free productive vocabulary tasks such as composition writing or oral presentation – or indeed Lex30 – there is no compulsion to produce specific words. The test incorporates five frequency bands: the 2000, 3000 and 5000 word levels, the University Word List level and the 10000 level (Laufer 1998). Laufer and Nation suggest various methods of scoring the test, but in their 1999 study they calculate scores by counting the number of correct answers given at each level and simply adding them together; this is the method we use here. The second validation tool used in this study is a straightforward translation task from the subjects' L1, in this case Chinese. Subjects were given a set of 60 Chinese (Mandarin) words and asked to translate them into English. To minimise the effects of synonyms and homonyms, the first letter of the correct answer for each item was provided. The set of 60 words consisted of 20 words randomly selected from Nation's first 1000 frequency list, 20 from the second 1000 and 20 from the third 1000 (Nation 1984). This meant that the target words were of varying difficulty, broadly comparable to the difficulty of the words used in the other tests. The Translation Test is clearly a task of productive vocabulary ability, and unlike the Productive Levels Test it has the advantage of being a context-free task, which does not depend on subjects understanding the context in which the word is provided. In the scoring of the test, subjects were awarded a point for every target word produced, regardless of the accuracy of spelling.
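The two scoring schemes just described are deliberately simple, and can be sketched in a couple of lines. The per-level counts below are made up for illustration; they are not data from the actual tests.

```python
# Laufer & Nation's (1999) scoring of the Productive Levels Test, as described
# above: simply add up the number of correct completions across the five levels.
def productive_levels_score(correct_per_level):
    """Sum correct answers across the frequency levels (and the translation
    task is scored the same way: one point per correct target word)."""
    return sum(correct_per_level.values())

# A hypothetical subject: strong at the 2000/3000 levels, weak above them.
print(productive_levels_score({"2000": 14, "3000": 9, "5000": 4, "UWL": 3, "10000": 1}))  # 31
```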
We selected these two tests as tests of concurrent validity because they share certain characteristics with Lex30:

– all three tests work on the premise that vocabulary can be measured – i.e., that we can, to an extent, quantify the number of words a subject has in their L2, and that this number is somehow meaningful in terms of overall proficiency;
– all three are tests of productive rather than receptive vocabulary, requiring subjects to write down words which are prompted in various ways (we should note here that the Productive Levels Test does require subjects to engage receptive skills too, in the comprehension of the context sentence);
– the use of frequency bands is central to the design of all three tests: the Productive Levels Test focuses on subjects' knowledge of words at five different word bands, the translation test on the 1000–3000 word bands, and Lex30 awards points for words produced from outside the 1000 level.

55 Chinese learners of English were used as test subjects. The subjects were all undertaking a preparatory "pre-sessional" programme of English language
improvement classes in preparation for entry to university in Britain. Their class teachers rated them from intermediate to advanced level, which means that we could expect them to know most of the target words in the translation test and the first two to three levels of the Productive Levels Test, and all of the cue words in the Lex30 test. The tests were administered during two class sessions: subjects completed first the Lex30 task and then the translation task in the first session, and in the following day's class they were given the Productive Levels Test.

Table 9. Correlations between test scores

                            Productive Levels Test    Translation test
  Lex30                         .504 (p<.01)            .651 (p<.01)
  Productive Levels Test                                .843 (p<.01)
Table 9 shows that there were significant correlations between the results of the three tests. However, the correlations were not as high as we had expected on the basis of the common test factors listed above. While the scores from the Translation test and the Productive Levels Test correlate strongly, there is a much more modest correlation between each of these two tests and Lex30. This suggests either that the tests are in fact measuring different things, or that they vary in their degree of accuracy. Let us first attempt to explain the strong correlation between the Productive Levels Test and the Translation test. All the words used in the translation test were drawn from the 3000 most frequent English words. Although the Productive Levels Test targets words from each of five frequency bands, in reality the subjects in this experiment struggled to produce any target words at bands higher than 3000; almost all of the correct answers they produced were at the 2000 and 3000 levels. This means that the Productive Levels Test scores reflected to a very large extent – exclusively, in many cases – subjects' knowledge of the first 3000 words. This explains the high correlation with our translation test scores; in effect, the two tests were focussing on knowledge of the same 3000 words. The Lex30 test, on the other hand, takes into consideration – and awards marks for – any words from outside the first thousand. By requiring subjects to produce words spontaneously rather than prompting them to produce pre-selected target words, the Lex30 test can give credit for knowledge of all infrequent words, no matter which frequency band they are categorised in. We know that the Lex30 scores consist mostly of words from the third thousand and beyond, with some contribution from the second thousand band (Meara & Fitzpatrick 2000). This means that the Lex30 scores are
less dependent than the scores generated by the Productive Levels Test and the translation test on subjects' knowledge of the first three thousand bands alone. Clearly, further analysis of this feature of Lex30 is required. We still need to explain, though, the lack of a strong correlation between Lex30 and the other two tests, and we suggest that this is because the tests are measuring different aspects of vocabulary knowledge. An expectation of high correlations between the tests assumes that all three measure productive vocabulary knowledge exclusively and completely. Vocabulary knowledge, though, is a rather more complex concept than this implies. To illustrate this, Table 10 lists Nation's aspects of word knowledge (1990), with an indication of which aspects are measured by each of the three tests in this study. The table indicates that despite their superficial similarities we might expect correlations between the three tests to be modest – they are in fact measuring different aspects of productive knowledge. The modest yet significant correlation between Lex30 and the Translation Test and the Productive Levels Test indicates that the tests operate in the same broad area of knowledge, but Lex30 appears to tap into different aspects of productive vocabulary knowledge than the other tests do.

Table 10. Aspects of Word Knowledge (from Nation, 1990) tested by the Translation test (T), the Productive version of the Levels Test (P), and Lex30 (L)

Aspect of word knowledge (R = receptive, P = productive):

form: spoken form – R: what does the word sound like? P: how is the word pronounced?
form: written form – R: what does the word look like? P: how is the word written and spelled?
position: grammatical position – R: in what patterns does the word occur? P: in what patterns must we use the word?
position: collocations – R: what words or types of words can be expected before or after the word? P: what words or types of words must we use with this word?
function: frequency – R: how common is the word? P: how often should the word be used?
function: appropriateness – R: where would we expect to meet this word? P: where can this word be used?
meaning: concept – R: what does the word mean? P: what word should be used to express this meaning?
meaning: associations – R: what other words does this word make us think of? P: what other words could we use instead of this one?

[The tick marks indicating which of T, P and L addresses each aspect are not recoverable from this copy of the table.]
Discussion

We began our exploration of the Lex30 test by identifying a need for an effective test of productive vocabulary. The design of the Lex30 test seemed to be an attractively simple way of meeting this need; it elicits vocabulary in an efficient way and processes the resulting corpus according to the sort of word frequency criteria which have been accepted as common currency by many language testers. However, the studies described above have left us with some important residual issues to discuss. The first of these is a set of technical issues relating to the design of the test itself. In order to operate effectively, the Lex30 test has to achieve two broad objectives: firstly, it has to elicit a representative sample of vocabulary from the productive lexicon, and secondly, it has to evaluate this vocabulary in an effective way. Our studies so far have given two major indications that the Lex30 test achieves the first of these aims satisfactorily. Our test-retest study showed that, although the word association texts produced by subjects at test times one and two contain many different lexical items, the frequency profiles of these corpora are broadly the same. Secondly, our native speaker subjects score higher on average than all but the most proficient non-native speaker subjects. These results also indicate that the elicited vocabulary is being measured with some accuracy. However, we believe that using a more up-to-date set of frequency bands might improve the accuracy of the Lex30 measure. To this end we are currently engaged in producing a revised version of the test, which uses the JACET 8000 wordlists (JACET 2003). One of the major advantages of Lex30 as a test of vocabulary is that it is easy to administer. This is especially the case since we have succeeded in automating the data collection stage in the testing process.
A computer program which automates the processing and scoring of the test is described in detail in Section 5 of this book, and we will report on experimental studies arising from this new format in due course. The second major issue to emerge from these studies is a more complex one, and relates to the construct on which the test is based. It seems straightforward to describe the vocabulary tested by Lex30 as "productive vocabulary". However, subjects' knowledge of the words they produce could vary widely. For some response words, subjects might have only a threshold level of knowledge, where they know the form of the word and can reproduce it reasonably accurately. For other words, they may have a much deeper knowledge, where they know about the word's form, use, register, collocations, meaning, associations and so on. This variation is not something that we would expect to find in the data produced by the Productive Levels Test, for example. That test seems to demand knowledge of the form, meaning and collocations of target words, as well as understanding of the contextual cue sentence. As Table 10 exemplified, the concept which we are in the habit of referring
to as productive word knowledge, actually encompasses many subcategories of word knowledge, each of which learners will have acquired to varying depths, all of which are interrelated and all of which are in a state of potential change. This is a much more complex situation than, for example, the Free Productive, Controlled Productive and Receptive word knowledge distinction proposed by Laufer. In the light of this complexity it is vital to recognise that these so-called productive vocabulary tests address different aspects of word knowledge. If it is an overgeneralization, then, to call Lex30 a test of productive vocabulary knowledge, what exactly does it test? Producing a word in response to the Lex30 task certainly implies a minimal level of productive knowledge. In this context subjects do not have to demonstrate any collocational knowledge (the word does not have to be placed in a sentence) or even any semantic knowledge (they are not asked to explain the association link), but some knowledge of form is clearly necessary. Read (2000) distinguishes between two kinds of productive vocabulary knowledge: recall and use. His definitions make it clear that the Lex30 test evaluates recall ability rather than use ability. Recall, he says, is tested when subjects “are provided with some stimulus designed to elicit the target word from their memory”, whereas “use means that the word occurs in their own speech or writing” (p. 156). This presents us with the problem that use presupposes recall but recall does not presuppose use: we know that a word produced in response to the Lex30 task is known in a “recall” sense, but we have no indication of whether or not a subject can also “use” it. When we examine the constructs of tests which claim to measure productive vocabulary, then, we find that many of them do not measure the same things at all; productive vocabulary is a misleadingly simplistic label for an extremely complex construct. 
It seems likely that much more work is needed if we are to develop meaningful tools in this area. In the meantime, though, there is clearly a need, among teachers, learners and researchers, for an effective battery of test tools which can be used to gain an insight into the lexicons of individuals as well as shedding some light on the general behaviour of the L2 lexicon. We feel that the studies we have described in this paper indicate that Lex30 is a robust enough measuring tool to fill an important gap in the battery of tests currently available.
section 3

Word association networks

Introduction

The three chapters in this section represent another take on word associations in an L2. By the early 1990s, I was beginning to get interested in the idea of vocabulary networks. This way of thinking about vocabularies is so pervasive in the literature that it is easy to forget that it is only one of many metaphors we use to talk about words. It seemed to me that a lot of people were using the network idea in a very loose way, without really working through its implications. I began to think that it might be interesting to formalise the idea of a vocabulary network, and see whether vocabulary networks really did have the properties that everyone was assuming they had. This thinking eventually led me to work on very simple model lexicons, and this work was published in a series of papers dealing with attrition and growth in vocabularies (Meara 2004, 2007a). A lot of this work involved graph theory – a set of mathematical ideas used to deal with network structures of many different types. Graph theory is widely used to describe and analyse an enormous range of real phenomena, and it has some very important practical applications. The basic idea is that certain relationships and processes can be represented as a system of points (known as nodes) connected together by lines (known as arcs). Many different structures can be represented in this formalism. Figure 1, for example, shows the main airline connections between cities in Sweden (in 1990). Each node represents a major city, and the arcs joining the nodes represent connecting services between the cities. This basic graph could, however, also represent a set of quite different physical realities with the same underlying mathematical structure. For example, the graph in Figure 1 could also represent the architectural structure shown in Figure 2, where each node stands for a room, and each arc is a connecting door between the rooms.
Or, again, the graph might represent transitions between certain keys in a piece of music, and so on. Vocabulary networks obviously lend themselves to this kind of analysis. All we need to do is to represent each word as a node, and each associational link as an arc, and we have a complex, but tractable graph. Some surprisingly neat ideas come out of this. As we saw in Section 2, there is a general assumption in the vocabulary literature that L2 words can be thought of
Figure 1. Direct intercity air routes in Sweden (in 1990). (Nodes are labelled N, Jo, Su, Go, St, Ma and K.)

Figure 2. An architectural design with the same graph structure as Figure 1.
as positioned on some sort of continuum (Faerch et al. 1984; Tréville 1988; Palmberg 1987; Melka Teichroew 1982). Some people described this as an active/passive continuum, while for others it was a receptive/productive continuum – European authors seem to have preferred the former designation, while the latter was more common in North America. It follows from this view that teaching vocabulary is largely about providing the motive power that moves words from the passive end of this continuum towards the active end, and preventing words at the active end from sliding back into passivity. A graph theory approach suggests a rather different analysis. Consider the network fragment illustrated in Figure 3, which shows a small set of associated words in the weather domain. The graph shows that there are a
Figure 3. A fragment of a word association network (the words SNOW, ICE, COLD, WINTER, HAIL, RAIN, FOG, CLOUD, SLEET and STORM).
number of associational links between the words – though only the connections within the set are shown here. Most of the words have multiple connections with other words, and these connections run in both directions, both from and to each of the nodes in the network. One word stands out from this general pattern, however. SLEET is linked to the network, but only via two connections, both of which lead from SLEET to the rest of the network. SLEET has no incoming connections in Figure 3. It seems plausible to argue that the position SLEET finds itself in is the typical position of a passive vocabulary item. If this word is activated by an external stimulus – for example, if we hear the word spoken, or come across it in a text – then the links with RAIN and COLD become activated too, and from there it is easy to recover the meaning of SLEET: its associations with COLD and RAIN capture the meaning of SLEET fairly succinctly. However, if this external activation is not available, then there is no way of activating SLEET, since there are no afferent connections leading from the rest of the vocabulary to SLEET. This type of reasoning leads us to reject the idea of an active/passive cline for vocabulary. What it suggests instead is that the distinction between active and passive vocabulary is a clear-cut, dichotomous one, rather than a cline or a continuum. Passive vocabulary is vocabulary which is linked to the rest of the network only by outgoing links – it has no afferent connections – and this makes it qualitatively different from the rest of the vocabulary network. This in turn suggests that making newly learned vocabulary items active is not just a question of nudging them along a continuum. Rather, it involves a change of status which has something to do with building new associational links that connect from the rest of the vocabulary to the new word (cf. Meara 1990). This is just one simple illustration of the way thinking in network terms raises interesting questions about vocabulary.
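The SLEET argument can be made concrete with a toy directed graph. The link set below is illustrative – it is only loosely based on Figure 3, whose exact arcs are not reproduced here – and the point is purely structural: a passive word is a node with no incoming (afferent) arcs.

```python
# Represent association links as a directed graph (word -> set of words it
# points to) and flag words with no afferent links as "passive".
network = {
    "sleet": {"rain", "cold"},   # SLEET points outward only
    "rain": {"cloud", "storm", "cold"},
    "cold": {"winter", "ice", "snow"},
    "snow": {"ice", "winter", "cold"},
    "ice": {"cold", "snow"},
    "winter": {"cold", "snow"},
    "hail": {"storm", "ice"},
    "storm": {"rain", "hail"},
    "fog": {"cloud"},
    "cloud": {"rain", "fog"},
}

def passive_words(net):
    """Words with no incoming links: reachable only from outside the network."""
    targets = set()
    for links in net.values():
        targets.update(links)
    return sorted(w for w in net if w not in targets)

print(passive_words(network))  # → ['sleet']
```

In this toy network only SLEET lacks afferent links, so only SLEET would behave as a passive item.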
Our next step was to look in detail at network fragments of the kind illustrated in Figure 3, and to compare the association
networks that native speakers and learners generated when faced with sets of stimuli that shared semantic features. The networks generated by the L2 speakers in this task were generally very sparse, and the general conclusion we reached was that L2 vocabulary networks were not well connected at all. As I got more involved with network models, I began to wonder whether a more abstract approach might be more illuminating than one that focused on particular sets of words. Specifically, I began to wonder whether it might be possible to measure how connected a vocabulary network was: how many connections there were between one word and another; whether core vocabulary items were better connected than peripheral items; whether vocabulary size was a critical factor in the way links were formed; and so on. What made these questions interesting from a theoretical point of view was that I was beginning to develop some ideas about a two-dimensional model of vocabulary acquisition (Meara 1996). One of these dimensions was vocabulary size – how big your vocabulary was – usually referred to in the research as vocabulary breadth. Most people at the time were also working with the idea of vocabulary depth – how well you know the words you know – but I was very uncomfortable with this idea. It seemed to push us in the direction of asking more and more detailed questions about fewer and fewer words, and I thought that this was a dead end for research. The fundamental problem here is that vocabulary depth is a property of individual words, not a property of the vocabulary as a whole; in this respect, depth is very different from vocabulary breadth, which looks like a property of individual words, but is actually a property defined over the entire vocabulary. It looked as though our YES/NO vocabulary size tests would provide for the breadth dimension.
What I needed for my two dimensional model was a property that reflected depth, and could act as a surrogate for depth, but was actually a characteristic of the entire vocabulary. Formal graph theory had developed a number of measures which looked as though they might be relevant, and I thought that they would be relatively straightforward to apply to vocabularies. Chapter 5, which dates from 1992, was my first serious foray into this area. In this chapter, I developed the idea that some simple-looking word association tasks might be able to throw light on the degree of connectivity between words in an L2 lexicon, and if this is true, then it might enable us to make interesting claims about the way L2 lexicons grow and how words become more densely interconnected as proficiency develops. The paper anticipated some very influential work on small world effects in language which came to prominence over a decade later (Watts & Strogatz 1998 and Ferrer i Cancho & Solé 2001). I am still undecided whether the work reported in Chapter 5 is a good paper or not – for a long time, I used it in my research methods classes as an example of the importance of mathematical
models in Applied Linguistics, but also as an example of how difficult it was to apply these models in a sensible way. I thought the basic idea was sound, but I had seriously underestimated the difficulties of transferring a simple theoretical idea into an area where the data were particularly messy and difficult to interpret. With hindsight, I think that there were some good basic insights in this paper, but I was hampered by a limited understanding of the maths, and I made some bad decisions about methodology. We spent a lot of effort in the years that followed the publication of Chapter 5 attempting to make the methodology more reliable, but we always ran into the problem that people generated frankly bizarre associations simply in order to complete the experimental tasks. Asking judges to rate other people's associations proved to be a particularly problematic aspect of this methodology. This work is extensively reported in Wilks (1999). After much trial and error, we eventually decided that we might get more tractable results if we used experimental tasks where subjects had to identify associations rather than generate them. This led to a series of studies in which we asked subjects to identify the associations they could find in small sets of randomly selected words. This turned out to be a much more reliable way of tapping into the association networks of L2 speakers, though it was achieved at the cost of losing active vocabulary production in favour of a more receptive task. Chapter 6 summarises our thinking in this area, and describes some computer programs that we designed to test these ideas further. Section 5 contains a manual for the latest version of these association tests. The work described in Chapter 6 presented subjects with sets of randomly selected words from a basic L2 vocabulary, and asked them to identify associated words within these sets.
We knew that this task distinguished reliably between native speakers and L2 speakers, so that in a practical sense the methodology "worked". However, there were a number of practical questions: what was the optimum size of a set; how many sets did we need to test; should we ask subjects to identify just a single associated pair, or more than one if they could; did the perceived strength of the associations make a difference to the data patterns; and so on. It was difficult to answer these questions without carrying out an impossibly large number of exploratory investigations. Eventually I decided that the only practical solution to this problem was to run a series of simulations in which we modelled association networks, and worked out how many association pairs we would expect to find in networks with different degrees of connectivity. In theory, this should have told us how complex our stimulus sets needed to be, but it turned out to be a much more interesting question than I had expected. It turns out that the probability of finding an associated pair in a set of N randomly selected words depends very much on how the simulation defines "an associated pair". This idea is, of course, blindingly obvious once it has been pointed out, but
some of our early work in this area had made rather strict assumptions about what counted as an association, and adopting more lenient criteria suggested that the differences between native speaker and learner vocabularies might not actually be as great as we had assumed. These ideas, which have some important methodological implications for word association research with L2 speakers, are explored in detail in Chapter 7. My current thinking is that it is possible to use association recognition tests combined with some simple mathematical models as a way of estimating the number of associational links between words in the core of an L2 lexicon (Meara 2007b). At the moment, this idea is a very exploratory one, but if it works, then it would provide us with a way of quantifying the way L2 lexicons are organised, and this would open up large areas of research which relate vocabulary size, vocabulary depth and other aspects of vocabulary structure in an L2.
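The kind of simulation described above can be sketched as follows. Connectivity is modelled very crudely here: every pair of words is assumed to be associated independently with probability p, so the chance that a random n-word set contains at least one associated pair is 1 − (1 − p)^C(n,2), and the Monte Carlo run just confirms the formula. The parameter values are illustrative, not those used in our own studies.

```python
import random

def p_at_least_one_pair(n, p, trials=20000, seed=42):
    """Estimate P(a random n-word set contains at least one associated pair)
    when each of the n*(n-1)/2 word pairs is independently associated with
    probability p (a deliberately crude model of lexical connectivity)."""
    rng = random.Random(seed)
    pairs = n * (n - 1) // 2
    hits = sum(any(rng.random() < p for _ in range(pairs)) for _ in range(trials))
    return hits / trials

# Even modest connectivity makes an associated pair almost certain in larger sets.
for n in (5, 10, 20):
    analytic = 1 - (1 - 0.02) ** (n * (n - 1) // 2)
    print(n, round(p_at_least_one_pair(n, 0.02), 3), round(analytic, 3))
```

Changing the definition of "an associated pair" amounts to changing p, which is exactly why the criterion adopted matters so much to the results.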
chapter 5

Network structures and vocabulary acquisition in a foreign language

The starting point for this paper has nothing to do with vocabulary. It is in fact a classic study in social psychology carried out by Stanley Milgram (Milgram 1967). Milgram was interested in what has come to be known as the 'small world problem' – the curious coincidences which occur whenever groups of strangers meet, and find that they have acquaintances in common. From a purely statistical point of view, the fact that strangers have acquaintances in common is not as surprising as it feels when it happens in real life. Take a large country, say the USA, with a population of about 250 million, and assume that everybody is acquainted with about 1000 other people. Given these starting assumptions, the probability of two random people already knowing each other works out at about 1 in 250,000. However, the probability of their having an acquaintance in common is much higher than this, about 1 chance in 250. And the chances of two strangers (call them A and D) being connected by a chain of two intermediate acquaintances (A knows B, who knows C, who knows D) are better than 98 in 100. Milgram demonstrated that these statistical arguments were a fairly close model of what happens in real life. He did this by selecting a random group of 'start persons', and giving each of these people a package which contained the name of another randomly selected 'target person'. The start person was asked to forward his package to the target person, but to do this by handing the package on to a personal friend. The friend in turn passed the package on to another friend, and so on, until the package reached its destination. The question Milgram asked was: how many intermediate links are necessary for the package to reach its target? The answer is: surprisingly few. In fact, Milgram found that the number of intermediate links needed varied from only two to ten, with an average of about five.
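These back-of-envelope probabilities can be checked directly. The sketch below assumes a population of 250 million, exactly 1000 acquaintances each, and full independence between acquaintance sets – crude assumptions, so the outputs are rough orders of magnitude rather than serious sociological estimates.

```python
# Back-of-envelope small-world arithmetic under crude independence assumptions.
population = 250_000_000
acquaintances = 1_000

# P(two random people know each other)
p_direct = acquaintances / population            # 4e-06, i.e. about 1 in 250,000

# P(they share at least one common acquaintance)
p_common = 1 - (1 - p_direct) ** acquaintances   # ~0.004, about 1 in 250

# P(connected through a chain A-B-C-D of two intermediate acquaintances)
p_chain = 1 - (1 - p_common) ** acquaintances    # ~0.98, better than 98 in 100

print(p_direct, round(p_common, 4), round(p_chain, 3))
```

The striking feature is the jump at each step: a vanishingly small direct probability becomes near-certainty after only two intermediaries.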
The mathematics which underlies surprising findings of this sort is known as graph theory. If you reduce Milgram’s parcel problem to its bare essentials, you find that you are dealing with a network of people, where each person is directly linked to some of the other people in the network, but not all of them. All the people are joined to the network somehow or another, but there are no direct
connections between every person and every other person. We can draw this as a set of points on a sheet of paper, where each point represents a person, with a line joining two points if the two people concerned know each other. Diagrams of this sort are called graphs. Figure 1 shows a set of relatively simple graphs, which vary in the number of points they contain (the size of the graph), and the number of connections that each point has (the valency of the points). It will be fairly obvious that both size and valency affect a third parameter: the distance you have to travel between any two randomly selected points (the diameter of the graph).

In Figure 1a, for instance, we have a graph of size 10, in which each point has a valency of 3. If we take a random node as the starting point, and work out the shortest distance between this point and all the other points, we find that three points can be reached in one step, and six points can be reached in two steps. The average distance is 1.67 steps [((3*1)+(6*2))/9]. In Figure 1b, we have increased the size of the graph to 30 points, but held the valency constant at 3. In this graph, starting from any random node, we have three points that can be reached in one step, six that can be reached in two steps, 12 that can be reached in three steps, and 8 points that can only be reached in four steps. The average path length is 2.86 steps [((3*1)+(6*2)+(12*3)+(8*4))/29]. In Figure 1c, we have another graph of size 30, but in this graph the valency of the points has been increased to 5. This produces a much more complex graph than 1b, but the diameter of 1c is much smaller than we found with 1b. In 1c, from any random starting point, we have five points that can be reached in a single step, sixteen that can be reached in two steps, and six points that require three steps. The average path length is 1.90 steps [((5*1)+(16*2)+(6*3))/29].
Masochists might like to experiment with other graphs of a similar kind. The general point to be taken here is that there are fairly straightforward mathematical relationships between graph size, the valency of the points, and the average length of the paths between any two points on the graph.

At this point, you might be wondering what all this has to do with vocabulary acquisition. The short answer is that the mathematics of graphs can also be applied to vocabularies. The long answer is a bit more roundabout. One of the things that have been bothering me for some time is that vocabularies – even vocabularies in second languages – are very large. No-one knows for certain how many words proficient second language learners know: estimates vary wildly, from a couple of thousand to several tens of thousands, depending on who you take as the authority (for a detailed discussion of this issue, see Nation 1988). However, even if we accept the lower estimates, we still have the problem that a few thousand words is nonetheless a great deal of information. In particular, even a few thousand words is a lot of information relative to the types of tools we typically use
Figure 1. Graphs of varying sizes and valencies.
for investigating vocabularies in L2 learners. Most empirical work on vocabulary acquisition uses a very small range of investigative tools: multiple choice tests are fairly common; think-aloud techniques are widely used; word-recognition tests are probably the third most common type of task. When people use these techniques
in their experiments, they typically look at the way learners handle twenty to forty items, and generalise from this sample to the vocabulary as a whole. There is nothing wrong with this, of course, except that twenty to forty items amount to a very small sample out of a target population of several thousand. To get round this problem, what we really need is a technique that allows us to look at the overall structure of vocabularies, rather than the fine detail provided by in-depth investigation of a handful of words. One obvious alternative to the test types mentioned above is the word association technique. Word associations have been used fairly widely in the literature on second language vocabulary acquisition, and they have the advantage that relatively large numbers of words can be tested in a relatively short space of time. In practice, however, most of the analysis of word association data has been qualitative, very much concerned with individual words and the responses they produce. The obvious parallel between word association networks and formal graphs, and the fact that this parallel allows us to talk about large-scale mathematical properties of vocabularies, seems not to have been exploited at all.

What kinds of exploitation might be possible? Earlier we established that three linked parameters could be invoked to explain the sorts of results that Milgram found in his parcel experiment. If we accept that the mathematics of graphs also applies to vocabularies, then these same parameters should also be relevant in descriptions of lexical structure. Obviously, we would expect L1 vocabularies to be bigger than L2 vocabularies, but it is much less obvious what sorts of claims we could make about the average valency of items in an L2 vocabulary. The simplest hypothesis to test is that the valency of L2 vocabulary items is much smaller than is the case for L1 items.
Now, we know that if we hold constant the size of the graph, small valencies will produce long diameters, and large valencies will produce small diameters. So we can now set up an experiment which is an exact parallel of Milgram’s, except that instead of using named people as source points and target points, we can use words. The subjects in the experiment are given two random words, and asked to construct a chain of associations between them. As long as we use moderately advanced non-native speakers, and use words which are clearly part of any speaker’s core vocabulary, we can probably assume that total vocabulary size is not a major factor in the task. The beauty of this design is that we are not concerned with the associations themselves; the data that we are interested in is simply how many steps are necessary to get from one word to another. If we are right to guess that low valencies are characteristic of L2 speakers, then we would expect the chains produced by our L2 speakers to be systematically longer than the chains produced by a matched set of L1 speakers.
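The prediction behind this design can be illustrated with a toy simulation. This is a hedged sketch under two loud assumptions – that a lexicon can be modelled as a random graph, and that association chains follow shortest paths – neither of which is established by anything in this chapter:

```python
import random
from collections import deque

def random_graph(n, mean_degree, seed=1):
    """Erdos-Renyi random graph on n nodes with the given mean degree."""
    rng = random.Random(seed)
    p = mean_degree / (n - 1)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def mean_chain_length(adj):
    """Mean BFS distance from a well-connected start node to every
    node it can reach (a stand-in for association chain length)."""
    start = max(adj, key=lambda v: len(adj[v]))  # avoid isolated nodes
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    reached = [d for d in dist.values() if d > 0]
    return sum(reached) / len(reached)

sparse = mean_chain_length(random_graph(1000, 5))   # low valency: "L2-like"
dense  = mean_chain_length(random_graph(1000, 20))  # high valency: "L1-like"
print(f"valency 5: {sparse:.2f} steps; valency 20: {dense:.2f} steps")
```

Holding size constant, the sparse network yields noticeably longer average chains than the dense one – which is exactly the pattern the low-valency hypothesis predicts for L2 speakers.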
Figure 2 shows the results of a study of this sort. Ten advanced non-native speakers of Spanish were asked to produce association chains between thirteen pairs of Spanish words. Several weeks later, we asked them to do the same task using pairs of English words. The word pairs used were: cold … desire; temptation … conqueror; window … hill; benefit … witness; generation … invention; oven … veil; chaos … editor; majority … prisoner; kindness … panic; sin … desert; publication … captivity; lover … thread; idleness … luggage. All these words occur among the 6,000 most frequent words in English. The Spanish version of the test used translation equivalents of these word pairs; these words fall in roughly the same frequency band as the English words (cf. Eaton 1961). The task was carried out in writing, and no time limit was imposed. For each subject, we calculated the mean number of steps that had to be taken to move from the source word to the target word. That is, a chain like cold => hot => passion => desire would score three points, while a chain like oven => hot => desert => Arab => veil scores four points. We made no attempt to check whether the associations were valid: each subject’s returns were taken at face value.
Figure 2. Mean chain length between pairs of randomly chosen words (English L1, Spanish L2, Spanish L1).
The results for English are very straightforward. The mean chain length is just over three steps. All the words in the chains are fairly frequent ones, and if we assume that they all come from a pool of about 8,000 words, then this chain length is about what you would expect if each word has a valency of 20. In other words, this result is broadly in line with the sort of data that Milgram reports. Unfortunately, the Spanish data is very different from what we expected. Remember that
our presupposition was that valencies in an L2 are smaller than the valencies in an L1, and we expected this to produce longer chains in the L2. In fact, the Spanish chains are reliably shorter than the English chains (t=4.22, p<.001 with 9df). The difference is not actually very large, but as we pointed out above, small differences in diameter can arise when the underlying valencies are very different. If we follow through the logic of our earlier argument, this data would force us to conclude that valencies in an L2 are actually higher than valencies in an L1. It is difficult to imagine any simple model in which such a conclusion would make sense, however. A more likely explanation is that the psychological cost of moving round a semantic network is higher in an L2 than in an L1, and that this encourages L2 speakers to produce short chains. This sort of idea does not figure in our earlier discussion of graphs, but it can be incorporated by adding weightings to the connections between points, instead of treating all the connections as equivalent. (The mathematics is similar to the types of calculations used to model traffic flow on road networks.)

An alternative approach would be to argue that the linguistic structures of the English and Spanish lexicons are subtly different, and that there is something about the Spanish lexicon which gives rise to characteristically shorter chains. Some evidence to support this view appears in the rightmost portion of Figure 2. Here we have data produced by a group of L1 Spanish speakers, who were asked to produce association chains to the same thirteen pairs of Spanish words that were used with the L1 English speakers. These L1 Spanish speakers produced data which is virtually indistinguishable from the data produced by our original subjects when they performed in Spanish. Both sets of data are reliably different from the English data.
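Incidentally, the valency-of-20 estimate for the English data follows from a standard random-graph approximation: in a network of N nodes with valency k, typical path lengths grow roughly as log N / log k. This is only an order-of-magnitude guide, but it is easy to check against the figures used for the English chains:

```python
import math

def expected_chain_length(pool_size, valency):
    """Random-graph estimate: typical path length ~ log N / log k."""
    return math.log(pool_size) / math.log(valency)

# An 8,000-word pool with valency 20: since 20**3 == 8000, the
# estimate is exactly three steps, in line with the observed
# mean chain length of just over three for English.
print(f"{expected_chain_length(8000, 20):.2f}")
```

No comparable calculation can be run for the Spanish data, of course, because we have no independent estimates of pool size or valency for Spanish.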
We are now faced with a set of questions which are much more subtle than the simple parallels that we started out with. Our data clearly suggests that Spanish gives rise to systematically shorter association chains than English does. Obviously, we would not want to jump to a conclusion like this without a lot more work with larger groups of subjects, and much larger sets of source and target words. But if this speculation is a good one, then there are two possible explanations for it. The first is that Spanish may have a richer associational structure than English does (i.e., the characteristic association valency of Spanish words is higher than the characteristic association valency of English words). The second is that the effective size of a working vocabulary in Spanish is rather smaller than the equivalent figure for English. At the moment, neither of these ideas is easy to test. Some relevant figures for English exist (Nagy & Anderson 1984), but no comparable data for Spanish is available. As far as L2 speakers are concerned, the data we have here suggests that our subjects are accommodating their behaviour to what would be expected of a native
Spanish speaker. This leaves unanswered the question of what happens with less fluent non-native speakers, of course. More interestingly, it also leaves unanswered the question of how non-native speakers of English (especially Spanish speakers) behave when they are asked to produce association chains. Will they produce even shorter chains than they do in their L1, or will they accommodate to English, and produce longer chains? The first option seems unlikely, because it would imply that the associational links between words in the L2 lexicon were very dense indeed. The second option, however, would suggest that the Spanish speaker acquiring an English lexicon may be doing something very different from what English speakers do when they acquire Spanish.
Conclusions

Clearly, the data we have presented here does not amount to very much. We have tested only a handful of subjects, and we have been forced to make a number of assumptions about vocabulary size and accessibility of vocabulary items which may turn out to be completely unjustified. Nevertheless, the real point of this paper is not affected by these obvious shortcomings. That point is that the way we study vocabulary acquisition in an L2 typically does not take account of the advantages that might accrue from studying vocabulary as a structure, rather than as a collection of individual words and meanings. Looking at vocabulary as a large-scale structure forces us to ask questions that simply do not arise as long as we are concerned with a handful of individual words. I hope that I have shown in this paper that the techniques of formal graph theory might be a useful way of approaching vocabulary structure. I am acutely aware that my own understanding of this kind of mathematics is really very elementary indeed, and that this paper barely scratches the surface of what could be done. I hope that this preliminary excursion will suggest to other, more skilled readers that formal mathematical models of this type have an important role to play in the study of vocabulary acquisition in an L2.
chapter 6
V_Links
Beyond vocabulary depth

Recent work on vocabulary acquisition has tended to make a broad distinction between vocabulary breadth and vocabulary depth. Vocabulary breadth has generally been interpreted as the number of words that learners know, whereas vocabulary depth is generally taken to mean how well they know these words. Most of the research in this framework goes back to a seminal article by Richards published in 1976, though the ideas have been picked up and developed by other writers since that time (e.g., Nation 1990; Nation 2001; af Trampe 1983; Blum-Kulka 1981; Madden 1980; McNeill 1996; Haastrup & Henriksen 2000 and others). Richards’ paper identifies a number of different aspects of word knowledge, and the most important of these are summarised in Table 1 below.

Table 1. Aspects of word knowledge from Richards (1976)
– Knowing a word means knowing the degree of probability of encountering a word in speech or print. For many words we also know the sort of words most likely to be found associated with the word.
– Knowing a word implies knowing the limitations imposed on the use of the word according to variations of function and situation.
– Knowing a word means knowing the syntactic behaviour associated with a word.
– Knowing a word entails knowledge of the underlying form of the word and the derivatives that can be made from it.
– Knowing a word entails knowledge of the network of associations between the word and the other words in the language.
– Knowing a word means knowing the semantic value of the word.
– Knowing a word means knowing many of the different meanings associated with the word. (p. 83)
A number of people have tried to develop formal tests which measure depth of vocabulary knowledge in these terms. Wesche & Paribakht (1996) for instance developed a rating scale approach, in which test-takers are invited to rate their knowledge of target words on a five point scale, generating definitions for the target words or sentences containing the target words to confirm their self-ratings
where appropriate. A further example of this approach is Schmitt and Meara (1997). This paper developed an instrument which assessed test-takers’ ability to generate derivative forms of target words, attempting to show that this ability was independent of vocabulary breadth. Other examples of vocabulary depth tests which adopt the same general approach include Schmitt (1994) and Read (1995). This work clearly takes the idea of “knowing a word” some way further than the measures of vocabulary breadth which are currently available. These latter measures tend to be relatively superficial: Meara’s Yes/No tests, for example (Meara & Milton 2003), simply ask test-takers to say whether they recognise that a word exists or not, and Nation’s Vocabulary Levels Test (Nation 2001 and Schmitt, Schmitt and Clapham 2001) requires test-takers merely to match words to simple definitions. The depth tests, in contrast, require test-takers to show that their knowledge of the target words is not limited to superficial knowledge of this sort.

It seems to us, however, that this enterprise is fundamentally doomed. The problem is that testing vocabulary depth in this way requires us to carry out extensive testing of individual words, and this makes it all but impossible to design experiments which can tell us very much about the larger characteristics of whole vocabularies – a classic example of not being able to see the wood for the trees. The logic of testing vocabulary depth using the vocabulary knowledge framework implies that we need to test very many words in ever-increasing detail, and this very quickly leads us into serious logistical problems which constrain the types of hypotheses that we can test. Suppose, for example, that we take Richards’ list at face value, and suppose that we want to test how well a group of L2 speakers knows a list of 50 target words.
To carry out this work, we would need to develop a set of perhaps a dozen subtests for each of the words we are interested in – at least one subtest for each feature in the framework. If we want to test 50 words in this way, then this implies that we would need a minimum of 600 test items before we can make even basic statements about a student’s depth of vocabulary knowledge for these words. And this in turn assumes that we could develop a single test item able to assess depth of knowledge in a meaningful way. On purely logistic grounds, a test battery of this size is completely infeasible: in practical terms, it would be very difficult indeed to get large groups of learners to take a 600 item test. In any case, it is highly unlikely that we could develop single test items that would reliably access a learner’s depth of knowledge for target words – it is very difficult to think of any way of testing how well a learner knows the syntactic behaviour of a word using a single test item, for instance – and this implies that we would actually need several test items for each of the facets listed in Table 1. A “solution” which involved even more test items would result in even larger and even less feasible tests, and this in turn implies that we must reduce dramatically the number of target words we test. Suppose, then, that we reduce our hypothetical list of target words to 10 items, and suppose that we develop a set of 20 sub-tests for each word. Even a minimal list of
this sort would still require a battery of 200 subtests, and the nature of the material would probably require each subtest to be separately developed and validated. This does not feel like an attractive proposition to us. Furthermore, even if a testing program of this sort could be developed and deployed, we would still be left with the far from negligible problem of how we can generalise from our 10 target words to the rest of the vocabulary.

Put in its simplest terms, then, the prevailing approach to depth of vocabulary knowledge requires us to develop more and more finely tuned tests for fewer and fewer words. We do not think that this is a productive way to go, and our own thinking has led us in a rather different direction. Most people would agree that the recent growth in vocabulary research has largely been driven by the development of simple tests for vocabulary breadth – though for reasons which will become clear later, we prefer to call it vocabulary size. Typically in a test of this sort we give the test-takers a large number of words and evaluate whether they “know” these words or not. At first sight, this work looks as though we are primarily concerned with single words, but actually things are more complicated than this. If the target words are well-chosen, then we can extrapolate from the target words to an estimate of the test-taker’s overall vocabulary size, and most tests of vocabulary breadth do just this. Thus, although we are ostensibly testing individual words, what really interests us is using this data to generate a description of the test-taker’s overall vocabulary size. Vocabulary size is not a feature of individual words: rather, it is a characteristic of the test-taker’s entire vocabulary. This is a subtle shift of focus, but an important one, and it has considerable implications for the way we approach measures of vocabulary depth.
We believe that the attempts made by researchers such as Wesche & Paribakht and Schmitt focus in too much detail on knowledge of individual words, and neglect the larger picture. We believe that a better approach to vocabulary development would be to look at features which are characteristic of a learner’s whole lexicon, rather than features which are characteristic only of single words. Ideally, what we would like is a characteristic which scales in much the same way as vocabulary size measures scale. Vocabulary size is a good measure, with highly desirable measurement characteristics: vocabulary size measures start at zero and have a wide range, typically several thousand, which makes them very easy to work with and very easy to interpret. Ideally, we would like to develop a “depth” characteristic with similar features.
An alternative to breadth and depth

Our current view is that depth of vocabulary knowledge is rather more than the sum of the learners’ knowledge of the individual words in their vocabulary. Knowledge
of individual words contributes to depth of knowledge, but the really interesting feature of vocabularies is the way that the individual words that make them up interact with each other. These interactions are what distinguish a mere vocabulary list from a vocabulary network. The basic idea, one that has been widely taken up by writers on vocabulary acquisition, e.g., Aitchison (1987) and McCarthy (1990), is that the words in a vocabulary form some kind of linked network. Aitchison, for example, refers to a lexicon as “a gigantic multi-dimensional cobweb” (p. 72), while McCarthy talks in very similar terms. Although these authors do not develop these metaphors in any detail, we believe that we can approach the question of vocabulary depth by characterising the properties of this network rather than by focussing on the properties of its separate components. The difference between this view of vocabulary depth and the more traditional view is summarised in Figure 2.

Figure 2. Two ways of looking at a vocabulary: vocabulary breadth and depth (left) vs. vocabulary size and organisation (right).
The left-hand diagram in Figure 2 illustrates the way vocabulary breadth and vocabulary depth are currently conceptualised. Each word is shown as a bar. Words with more “depth” are shown as longer bars, while words with less “depth” are shown as shorter bars. Essentially, this is a list model. Adding new words (increasing breadth) has no implications for the other words in the list, and there is no intrinsic link between breadth and depth. The right-hand diagram shows a more complex network metaphor. In this model, “breadth”, or size, corresponds to the number of nodes in the network. The second dimension of the model is the number of connections between the nodes. In this model, adding a new node (increasing “breadth”) does have implications
for the rest of the network, depending on how the new node is linked to the existing ones. Adding new links (increasing “depth”) also has implications for the rest of the network. The two metaphors are fundamentally different, and lead us to ask very different questions about the way “breadth” and “depth” – or, in our terms, size and organisation – interact. Basically, we think that the breadth/depth opposition is an unfortunate one, which leads in unhelpful directions. We believe that it makes more sense to talk about size and structure, or size and organisation, instead. Our own research has been based on the idea that L2 lexicons are not as highly structured as the lexicons of L1 speakers. This seems like an intuitively plausible place to start: everyone agrees that L1 lexicons are highly developed and complex, while L2 lexicons are less well developed. In terms of our model, this should mean that L2 lexicons are smaller than L1 lexicons, and that the organisational links between the words that make up the L2 lexicon should be simpler than what we find in L1 lexicons.

The obvious way to investigate these ideas is to use word association data. In experiments of this sort, we give L2 speakers a series of single words, and we ask them to report the first L2 word that comes into their heads. We can then assume that the reported associations are linked in much the same way as the nodes in Figure 2 are linked. We might expect native speaker networks developed in this way to be denser and more highly organised than similar networks generated by L2 speakers, and this would suggest that the complexity of the connections between words corresponds in some way to vocabulary “depth”. Words which show a complex array of connections will tend to be more deeply known than words which are linked more tenuously to other words. This deceptively simple idea turns out to be much harder to work with than you would expect.
Word associations generated by L2 speakers are quite different from those produced by L1 speakers (cf. Riegel & Zivian 1972), but the differences are very hard to pin down reliably in small scale experiments. This is largely because L2 speakers seem to produce a much wider range of associations than L1 speakers do, but it is also difficult to disentangle the effects of L1 interference in L2 word association tasks. Most word association research relies on a methodology which requires test-takers to produce associations, and this tends to generate data which is particularly varied, and particularly difficult to work with. However, Wilks & Meara (2002) developed a sophisticated passive association recognition technique which allowed them to estimate the mean number of associational links between small sets of words. Their data showed that there were clear differences between native speakers and L2 speakers in this regard. In their approach, test-takers were provided with small sets of words and asked to decide whether any two words in each set were associated together. Not surprisingly, L1 speakers were more likely to find a link than L2 speakers were. Wilks & Meara computed the probability of
a link being found for these sets, and then used a complex modelling method to estimate the complexity of the connections in their subjects’ lexicons. The work we report in the next section of this paper is basically a development of Wilks & Meara’s methodology.
V_Links

The testing tools that we describe in this section are a preliminary attempt to develop a measure of lexical organisation for English. The test is known as V_Links, and its current version is version 2.00. The test consists of a set of 20 items. Each item consists of a selection of 10 words. The words all come from the first 1000 words in English. The test items were developed from a larger number of randomly selected word sets, so that each set contains a number of obvious and some less obvious associational pairs. The test-takers are presented with each of these 20 items on a computer screen, and for each item they are given one minute to identify any association pairs that they can find. They do this by clicking on the words in the display. Each pair is confirmed when the test-takers indicate how strong the association is by clicking on a four-point scale at the bottom of the display. The display then draws a link between the two members of the pair, with the strength of the link shown by differences in the line colour (see Figure 3).

We have trialled this basic idea in several different formats, and the current version of our test works reasonably well. This version has a number of interesting features. Firstly, it tests a large number of words in a relatively short space of time. Each of our 20 items contains 10 target words, so the whole test features a total of 200 words – one in five of the basic 1000-word core vocabulary. This figure is much larger than anything that could be attempted using an approach like VKS, and we think it gives us much greater insight into the way a vocabulary is organised than a smaller test could. In spite of this, the test takes only 30 minutes to administer. Secondly, each item has a possible 45 linked pairs, though in practice the actual number of pairs identified is much smaller than this. Native speakers typically identify half a dozen word pairs as associational pairs for each item.
Multiplying this up across all twenty items gives us a total of 120, providing us with a scale ranging from 0 to 120. This range seems to be large enough to distinguish clearly between native speakers and learners. Thirdly, the fact that the test makes use only of words which lie in the first thousand-word frequency band for English means that the test in its current form can be used with test-takers whose level of English varies over a considerable range of proficiency. Obviously, the test is not suitable for absolute beginners who have a
Figure 3. Screen shot from V_Links.
very limited vocabulary, but it can be used with intermediate level learners, as well as advanced level learners, and the data we have collected so far suggests that the test may be sensitive enough to discriminate clearly between these cases. There are, of course, a number of outstanding problems which we still need to address, and these form the object of our current work with the test format. The most important of these problems is that our L2 test takers persistently identify as associates word pairs which are never selected by native speaker test takers. We had originally hoped that these cases would be few, and that we would be able to ignore them, but this appears not to be the case. Our current approach to this
problem has been to build up a database of the responses produced by a group of L1 speakers, and to accept as valid any response which appears more than once in this set – i.e., at least two native speaker respondents have made this association. This is not entirely satisfactory, as it fails to take account of L2 associations which arise as a result of specific local conditions – English loan-words used as trade names in Japan are a particular problem in this context – but in principle, the methodology could be adapted to take account of special cases such as these. Using a response database allows us to score the test automatically, and to provide instant feedback to test-takers.

The second problem is the question of association strength. In our earlier versions of V_Links, we asked test-takers to identify any associated pairs, but did not ask them to say how strong or how obvious the association was. This made the task easy for the test-takers, but it sometimes produced data which was difficult to interpret. Some test-takers, for example, would claim there was an association between a pair like COW and SNAIL, on the grounds that both were animals, or between LOOK and WRITE on the grounds that both were verbs. In our current version of V_Links, test-takers have to indicate how strong they think each of their associations is, and we hope that this will allow us to weed out some of the more unsatisfactory associations in a principled way. Most people, for example, think that the association between DOG and CAT is stronger than the association between COW and SNAIL, and most people think that WRITE ~ PEN is a stronger association than WRITE ~ LOOK. However, this approach has thrown up other problems which we have not yet solved, notably a tendency for some test-takers to claim that most associations are strong, while others appear to be very reluctant to identify strong associations, and prefer to use only the lower end of our four-point scale.
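The norming rule – accept a pair only if at least two L1 respondents produced it – might be sketched as follows. This is a hypothetical illustration: the function names (`build_norms`, `score`), the threshold parameter and the example pair data are all invented for the sketch, not taken from the actual V_Links response database.

```python
from collections import Counter

def build_norms(native_responses, min_respondents=2):
    """native_responses: one list of word pairs per L1 respondent.
    A pair counts as a valid association if at least `min_respondents`
    different respondents produced it (word order within a pair ignored)."""
    counts = Counter()
    for respondent in native_responses:
        # a respondent contributes each pair at most once
        counts.update({frozenset(pair) for pair in respondent})
    return {pair for pair, n in counts.items() if n >= min_respondents}

def score(selected_pairs, norms):
    """Number of a test-taker's selected pairs that appear in the norms."""
    return sum(1 for pair in selected_pairs if frozenset(pair) in norms)

# Invented example data: three L1 respondents and one learner.
natives = [
    [("dog", "cat"), ("write", "pen")],
    [("dog", "cat"), ("cow", "snail")],
    [("cat", "dog"), ("write", "pen")],
]
norms = build_norms(natives)
learner = [("dog", "cat"), ("cow", "snail"), ("write", "look")]
print(score(learner, norms))  # 1: of the learner's pairs, only dog ~ cat is normed
```

Because the database lookup is a simple set membership test, scoring can run instantly at the end of a session, which is what makes automatic feedback to test-takers possible.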
The third problem that we are still working on is the question of timing. Ideally, we would like to have a measure of vocabulary organisation which is independent of other factors, such as speed of word recognition and fluency. For this reason, some of our earlier versions of V_Links did not impose any time limit on the test-takers, and used an open-ended format instead. This worked well with some speakers, but others seemed to take a perverse delight in exploring all the possible combinations of words in each set, and finding obscure links between them. We have reintroduced a timer into the current version, with the time allowed for each test item being amply sufficient for test takers to identify the most obvious associations. It is possible that this makes the test harder for students whose reading speed is poor, but we do not think so. A more important factor seems to be how fluent test-takers are in using a mouse, and we think that this problem will disappear as more and more people are accustomed to this mode of working with a computer.
Chapter 6. V_Links
Does V_Links work?

The format we have described in this paper is the latest in a long series of trial versions which we have been working on for some time. V_Links clearly discriminates between native speakers and non-native speakers. In a large scale trial involving 147 L1 Japanese learners of English, the test showed a significant difference between these learners and a control group of native speakers, with the L2-speakers scoring about half the mean score for native speakers (t=3.25, p<.01). We expect that our current version of V_Links will perform even better than this early trial version. Data from the same group of subjects also suggests that there is only a very modest level of correlation between scores on the V_Links test and scores on a test of overall vocabulary size (r<0.3), and this is exactly what we would expect if lexical organisation and size are more-or-less independent features of L2 lexicons. Clearly, further work on this is needed, and we will be carrying out more studies of this sort when we have finalised the current version of V_Links.
Further work with V_Links

In the earlier sections of this paper, we argued that the size/organisation approach to L2 vocabularies was potentially more productive than the breadth/depth approach. In this section, we will explore this idea in more detail. The size/organisation approach is part of a three dimensional approach to vocabulary development that was first outlined in Meara (1996). Meara argued that size, organisation and fluency were all important characteristics which impacted on lexical behaviour. We have had reliable tests for measuring vocabulary size for some time. If we are right in thinking that V_Links is an effective way of assessing lexical organisation, then we now have tests for measuring two of these basic dimensions in place, and this allows us to start asking some really interesting questions about the relationship between vocabulary size and vocabulary organisation. The basic question we can ask is whether organisation and size are correlated – i.e., whether the core vocabulary (the most frequent 1000 words) of a large lexicon is more structured than the same words are when they are part of a small lexicon. As we have seen, our preliminary results suggest that there is not a straightforward correlation between vocabulary size and vocabulary organisation. The question that then arises is just what is the relationship? Is it completely random or is it a complex non-linear relationship? The answer to this question is by no means obvious. There are, however, a number of plausible ways in which size and organisation might be related in a non-linear fashion.
One possibility is that people with similar sized vocabularies might differ in respect of how organised they are, i.e., we might find learners with similar vocabulary sizes, but very different degrees of organisation in their lexicons. If this turned out to be the case, then we might begin to ask how the different learner types identified by the dimensional approach differ in their language behaviour. We might expect learners with large, but weakly organised lexicons to behave differently from learners with similarly sized, but better organised lexicons – perhaps they would be less good at text comprehension, for example, or less good at understanding extended spoken input. Another possibility is that lexical organisation may be an insignificant factor as long as the lexicons in question are below a critical size threshold, but that organisation becomes increasingly important once this critical size is reached. For example, it might be the case that small lexicons show a wide disparity in organisation, while large lexicons are always highly organised. This idea in turn suggests that there might be a number of thresholds of this sort, and this would imply a complex relationship between size and organisation. Perhaps unstructured, or loosely structured lexicons can only grow to a limit, and cannot grow beyond this limit until they have restructured themselves. This would imply that lexicons might have growth phases and consolidation phases. We cannot think of any empirical work which supports this suggestion. However, it does fit well with some anecdotal accounts of vocabulary acquisition in L2 learners implying that learners feel their vocabulary reaches a sort of plateau from which it is difficult to make further progress. V_Links should allow us to investigate these questions by carrying out detailed longitudinal studies designed to work out how vocabulary size and vocabulary organisation are related over time. 
Work of this sort would also indicate how far different learners follow the same trajectory in the space defined by our twin dimensions. At the moment, we have very little idea how much learners vary in the way their vocabularies are organised, and almost no idea how lexical organisation might facilitate further lexical growth, or how it might impact on other aspects of language performance. However, we expect to find considerable individual differences between learners in this respect, and if this turns out to be the case, then tests like V_Links will play an increasingly important role in vocabulary research.
Conclusion

In this paper, we have described our current thinking about lexical organisation, and shown how measures of vocabulary organisation offer a more interesting approach to the question of vocabulary development than the idea of “vocabulary
depth” does. We have described the current version of our tool, V_Links, and some of the preliminary investigations we have carried out using this tool. V_Links still has a way to go before it is fully functional, but we hope that this brief description of our current work will convince readers that the type of approach embodied in V_Links has the potential to open up some seriously interesting avenues in vocabulary research.
chapter 7
A further note on simulating word association behaviour in an L2

Introduction

Wilks & Meara (2002) reported data from an experiment in which we tested the ability of L1 English speakers to recognise associated pairs in small sets of French words. The material used consisted of a 40 item questionnaire. Each item in the questionnaire comprised a set of five words randomly chosen from the Français Fondamental list: approximately the first thousand most frequent words in French excluding grammatical items (Gougenheim et al. 1956). The participants were instructed to read each set of words and circle any two words in the set that they considered to be associated. A typical item might look like Example 1 below:

Ex. 1 blouse cheminée coûter feu tort
If they saw more than one pair of associated words in the set, the participants were instructed to circle only the two words with the strongest link. In Example 1, for instance, they might circle cheminée (chimney) and feu (fire). If they found no links between any of the words they were instructed to write nothing, and continue to the next item. Alongside this group of L1 English speakers, we also ran a group of L1 French speakers, who carried out the same task. Our intention was to compare the data of the L1 English speakers with the native speakers of French, and we expected, of course, to find that our L1 English speakers were less adept at identifying associated pairs than the L1 French speakers were. Not surprisingly, this turned out to be the case (t=6.47, p<.001). The data we reported are presented in Table 1 below.

Table 1. Mean hit rate per group

                                 Mean hits   Standard deviation   Number of items   Number of Ss
  Nonnative speakers of French     19.00            7.65                 40               30
  Native speakers of French        30.90            5.74                 40               30
These data clearly confirm that there is a difference between the two subject groups, and the most obvious explanation of this difference is that the association network of the L1 group is “denser” than that of the L2 group, in the sense that L1 words have more associative connections than L2 words do. This density metaphor is one that frequently occurs in the literature on L2 word associations, but its implications are rarely developed. Aitchison (1987), for example, talks about the lexicon as “a gigantic multidimensional cobweb”, and most researchers appear content to operate on this descriptive level.

Wilks and Meara, however, attempted to show that it was possible to move beyond imprecise metaphorical descriptions, and develop more specific quantitative models instead. We did this by comparing the experimental data with data generated by an association simulator. The simulator was a computer program that modelled a small lexicon in which each word was linked with a number of other words in the lexicon. The number of links between each word and the rest of the lexicon – the NLinks parameter – could be varied, and Wilks and Meara showed that the probability of two associated words appearing in a small set of words varied with the value of this parameter. We then used this data to look again at the data generated by real subjects, and estimated what the real data implied about the density of interword connections in the mental lexicons of our test takers. Our initial guess had been that the L1 English speakers would have relatively few connections between words in their L2 lexicons, perhaps as few as four or five. However, the results generated by the simulator forced us to revise that estimate. We concluded that the data implied a much denser set of connections, even for L2 speakers, perhaps as many as 30 or 40 links for each word.
Our 2002 paper considered the implications of this for the way we normally interpret word association data generated by L2 speakers, and we concluded that the density of connections between words would have to be considerably higher than most researchers assumed it to be. This had significant implications for the way we thought about word association networks in an L2.
Rethinking the simulator

A number of people, notably Brent Wolter, the third author of this paper, pointed out to us that our simulator had in fact made some very severe assumptions about the way associations in a lexicon might work, and that this might have resulted in unrealistically high estimates for the number of links between words. In order to clarify this objection, we need to explain the detailed workings of the simulator. Basically, the simulator consists of a large array of numbers. We set the size of this array at 1000, an arbitrary number, but one which is sufficiently large for us to argue that our results did not merely apply to very small lexicons. Each element in
the array represents one “word” in the lexicon. The items are not in fact real words, merely numbers that stand in for real words. Next we set the NLinks parameter, which determines the number of associations each word has. Let us suppose, for the purposes of illustration, that this parameter is set at three. The program next selects three random numbers between 1 and 1000 for the first “word” in the lexicon. These random numbers point to three other “words” in the lexicon, and represent the three associates of word number 1. The program then repeats this selection process for all the other words in the lexicon. The result is a structure that looks like Table 2.

Table 2. Part of a simulated lexicon with the association parameter set at 3

  word 1   word 2   word 3   …   word 999   word 1000
     123       99      129          135          72
     145      182      182          856          65
     160      279      761          687         321
For this particular simulation, word 1 is linked with words 123, 145, and 160; word 2 is connected to words 99, 182 and 279, and so on. Other runs, making different random choices, would produce different numbers, but the basic structure would be the same. Setting the NLinks parameter at a different value, say 5, would generate five associations for each word rather than three. The critical next step is for the simulator program to select sets of five words and check whether or not they contain a pair of associated words (a “hit event”). In the 2002 paper we determined that a hit event had occurred if the number that identified one of the five selected words also appeared in the association list of another word in the set. This is shown in Example 2 below:

Ex. 2
   wd29   wd367   wd456   wd552   wd699
     15      29      71      81      10
    123     421     139     140     259
    135     435     156     172     273
    138     567     489     495     682
    742     665     543     681     695
    881     678     820     729     891
In Example 2, we have a set of five words, 29, 367, 456, 552 and 699, each directly linked to six other words in the lexicon. Word 29 appears in the association list for word 367, and this is taken to mean that there is a direct association between word 29 and word 367. Given these data, the simulator program would record a hit. Compare this with what we find in Example 3. Here, we have the same set of five target words, 29, 367, 456, 552, and 699, but we have altered the association lists,
and none of these words appears in the association list of the other words in the set. Given the data in Example 3, the simulator program would record a NoHit event.

Ex. 3
   wd29   wd367   wd456   wd552   wd699
     15      99      71      81      10
    123     421     139     140     259
    135     435     156     172     273
    138     567     489     495     682
    742     665     543     681     695
    881     678     820     729     891
Our thinking at the time was that a strong associative link between two words would normally result in the index of one word appearing in the association list of the other. For instance, if word 29 had been BLUE, and word 367 had been SEA, we might well expect one of these words to appear in the association list of the other – associations of this type are easily identified in word association norm lists (e.g., Postman & Keppel 1970). In modelling terms, 29 might appear in the association list for word 367, and 367 might appear in the association list of word 29. Either event would have been sufficient for the program to register a hit. We assumed that most of the associations recognised by our subjects would be strong associations of this type, and that this interpretation of an associative link was a reasonable way of modelling association behaviour.

Wolter argued that this is actually a very narrow interpretation of association behaviour. It is certainly the case that some common associations have close connections of this kind: RED elicits BLUE, for example, or BIG elicits LITTLE in this way. In terms of our model, this would mean that BLUE would appear in the association list for RED, and LITTLE would appear in the association list of BIG, and the word association norms confirm that this is the case. However, not all associations are of this type. Test-takers are relatively consistent in the associations that they produce, but they appear to be much less consistent in the associations that they recognise. In recognition experiments, the number of idiosyncratic associations is generally rather high, and test-takers will often find associations between words which are only loosely connected with each other. For example, test takers will frequently identify pairs of words as associates, even when the pairs do not appear in the standard word association lists.
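The 2002 simulator just described can be sketched in a few lines of code. This is an illustrative re-implementation, not the original program; the function names and sampling details are assumptions:

```python
# Minimal sketch of the 2002-style simulator: each of 1000 "words" gets
# NLinks randomly chosen associates, and a hit is registered for a
# five-word set when one word's index appears directly in another word's
# association list. (In this crude sketch a word may even pick itself
# as an associate; the original program's details may differ.)
import random

def build_lexicon(size=1000, n_links=3, rng=random):
    """word index -> list of randomly chosen associate indices."""
    return {w: rng.sample(range(1, size + 1), n_links)
            for w in range(1, size + 1)}

def strict_hit(word_set, lexicon):
    """2002 definition: some word in the set is listed as a direct
    associate of another word in the set."""
    return any(a in lexicon[b]
               for a in word_set for b in word_set if a != b)

rng = random.Random(1)
lex = build_lexicon(n_links=3, rng=rng)
sample = rng.sample(range(1, 1001), 5)
print(strict_hit(sample, lex))
```

With NLinks set low, most randomly drawn five-word sets produce no hit under this strict criterion, which is why the 2002 model needed very dense lexicons to match the observed hit rates.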
Given a word set like RUN, SLEEP, BUS, BLUE, CLOUD, test takers will sometimes identify RUN~BUS as an associated pair (because they sometimes run for the bus), or SLEEP~BUS (because they usually sleep on the bus), or BLUE~BUS (because the buses in their town happen to be painted blue). Loose associations of this type are particularly likely to be reported if there are no other stronger associations available. This behaviour clearly implies a much looser definition of an association than the one we were working with in the 2002 paper. Wolter suggested that one obvious possibility for scoring looser associations of this sort would be to recognise that real associations between words often rely on
a common link shared with a third word. For example, BIRD might be associated with AEROPLANE because they both FLY. This association feels like a strong one, even though BIRD is not commonly associated with AEROPLANE, and AEROPLANE is not commonly associated with BIRD. The Edinburgh Associative Thesaurus (Kiss, Armstrong, Piper & Milroy, 1973) reveals that no-one offered AEROPLANE as a response to BIRD or vice-versa in that data set. Wolter further pointed out that it would be relatively straightforward to reprogram our simulator so that its definition of a hit event reflected these looser associative connections. Specifically, he suggested that the simulator might be reprogrammed to register a hit if any one of the words associated with word X also occurred in the list of associates for word Y. In Example 4, for instance, word 138 occurs in the association set for word 29 and for word 456, and this might be interpreted as showing that word 29 and word 456 are linked together indirectly in some way.

Ex. 4
   wd29   wd367   wd456   wd552   wd699
     15      99      71      81      10
    123     421     138     140     259
    135     435     156     172     273
    138     567     489     495     682
    742     665     543     681     695
    881     678     820     729     891
Suppose word 29 was BIRD, word 138 was FLY, and word 456 was AEROPLANE; then the simulator would register that BIRD and AEROPLANE share a common associate, FLY.
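Wolter's looser criterion is easy to express in code. The sketch below is illustrative (not the revised program itself); the data are taken from Example 4:

```python
# Looser criterion: two words in the set also count as linked if their
# association lists share a common third word (BIRD and AEROPLANE both
# listing FLY), in addition to the direct-link case from 2002.
from itertools import combinations

def loose_hit(word_set, lexicon):
    """Hit if two words in the set are directly linked, or if their
    association lists share at least one common associate."""
    for a, b in combinations(word_set, 2):
        if a in lexicon[b] or b in lexicon[a]:
            return True                        # direct link (2002 criterion)
        if set(lexicon[a]) & set(lexicon[b]):
            return True                        # shared associate
    return False

# BIRD (29) and AEROPLANE (456) share the associate FLY (138), as in Ex. 4.
lex = {29: [15, 123, 135, 138, 742, 881],
       367: [99, 421, 435, 567, 665, 678],
       456: [71, 138, 156, 489, 543, 820]}
print(loose_hit([29, 367, 456], lex))  # prints True
```

Because each pair of words now has many more chances to match, far fewer links per word are needed to produce the hit rates observed in real test-takers.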
Results

Wolter's hunch that it would be easy to adapt our simulator to operate in this way is correct, and Table 3 presents a set of data that shows what happens when we make this adjustment. The data is the probability that the simulator program will register at least one hit in a set of five randomly selected words when the number of associated words for each entry in the lexicon varies from 1 to 20. Each figure in the table is based on 1000 simulated word sets.

Table 3. Probability of registering an associated pair in a random set of five words

  #links per wd   p(hit)     #links per wd   p(hit)
        1          .037           11          .772
        2          .077           12          .819
        3          .143           13          .852
        4          .195           14          .909
        5          .310           15          .926
        6          .382           16          .957
        7          .442           17          .962
        8          .563           18          .979
        9          .654           19          .980
       10          .714           20          .983
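Figures of the kind shown in Table 3 can be estimated by Monte Carlo simulation. The sketch below is illustrative rather than the original program, and its exact values vary from run to run, but it reproduces the overall shape of the table:

```python
# Monte Carlo estimate of p(hit): for each NLinks value, draw many random
# five-word sets from a 1000-word simulated lexicon and count how often
# the looser (direct-link or shared-associate) criterion registers a hit.
import random
from itertools import combinations

def estimate_p_hit(n_links, size=1000, trials=1000, seed=0):
    rng = random.Random(seed)
    lexicon = {w: rng.sample(range(1, size + 1), n_links)
               for w in range(1, size + 1)}
    hits = 0
    for _ in range(trials):
        word_set = rng.sample(range(1, size + 1), 5)
        for a, b in combinations(word_set, 2):
            if (a in lexicon[b] or b in lexicon[a]
                    or set(lexicon[a]) & set(lexicon[b])):
                hits += 1
                break
    return hits / trials

for n in (1, 5, 10, 20):
    print(n, estimate_p_hit(n))
```

A run of this sketch yields probabilities that climb steeply with NLinks and approach 1 by around 15–20 links per word, matching the pattern in Table 3.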
It is immediately obvious that these probabilities are considerably higher than the figures we reported in Wilks & Meara (2002). The data suggests that the probability of a hit increases rapidly as the NLinks parameter varies between 2 and 14; for values of NLinks above 15, the probability of finding a hit in a set of 5 words is very close to 1.

We can now use these figures to re-interpret the data we reported in Wilks & Meara (2002). That paper reported that L1 French speakers, the Native Speaker group, registered a hit on approximately 30 of their 40 trials, i.e., about 75%, but with a substantial standard deviation from this mean. The closest match to these figures in Table 3 is when the NLinks parameter is set to 11 links per word. W&M reported that their L1 English speakers, the Non-Native Speaker group, returned a mean hit rate of 19/40 = 47.5%, but with a very substantial standard deviation from this mean score. The closest match to these figures in Table 3 is when the NLinks parameter is set to 7 links per word.

Table 4 shows W&M's original data alongside some further data generated by the revised simulator program. In this simulation, we ran a set of 60 cases. For half these cases, the NLinks parameter was set at 7, and for the other half, the NLinks parameter was set at 11. Each case consisted of 40 five-word stimulus sets. In this way, the simulation is an exact parallel of the real data collected in W&M's study. Table 4 shows that the mean scores in the simulated data and the real data are very close, but the simulator seems to generate smaller standard deviations, which are about half the size of the standard deviations in the real data.

Table 4. Real and simulated data, 60 cases, each tested on 40 five-word stimulus sets

                          real data   simulation
  NS group (NLinks=11)
    Mean hits               30.90       29.90
    Sd                       5.74        3.17
  NNS group (NLinks=7)
    Mean hits               19.00       18.56
    Sd                       7.65        3.54
These results suggest that our new model generates data which is a reasonably close fit to the data generated by real test-takers. However, the implications of this new data set suggest rather different conclusions from the ones we drew from our earlier attempt at modelling associations in L2 speakers. In our earlier model, the chances of finding a hit in a random set of five words is fairly low, and the only way to account for the relatively high numbers of matches reported by our test-takers is to argue that the number of links between words must be correspondingly high. In our new model, relatively high hit rates can be achieved with relatively small numbers of connections between the words in the lexicon. Clearly, this changes the game substantially. In our new model, it looks as though 11 links per word is
sufficient to account for the native speaker data, while 7 links per word is sufficient to account for the data generated by the L2 speakers. The difference between L1 speakers and L2 speakers is still real, but it now appears to be much smaller than the difference we were suggesting in our earlier account.
Discussion

Two general points seem to emerge out of this re-evaluation of our 2002 data.

1: Simulations as a test bed for research tools

The first general point is that simulations clearly have a role in the development of testing instruments that has not been exploited hitherto in SLA research. Specifically, simulations can sometimes be useful because they allow us to identify what experimental conditions are necessary to test the hypotheses that we are working with. As a simple example of this problem, consider the large amount of research that has looked at incidental vocabulary acquisition in L2. Typically, this work has measured take-up of vocabulary as a result of reading extended texts (cf. Huckin & Coady 1999), and it has done this by presenting subjects with a short list of target words that the subjects did not know when they started to read the text, and measuring whether they did know them as a result of having read the text. On the face of it, this looks like a good design: if subjects acquire words by reading, then they should have higher scores on the post-test than they do on the pre-test. In practice, however, most studies of this type report only minimal increases in vocabulary knowledge. It is not difficult to work out why this result keeps recurring. Simply by quantifying what we mean by incidental vocabulary acquisition, we can easily see that some testing instruments will not be sensitive enough to pick up vocabulary growth even if it really exists. Let us suppose that there is a 5% chance of a subject picking up a new word from a single encounter in a text (Nagy & Herman 1985). Now suppose that we run an experiment that involves 20 target vocabulary items, each repeated once in the text that the subjects read. How much improvement would we expect to find between the pre and post test? The answer is about 5% of 20 words, that is, just one word would be expected to show an improvement.
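This back-of-envelope calculation can be made explicit in a few lines. The 5% pick-up figure is the Nagy & Herman estimate cited above; the function name is illustrative:

```python
# Expected vocabulary gain under the incidental-learning assumptions in
# the text: a fixed per-encounter pick-up probability applied to each of
# n target words.
def expected_gain(n_targets, p_pickup=0.05, encounters=1):
    """Expected number of newly known words on the post-test."""
    p_learned = 1 - (1 - p_pickup) ** encounters
    return round(n_targets * p_learned, 2)

print(expected_gain(20))    # prints 1.0 - one word, as in the text
print(expected_gain(100))   # prints 5.0 - five words even with 100 targets
```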
Clearly, a difference of this size is very likely to be swamped by variation within the subject group, and would be very unlikely to show up as a reliable improvement in a set of experimental scores. Even if we tested a hundred words in this way, we would still expect to find an increase of only five words between pre-test and post-test. This simple quantitative model, then, allows us to say unequivocally that a testing instrument that uses a small number of target words (e.g., the twelve item
test used in Hulstijn 1992 expt 1) is not going to be adequate as a tool for evaluating this particular hypothesis. When it comes to evaluating more complex claims about lexical organisation, it becomes more difficult to develop relevant armchair simulations like the one described in the previous paragraph. For example, most people agree that the lexical organisation of L2 speakers is different from the lexical organisation of L1 speakers, and that the association structure of advanced L2 speakers will be rather different from that of less advanced L2 speakers, but it has proved rather difficult to pin down these differences in practice. The data reported in this paper suggest that word association tasks might be capable in principle of demonstrating a difference of this sort, but considerable care needs to be taken in designing experiments to test these claims.

Suppose, for instance, that we have a general hypothesis that lexical organization becomes more complex when language learners spend an extended period in a country where their target language is spoken, and let us suppose that we want to test this hypothesis using a word association recognition task. Now let us suppose that we have a group of students who, before spending a period abroad, score 38% on a test like the one we described in Section III – i.e., they behave as if their L2 lexicons have an average of six links per word. If, as the simulations suggest, the difference between native speakers and advanced learners is typically about four additional associative links, then we might expect an extended period abroad to increase the average number of links slightly, so let us hypothesise that our learners increase their lexical links from six to seven per word as a result of their stay abroad. With a test like the one Wilks and Meara (2002) used, this improvement ought to show up as an increase in the test scores from 38% to 44%.
On a 50 item test, however, this difference is a mere three additional hits – an improvement that would be very difficult to detect in noisy data. Even with a 100 item test, the model suggests that we would expect to measure an improvement of only 6 additional hits. The point here is that relatively small differences in the overt behaviour of the test-takers could be pointers to quite significant changes in the underlying structure of their lexicons. Arguments like this suggest that developing good test instruments for evaluating hypotheses about vocabulary development may be more difficult than we have typically supposed. Simply comparing the associations of L2 learners and native speakers, using ad hoc lists of words, as much of the research in this area has done, begins to look like a very unsatisfactory approach to assessing L2 lexical competence. Indeed, blunt research tools of this kind may be intrinsically incapable of evaluating the hypotheses we think we are researching. Careful simulation studies provide a way of testing out the capabilities of these instruments before they are widely used in real experiments.
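The worked example above can be checked with a few lines of code. The p(hit) values are simply the figures from Table 3; the function itself is purely illustrative:

```python
# Converting an assumed number of links per word into an expected score
# on an n-item recognition task, using the p(hit) figures from Table 3.
P_HIT = {6: 0.382, 7: 0.442, 11: 0.772}  # from Table 3

def expected_score(n_links, n_items):
    """Expected number of hits on an n_items-item recognition test."""
    return P_HIT[n_links] * n_items

before = expected_score(6, 50)   # about 19.1 hits (38%)
after = expected_score(7, 50)    # about 22.1 hits (44%)
print(round(after - before, 1))  # prints 3.0 - only three extra hits
```

The same calculation with n_items=100 gives a difference of only six hits, which is the point made in the text: a substantial change in underlying structure surfaces as a very small change in observed scores.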
2: The complexities of simulation research

The second general point to emerge from this study is that working with simulations is perhaps not as straightforward as it might seem to be. Clearly, the results we get from a simulation study are only as good as the assumptions that go into the simulator. If the assumptions are totally wrong, then the results of the simulation that uses them will be completely meaningless. Fortunately, cases of this sort are generally easy to identify. More difficult to work with are cases, like the one reported here, where the assumptions built into the simulation model are not exactly wrong, but limit the possible range of outcomes in some significant way.

In our 2002 simulations, we chose to model association processes by looking for stimulus words that occurred in the association list of other stimulus words. This was one of a number of plausible implementations that we could have chosen as a way of modelling the association process, and at the time, it did not strike us as problematical or contentious to model these processes in this way. With hindsight, it is obvious that the very large number of connections between words implied by the results of that simulation should have made us query this particular set of assumptions sooner. The fact that we did not do this perhaps suggests that researching with simulations is very different from the kind of research that we are used to in SLA. Typically, SLA research involves developing a loosely formulated but broadly testable theoretical position, and evaluating it by collecting data from groups of L2 learners. Because empirical data sets of this kind are generally hard to collect, they are highly valued, and we often accept data sets which are partial and fragmentary – small groups of subjects, small numbers of stimulus items, and so on – as a compromise.
At the same time, empirical data sets of this sort are usually treated uncritically: the profession as a whole tends to take empirical findings very much at face value and it is rare to find data of this sort exposed to close critical scrutiny. Very often we end up accepting data that are really quite unsatisfactory as evidence in support of one theoretical position rather than another. In simulation research, the relationship between data and theory is much more complex. It is actually very easy to generate vast quantities of data, and because of this, it suddenly becomes very easy to evaluate many different theoretical positions in a way which is logistically impossible using experiments with real L2 learners. However, this facility comes at a price, and the price is that simulations require us to be absolutely explicit about the processes that we think we are modelling, and the actual implementation of a model becomes critical to the research process. The corollary of this is that we need to do much more ground work at the level of theory than we need to do when we are dealing with data collected from human test-takers. Specifically, we can afford to pay much more attention to the details of our theoretical assumptions, simply because the implications of these
assumptions can be evaluated much more easily. Potentially, this introduces a significant shift into the way we think about second language acquisition in general, and second language lexicons in particular. Simulations force us to explore the effects of building different assumptions into our theories – by modelling in different ways the processes we are interested in, we can explore in considerable depth exactly how these processes contribute to the performance of our model. This, in turn, can often force us to re-evaluate some of the fundamental assumptions that we are working with, and this process, when pushed hard enough, can significantly alter the way we look at things.

In these studies, we have been attempting to model the underlying structures that generate word associations. Most work in this field takes it for granted that word associations are the result of direct connections between one word and another in a mental lexicon. Our first simulation showed that this specific assumption only works if we allow very large numbers of connections between words. Our second simulation showed that a model with fewer connections will also work, but only if we significantly redefine what we mean by an association. On balance, we think that the second model is a more plausible model than the one we developed in our 2002 paper: it is mathematically more tractable than the earlier model, and it suggests that the learning burden faced by L2 learners is perhaps not that big. However, it also suggests that previous work on L2 word associations may have been very naïve in assuming that word association behaviour was a direct reflection of the immediate connections between words. If this assumption is not correct, then much of the early work on word associations – and not just the work on L2 word associations – may need to be re-evaluated.
This is clearly a far-reaching outcome, and bearing in mind the relatively small scale of these simulations, it illustrates the power of this kind of work.
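The two competing assumptions discussed above – associations read directly off single links, versus associations mediated by short chains of links – can be illustrated with a toy graph model. This is a minimal sketch only: the vocabulary size, the number of links per word, and all function names are illustrative assumptions, not the actual implementation used in the simulations reported here.

```python
import random

def build_lexicon(n_words, links_per_word, seed=0):
    """Toy lexicon: each word gets a fixed number of random outgoing links."""
    rng = random.Random(seed)
    return {w: rng.sample([v for v in range(n_words) if v != w], links_per_word)
            for w in range(n_words)}

def direct_association(lexicon, word, rng):
    """Model 1: a response must be a word directly linked to the stimulus."""
    return rng.choice(lexicon[word])

def chained_association(lexicon, word, rng):
    """Model 2: a response is any word reachable in two steps, so a much
    sparser network can still generate a rich set of associates."""
    intermediate = rng.choice(lexicon[word])
    return rng.choice(lexicon[intermediate])
```

With only a handful of direct links per word, Model 1 quickly exhausts its possible responses, whereas Model 2's two-step definition multiplies the candidate response set – which is, in essence, the trade-off the two simulations explore.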
Conclusion

Ideally, simulation research is an iterative process, in which we formulate many different models and test them against each other in a competitive environment. Unfortunately, we do not really have a tradition of this kind of work in SLA. Research in L2 vocabulary acquisition, at least, is dominated by a relatively informal attitude towards theory, and the literature predominantly consists of a huge number of one-off studies (see Meara 1993), which tend to reinforce the general lack of theory. These two factors have prevented the development of the kind of on-going, critical dialogue that is required to allow the vocabulary field to move forward in theoretical terms. The position is made all the more difficult by the fact that few people have the computational skills necessary to work critically through
Chapter 7. A further note on simulating word association behaviour in an L2
the implications of a simulation model, or to suggest how the behaviour of a model might be affected by changing its fundamental assumptions. The obvious solution to this problem would be for simulation research to become a standard component in the training received by young SLA researchers. We hope that this paper will stimulate some readers to think about providing training of this kind in future.
section 4

Bibliographical resources for word associations in an L2

Introduction

This chapter lists all the main studies of L2 word associations that I am aware of. Most of the entries include a short abstract which highlights the key findings. Entries which do not include an abstract are items which have been cited by other researchers, but which I have failed to obtain a copy of. A few other entries have abstracts enclosed in brackets. These cases are mostly PhD theses whose abstracts have been provided by the authors, usually on a web site, and are generally less critical than the other abstracts in this chapter. Should the authors of these papers read this book, the Swansea Archive would be very grateful to receive copies of their work.

The bibliography includes a number of key works which anybody working in this area should be aware of, mainly because they are methodologically innovative, or introduce new theoretical ideas. These works are marked with a ** sign in the text that follows. I have not been generous in awarding these stars, so some explanation of why I have singled out just a handful of papers seems to be in order.

Wallace Lambert’s 1956 paper was the first piece of published work to use word association techniques with L2 speakers, and it sparked off a flurry of other studies which used this technique to evaluate the lexical performance of “bilinguals” – usually French Canadian subjects with varying degrees of proficiency in English and French. Surprisingly, this work was not taken up by mainstream L2 researchers, despite its methodological rigour and the meticulous reporting of the data. Lambert’s work was very influential on my own approach to L2 vocabulary performance, and was largely responsible for my interest in developing tests which were applicable to both L1 and L2 speakers.

Politzer (1978) is included largely for negative reasons.
This paper was the first to suggest that L2 word associations might be characterised by a predominance of syntagmatic responses, and that they might resemble the associations made by children to L1 stimulus words in this respect. Despite the fact that this result contradicted earlier studies (e.g., Davis & Wertheimer 1967), Politzer’s
work was extremely influential in subsequent L2 research. I suspect that this was at least partly due to the fact that Politzer was a Big Name in Applied Linguistics at the time, and his results were taken very much at face value by L2 researchers, and not subjected to much critical analysis because of this. Ultimately this line of research turned out to be a blind alley, but it remains important for historical reasons.

The papers by Riegel represent a much more interesting approach to L2 word associations, but are cited in the research only rarely. I think there are two reasons for this. The first is that Riegel’s 1968 paper appears at first sight to be a dauntingly difficult mathematical approach to bilingual lexicons, and for this reason, most readers may perhaps feel that the work is rather inaccessible. In fact, the mathematical models that Riegel uses are not as difficult as they appear, and readers interested in getting over this initial obstacle might find the simplified account of Riegel’s work in Meara (2001) a good place to start. The second reason why Riegel’s work is not as influential as it should be is that the two other papers listed in this chapter deal mainly with associations in L2 German and L2 Spanish, rather than the more familiar English as L2 situation. In practice, this has meant that the methodological innovations pioneered by Riegel have not become standard in the research literature. They deserve to be much better known and more widely used.

It is slightly embarrassing to include one of my own papers in this list of important sources. Wilks, Meara & Wolter (2005) – reproduced as chapter six of this book – is one of a trio of studies which have attempted to go beyond the superficial data provided by word association experiments and explore what these data might imply in terms of the overall organisation and development of L2 lexicons.
We argued that L2 word association data might provide a solution to the vexed question of how we might operationalise the idea of vocabulary depth – the lack of a decent measure of vocabulary depth is currently a major obstacle in the development of theories about lexical development. This paper makes use of computer simulation techniques which seem to me to open up whole new areas of research in second language vocabulary acquisition. As Hunt & Beglar (2005) point out, there is a serious shortage of theoretical models concerning vocabulary acquisition in an L2. The simulations that are described in this chapter are not easy to interpret, but they do offer a way of making explicit and overt some of the assumptions that vocabulary researchers appear to take for granted, and may help to generate new theories about the critical features of growing L2 vocabularies.

One of the things that stands out from this bibliography is the large amount of work carried out in the period 1950–1980 by psychologists working with bilinguals and with learners of languages other than English. Much of this work on word associations has been published in obscure places, and for this reason it is not as well known as it should be, but this does not really excuse the fact that this research is rarely cited in the context of theoretical accounts of vocabulary acquisition. Nation’s
2001 survey, for instance, which is in many respects the definitive summary of research on vocabulary teaching, mentions only a handful of these studies. I hope that this bibliography will make researchers more aware of the large amount of work that is available, and serve to broaden the range of references which appear over and over again in the more recent publications.
chapter 8
Word associations in a second language

An annotated bibliography

Amer, A.A.M. 1980. A Comparative Study of English and Egyptian Word Associations and their Implications for the Teaching of English to Egyptian Learners. Ph.D. dissertation, London University, Institute of Education.
The core of this thesis is a word association test of 250 words administered to 385 English children and a parallel test in Arabic administered to 387 Egyptian school children. The data in the two languages are compared in terms of similarity of form class, commonality of responses and different types of response, and there is a lengthy discussion of the cultural differences which emerge in the responses to certain words. Amer suggests a number of ways in which these differences could be exploited as a natural part of language teaching.

Appel, R. 1989. Het tweetalige lexicon: Woordassociaties van Turkse kinderen in het Turks en in het Nederlands. (The bilingual lexicon: Turkish children’s word associations in Turkish and Dutch). Toegepaste Taalwetenschap in Artikelen 30: 131–142.
Appel asked 17 L1 Turkish children to produce multiple associations to a set of Dutch words and an equivalent set of Turkish words. The children produced more responses to Turkish stimuli, but more varied responses to Dutch stimuli. For each pair of stimulus words, about 35% of the responses were common to both languages. Some of the response sets are discussed in detail.

Arkwright, T. & Viau, A. 1974. Les processus d’association chez les bilingues. (The process of association in bilinguals). Working Papers in Bilingualism 2: 57–67.
Arkwright and Viau compared the ability of compound and coordinate bilinguals in English and French to recover key concepts when presented with a list of associations most frequently elicited by these words. Monolingual and bilingual lists were used, but no significant differences between the groups were found. This result contradicts the results of Lambert and Rawlings (1969).
Bagger Nissen, H. & Henriksen, B. 2006. Word class influence on word association test results. International Journal of Applied Linguistics 16(3): 389–408.
Bagger Nissen and Henriksen collected word association responses to sets of Danish and English words from L1 Danish learners of English. Each set consisted of 15 nouns, 15 verbs and 15 adjectives. Two responses were collected for each stimulus word. Responses were classified as paradigmatic, syntagmatic, phonological or other. The results suggested that syntagmatic responses predominated in both the Danish and the English data. No major differences between subjects’ first responses and their second responses were recorded in either language. Some differences due to word class were found: noun stimuli tended to elicit more paradigmatic responses, and verbs were more likely to elicit a syntagmatic response. The paper discusses a number of methodological issues raised by this work, presents some possible explanations for why associations might be affected by the word class of the stimuli, and questions the usefulness of the syntagmatic-paradigmatic shift idea.

Barfield, A. 2005. Complications with collocations. JABAET Journal 9: 85–103.
Barfield notes that research on collocational competence is hampered by a lack of reliable testing tools, and speculates whether a word association test might provide an alternative approach to this question. This idea is examined in a study where three L1 Japanese learners of English are tested on a series of six multiple word association tests and collocation tests for a set of 12 high frequency words in English. Variations in the test results are discussed in detail. Barfield argues that the word association data and the collocation tests are not entirely co-terminous, but taken together they seem to provide better insights about collocation development than conventional collocation tests do.

Beheydt, L. 2007. Gestandardiseerde receptive woordschattoetsen. (Standardised tests of receptive vocabulary). In Tussen Taal, Spelling en Onderwijs, D. Sandra, R. Rymenans, P. Cuvelier & P. van Petegem (Eds). Gent: Academia Press.
Beheydt notes the importance of reliable tests of vocabulary knowledge for L2 learners. He reviews the main features of a number of test types – translation tests, Yes/No tests, Eyckmans’ Recognition Based Vocabulary Test, several multiple choice formats and the Levels Test for vocabulary size, and a number of association based tests for assessing depth of word knowledge. These include Read’s Associates Test, and similar tests developed by Neven, Schoonen and Qian. Bogaards’ Euralex French Test is also discussed.

Beheydt, L. 2007. De dubbele pregnante contexttoets als productieve woordenschattoets voor Nederlands als vreemde taal. (The double pregnant context test as a productive vocabulary test for Dutch as a foreign language). In Nederlandistiek in Context, J. Fenoulhet, A.J. Gelderblom, M. Kriste, J. Lalleman, L. Missine & J. Pekelder (Eds). Amsterdam: Rozenberg.
Beheydt briefly describes the main types of vocabulary tests in current use. He goes on to outline a “double pregnant context” test in which items consist of two sentences each containing a gap that can be filled by a single lexical form. This test format has the advantage that it requires subjects to show knowledge of the various meanings of a word, the various grammatical features of a word, and the association patterns the word enters into. Some problems with the construction of test items are noted.

Bol, E. & Carpay, J.A.M. 1972. Der Semantisierungprozess im Fremdsprachenunterricht: Lernpsychologie, Experimente und methodische Folgerungen. (The process of semanticization in foreign language teaching: the psychology of learning, experiments and methodological conclusions). Praxis des Neusprachlichen Unterrichts 19(2): 119–133.
An account of an experiment using word associations made by native German speakers in foreign languages. The results of the investigation show that: (1) the more experience students have with the foreign language the less they respond with formal responses and translations; (2) students instructed according to the grammar-translation method are more likely to respond with formally similar responses; and (3) nearly 60 per cent of all responses are semantic responses, and these do not take a significantly longer time to produce than translations do.

Carter, R. 1983. ‘You look nice and weedy these days’: Lexical associations, lexicography and the foreign language learner. Journal of Applied Language Study 1(2): 172–189.
A general review of the notion of core vocabulary, together with a discussion of how information about coreness might be displayed in a dictionary designed for L2 speakers. Carter argues that meaning can be largely defined in terms of Osgood’s semantic differential, but that this system needs to be supplemented by an additional dimension of formality. Illustrations of how this might work in practice are provided.
Champagnol, R. 1974. Association verbale, structuration et rappel libre bilingues. (Word associations, structuring and free recall in bilinguals). Psychologie Française 19: 83–100.
An account of an experiment in which two groups of French schoolchildren learning English were tested on a word association task and a list-learning task. More associations were produced by the more advanced learners, and a greater number were produced by both groups when the language of response was French. For the list-learning task, fifteen learning trials were allowed, and measures of subjective organization were taken in addition to the number of words recalled correctly on each trial. Subjective organization was more consistent for
the advanced subjects, but there were no significant differences associated with language. Greater subjective organization appears to correlate with better recall scores. Champagnol suggests that the correlations between number of associations and recall score are difficult to interpret: fluency with associations correlates negatively with recall scores in both languages.

Cohen, A. & Aphek, E. 1980. Retention of second language vocabulary over time: Investigating the role of mnemonic associations. System 8: 221–235.
A general review of the effects of learning vocabulary by associational methods. Cohen and Aphek argue that these methods are almost always superior to other methods. An experiment is reported in which 26 learners of Hebrew were trained to generate associations to new words, and their use of these associations was tested over a one month period. The behaviour of two subjects is reported in detail, as are the associations made to two individual words. Overall, associations seem to play an important role in recall. However, even with training, subjects did not always choose to use associative methods to remember words, and this did not produce a marked deterioration in performance.

Crable, E. 1975. A Comparison of Thought Processes as Measured by Paradigmatic Association, GRE, Grade Point Average and Faculty Ratings of Foreign-native-graduate Students. Ph.D. dissertation, University of Georgia.
No abstract.

Dalrymple-Alford, E. & Aamiry, A. 1982. Word associations of bilinguals. Psychonomic Science 21: 319–320.
Arabic/English bilinguals were given a word association test in which certain stimulus words were repeated either in the same language or in translation. Subjects were quite likely to give identical responses to repeated stimuli in the same language, but translation equivalents often elicited different responses, and the likelihood of stable responses across languages was thus low.
The authors argue that the differences are too great to be explained merely in terms of random choice among a hierarchy of responses, and they see this finding as evidence against the single store hypothesis.

Dalrymple-Alford, E.C. 1982. Associations of bilinguals to synonyms and translation equivalent words. Current Psychological Research 2: 181–186.
English-French bilinguals were asked to produce word associations to single words, and then subsequently to either a) the same words, b) synonyms, c) direct translations, or d) translations of synonyms. The number of identical or thematically related responses was greatest when subjects responded to the same stimulus word (a), and greater for direct translations (c) than for the other cases (b and d). The paper argues that these data are best interpreted as reflecting differences in the
semantic overlap between words, and that they do not support the existence of language-specific associative networks.

Davis, B. J. & Wertheimer, M. 1967. Some determinants of associations to French and English words. Journal of Verbal Learning and Verbal Behavior 6: 574–581.
A word association test using the continuous association method to stimulus words in English, French, and a set of ambiguous items. This study compares language of stimuli, language of instruction and part of speech across four groups of increasing proficiency. The results show that instructions in French produce a higher proportion of French responses, and that more fluent subjects gave a higher proportion of French words. The bulk of the responses were classified as paradigmatic.

de Groot, A.M.B. 1992. Determinants of word translation. Journal of Experimental Psychology: Learning, Memory and Cognition 18: 1001–1018.
De Groot outlines a model of lexical representations in which the meanings of words are represented by a cluster of meaning components. Different words, including words in different languages, will share these components to differing extents. De Groot argues that this simple model provides an explanatory framework for a large range of experimental data. In particular, results from studies of word translation, translation recognition, semantic priming, word association, translation priming, and relation assessment, can all be handled in this framework.

Erdmenger, M. 1985. Word acquisition and vocabulary structure in third year EFL learners. IRAL 23(2): 159–164.
248 German speaking learners of English were asked to produce multiple word associations connected with TRAVELLING. Erdmenger claims that the resulting word association patterns are basically identical to what you would expect in the L1. He goes on to argue that there is a case for using L1 association patterns as a way of teaching L2 vocabulary.

Fishman, J. & Cooper, R. 1969. Alternative measures of bilingualism.
Journal of Verbal Learning and Verbal Behavior 8: 276–282. A series of tests in English and Spanish was given to a group of English/Spanish bilinguals; these tests included word naming, word associations, word frequency estimation, subjective self-assessment, and a series of phonologically oriented elicitation tests. Interrelationships between these variables, demographic variables, assessments of accentedness, English ability, Spanish ability and reading were computed, and seven principal factors were extracted. Self-assessment emerged as one of the best predictors on the last four variables, but the vocabulary tests also account for a large proportion of the total variance.
Fitzpatrick, T. 2000. Using word association techniques to measure productive vocabulary in a second language. Language Testing Update 27: 64–69.
A brief summary of the work reported more fully in Meara and Fitzpatrick (2000).

Fitzpatrick, T. 2003. Eliciting and Measuring Productive Vocabulary Using Word Association Techniques and Frequency Bands. Ph.D. dissertation, University of Wales, Swansea.
An extended account of the development of Fitzpatrick’s Lex30 test.

Fitzpatrick, T. 2006. Habits and rabbits: word associations and the L2 lexicon. EUROSLA Yearbook 6: 121–145.
Fitzpatrick asked a group of L1 English speakers and a group of mixed L1 learners of English to generate a single word association response to a set of 60 words taken from the Academic Word List. After a retrospective interview with each subject, she then categorises each of these responses into 16 detailed response types, and compares the distribution of these response types across the two groups. Some significant differences between the two groups are identified. The data strongly suggest that L1 respondents are more likely to produce “position-based” responses (i.e., syntagmatic responses), and more defining synonyms. Contextual associations and loose conceptual associations were more likely to be produced by non-native speakers.

Fitzpatrick, T. 2007. Word associations: Unpacking the assumptions. International Journal of Applied Linguistics 17(3): 319–331.
Fitzpatrick reports a study of 30 L1 English speakers making word association responses to words from the Academic Word List. Although the responses they produce on two separate testing occasions are different, the overall pattern of the response distributions for individual subjects is remarkably consistent across the two testings. Fitzpatrick speculates that response type preference might be a significant factor in L2 word associations too.

Fitzpatrick, T. & Meara, P.M. 2004. Exploring the validity of a test of productive vocabulary.
Vigo International Journal of Applied Linguistics 1: 55–74.
Fitzpatrick and Meara describe their Lex30 test. A test-retest study suggests that the test produces reliable scores, even though subjects produce different responses on each test occasion. Lex30 scores also correlate reasonably well with other vocabulary tests, specifically a test of receptive vocabulary size, a translation test and Laufer and Nation’s Controlled Productive Levels Test. The pattern of correlations among these tests is not straightforward, and leads the authors to question some of the assumptions underlying the concept of productive vocabulary.

Fleming, G. 1966. Meaning, meaningfulness and association in the context of language teaching media. Praxis des Neusprachlichen Unterrichts 13(2).
No abstract.
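The scoring principle behind Lex30 – crediting association responses that fall outside a high-frequency band, so that infrequent vocabulary is what earns points – can be sketched roughly as follows. This is an illustrative sketch only: the tiny word set stands in for the 1,000-word frequency band the published test uses, and the function name is invented for this example.

```python
# Stand-in for the first 1,000-word frequency band (illustrative only;
# the real test uses a full frequency-band word list).
HIGH_FREQUENCY_BAND = {"dog", "cat", "house", "run", "water", "big"}

def lex30_style_score(responses):
    """Score one point per unique response outside the high-frequency band,
    so only less frequent vocabulary earns credit."""
    unique = {w.lower() for w in responses}
    return sum(1 for w in unique if w not in HIGH_FREQUENCY_BAND)
```

For instance, `lex30_style_score(["dog", "kennel", "leash", "dog"])` would credit only "kennel" and "leash", since "dog" falls inside the (stand-in) high-frequency band.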
Franceschina, F.S.N. 2004. Review of S. Namei, The Bilingual Lexicon from a Developmental Perspective: A Word Association Study of Persian-Swedish Bilinguals. Multilingua 23: 197–205.
A detailed and extensive review of the work reported in Namei 2002.

Gekoski, W. 1980. Language acquisition context and language organization in bilinguals. Journal of Psycholinguistic Research 9: 429–449.
Compound and coordinate English-Spanish and Spanish-English bilinguals at three levels of proficiency were tested on a series of word association tasks. Response times and proportion of equivalent responses across languages were assessed. Results showed that compound bilinguals responded faster and gave more equivalent responses, but the differences were actually very small. Spanish dominant bilinguals responded more slowly than English dominant subjects, and they also gave fewer equivalent responses. Gekoski argues that these data do not offer any strong support for the usefulness of the compound/co-ordinate distinction.

Gekoski, W. 1969. Associative and Translation Habits of Bilinguals as a Function of Language Acquisition Context. Ph.D. dissertation, University of Michigan.
[A study of free and restricted word associations in English and Spanish, with special reference to the level of proficiency and type of bilingualism of the subjects. Compound bilinguals gave higher proportions of equivalent responses to equivalent stimuli in the two languages, but there were no significant differences associated with proficiency level and this variable. Native English speakers produced a higher percentage of response equivalents than native Spanish speakers, and also responded faster. All subjects responded fastest when stimulus and response language were the native language, and slowest when both were in the foreign language. Restricted associations produced faster response times and higher levels of response equivalence.] Dissertation Abstracts: 30. lb. 404.

Gekoski, W. 1970.
Effects of language acquisition contexts on semantic processing in bilinguals. Proceedings of the American Psychological Association 5: 487–488.
A brief account of the work reported more fully in Gekoski 1969.

Gerganov, E. & Taseva-Rangelova, K. 1982. The impact of association value and number of syllables of English words on memorization in teaching English to Bulgarian learners. Supstavitelno Ezikoznanie 7(4): 3–12.
No abstract.

Grabois, H. 1996. Word association methodology in a cross-linguistic study of lexicon. Papers in Second Language Acquisition and Bilingualism, Cornell Working Papers in Linguistics 14: 85–96.
Grabois asked small groups of L1 Spanish, L1 French subjects, and bilingual subjects to make continuous associations to a set of 22 stimulus words in French
or Spanish. The results suggest that there may be some differences between the two language groups, but it is unclear whether the differences are completely reliable. Grabois interprets the data in terms of Vygotskyan approaches to psycholinguistics, and suggests that a more precise set of tools, capable of identifying the characteristics of lexical networks, may be needed.

Grabois, H. 1997. Love and Power: Word Associations, Lexical Organization and L2 Acquisition. Ph.D. dissertation, Cornell University.
See next entry.

Grabois, H. 1999. The convergence of sociocultural theory and cognitive linguistics. Lexical semantics and the L2 acquisition of love, fear and happiness. In Cultural constructions of emotional substrates, G.B. Palmer & D.J. Occhi (Eds), 201–233. Amsterdam: John Benjamins.
Grabois worked with five groups of subjects: a group of L1 English speakers, a group of L1 Spanish speakers, and three groups of L1 English speakers with varying degrees of proficiency in Spanish. Subjects completed a word association task based on Szalay and Brant’s chaining procedure for four initial stimulus words: AMOR (love), FELICIDAD (happiness), MIEDO (fear) and MUERTE (death). Grabois constructed word association networks for the responses, and correlated these networks with the responses of the L1 Spanish speakers. Expert L2 speakers, with extensive residence in a Spanish speaking country, achieved consistently higher correlations than the other groups. The paper also provides a qualitative analysis of the responses for each of the initial stimulus words. It concludes that long term residents in an L2 culture do reorganise their L2 lexicons so that they approximate those of L1 speakers.

Grainger, J. & Beauvillain, C. 1988. Associative priming in bilinguals: Some limits of interlingual facilitation effects. Canadian Journal of Psychology 42: 261–273.
Grainger and Beauvillain report two experiments in which presenting a word in one language can make it easier to recognise a related word in a second language. The size of this effect depends on the length of time between the presentation of the two words. When the gap is short, presenting a word in one language can affect recognition in the same language, but facilitation between languages does not occur. With longer gaps, facilitation both between and within languages occurs, but the between language effect is weaker.

Greidanus, T., Bogaards, P., van der Linden, E., Nienhuis, L. & de Wolf, T. 2004. The construction and validation of a deep word knowledge test for advanced learners of French. In Vocabulary in a Second Language: Selection, Acquisition and Testing [Language Learning & Language Teaching 10], P. Bogaards & B. Laufer (Eds), 191–208. Amsterdam: John Benjamins.
This paper reports the development of a French version of Read’s Word Associates Test. Two versions of the test were used with L1 Dutch speakers. The test appears to have good characteristics, and distinguishes between groups where we would expect it to. It also has modest correlations with two other tests of French vocabulary.

Greidanus, T. & Nienhuis, L. 2001. Testing the quality of word knowledge in L2 by means of word associations: types of distractors and types of associations. Modern Language Journal 85: 567–577.
Greidanus and Nienhuis asked L1 Dutch speakers to complete a word association test in French. The format of this test is similar to the format used in Read’s Associates Test. Results showed that advanced L2 speakers performed better than weaker subjects; stimulus sets containing semantically related distractors are harder than sets which do not contain these distractors; subjects tended to prefer paradigmatic responses over other types; high frequency words tended to generate better results than low frequency words. The authors argue that depth and breadth may be less independent than some people have suggested.

Hammerly, H. 1974. Primary and secondary associations with visual aids as semantic conveyors. IRAL 12: 118–125.
An account of two small pilot studies and one larger study in which 100 L1 English subjects were shown a single picture and taught the German word that named it. Subjects were subsequently asked what they thought the word meant, and what associations the picture evoked. Analysis of this data showed that the pictures were not reliably labelled in the same way by all participants, and that the primary association to the pictures was generally a word in the native language. Hammerly suggests that these data do not support the view that foreign language words can be effectively taught using picture stimuli alone.

Heuer, H. 1973. Wortassoziationen in der Fremdsprachendidaktik. (Word associations in foreign language teaching).
In Neusser Vorträge zur Fremdsprachen didaktik, W. Hüllen (Ed.). Berlin: Cornelsen-Velhagen & Klasing. A large scale study of the associations produced in English by 1400 German schoolchildren. Heuer describes association tests and criticizes the use that has been made of them. The responses of the subjects are analysed in some detail, particularly in terms of the proportion of responses that account for the primary and secondary responses of the groups, and the habit strength of the principal responses. The responses are also compared with those produced by American schoolchildren of a comparable age. The author concludes by suggesting that an association dictionary using key words which are congruent between native and foreign speakers might be produced.
Henriksen, B. 2008. Declarative lexical knowledge. In Vocabulary and Writing in a First and Second Language, D. Albrechtsen, K. Haastrup & B. Henriksen (Eds), 22–66. Basingstoke: Palgrave Macmillan.
Henriksen provides a detailed critical review of studies of word association in an L2, and the factors which affect L2 word association responses. She reports a study in which L1 Danish learners of English completed two word association tests. In task 1, subjects wrote down two associations for a set of 48 words from the Kent-Rosanoff list. In task 2, the word connection task, subjects saw a stimulus word and 10 other words, five of which were connected to the stimulus word. Subjects were asked to identify these words. In general, the tasks discriminated low level learners from high level learners, but were less good at discriminating between higher level groups. Henriksen compares these data to NS base-line scores. She notes a number of problems in handling data of this sort, and highlights the need for reliable overall word association measures.

Hinofotis, F.B. 1977. Lexical dominance: A case study of English and Greek. In Proceedings of the 1st International Conference on Frontiers in Language Proficiency and Dominance Testing, J.E. Redden (Ed.). Carbondale IL.
A study of two Greek children resident in the United States. Three tests were used to assess lexical dominance: (1) a picture vocabulary test, where naming of pictures was required; (2) a continuous word association test, where the number of responses produced was scored; and (3) a restricted word association test which was scored in the same way. The adolescent boy subject scored practically identically in both languages for all tests. The younger girl subject showed strong English dominance, and there was some evidence that she had lost control of a large part of her native Greek vocabulary.

Horowitz, L.M. & Gordon, A.M. 1972. Associative symmetry and second language learning.
Journal of Educational Psychology 63(3): 287–294. Horowitz and Gordon argue that the goal of vocabulary learning is for the NL words to evoke the TL words quickly and effectively. In a paired associate paradigm, however, it should be easier to learn the NL words evoked by the TL stimuli. They suggest that words should be taught in this way, and that this learning should be followed by independent study of the TL words. This serves to make the TL words more available, and thus automatically enhances the availability of the NL-TL connections. Two experiments where subjects learned short lists of Japanese words broadly support this claim. Ife, A., Vives Boix, G. & Meara, P.M. 2000. The impact of study abroad on the vocabulary development of different proficiency groups. Spanish Applied Linguistics 4(1): 55–84.
Chapter 8. Word associations in a second language
The authors tested 36 L1 English learners of Spanish on a receptive word association test (A3VT) before and after an extended period of residence abroad. The A3VT test comprises 120 items, each consisting of three words. Subjects’ task is to identify the pair of words which have a strong associational relationship. Results show that scores on the test improve greatly as a result of residence abroad, and this improvement is particularly strong in more advanced subjects. Ito, M. & Nakata, K. 1993. Semantic and syntactic features of word associations in three languages by Japanese students. Bulletin of the Hakaba Summer Institute of Linguistics, 16–29. No abstract. Kent, J.-P. 1984. Woordassociatie en vreemde-talenonderwijs. (Word associations in foreign language teaching). Levende Talen 395: 525–530. This paper presents a general outline of word association theory, together with an extended set of examples from Dutch and Romanche, where response patterns do not fall into a one-to-one correspondence. Some ways of using word associations in class are outlined, and Kent concludes with some technical notes on the use of word associations in research: the choice of stimulus words, the number of words in a test, and alternative ways of eliciting responses. Kolers, P.A. Interlingual word associations. Journal of Verbal Learning and Verbal Behavior 2: 291–300. Bilingual English/German, English/Spanish and English/Thai subjects were given a word association task in which they responded to 55 English words and their NL equivalents in both English and their NL. Results showed that only about one third of the responses in one language translated those in the other. Of these, about two thirds were lexically similar in the interlingual tests. The proportions were higher with concrete nouns than with abstract words. Roughly parallel patterns of responses were found in English. Kruse, H., Pankhurst, J. & Sharwood Smith, M. 1987.
A multiple word association probe in second language acquisition research. Studies in Second Language Acquisition 9(2): 141–154. Kruse and colleagues provide a detailed discussion of the way word associations have been used in second language research, and raise the question of whether they can be used as a measure of L2 proficiency. They report a study of 15 Dutch learners of English, in which the subjects produced multiple responses to 10 English stimulus words. Their responses were weighted for stereotypy. The results showed that the word association scores correlated only weakly with scores on a 40 item cloze test. The authors conclude that the word association test is not a good test of L2 proficiency.
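The weak correlation that Kruse and colleagues report between weighted association scores and cloze scores is an ordinary Pearson correlation over per-learner totals. A minimal sketch with invented scores (their stereotypy weighting scheme itself is not reproduced here):

```python
def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists,
    e.g. stereotypy-weighted association scores and cloze scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical scores for five learners (illustrative only)
wa_scores = [12, 18, 9, 22, 15]      # weighted association scores
cloze_scores = [20, 25, 18, 30, 24]  # out of a 40 item cloze test
r = pearson_r(wa_scores, cloze_scores)
```

A value of r near zero would support the authors' conclusion that the association test is a poor proxy for proficiency; a value near 1 would undermine it.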
Kudo, Y. & Thagard, D. 1999. Word associations in L2 vocabulary. University of Hawai’i Working Papers in ESL 17(2): 75–105. No abstract. Lambert, W.E. ** 1956. Developmental aspects of second language acquisition. Journal of Social Psychology 43: 83–104. A series of experiments investigating the linguistic behaviour of three groups of subjects: undergraduates and graduates in French, and native French speakers. Using continuous associations to French and English words, it was found that (1) stimulus words in French elicited more words in response as proficiency in French increased; (2) given a choice of languages to respond in, subjects respond in proportion to the relative strength of their languages; (3) greater proficiency in a language produces response patterns that are closer to those of native speakers; and (4) more proficient subjects respond with rarer words. Stereotypy and form class of responses and pronunciation were also investigated, but failed to produce significant differences that were related to proficiency. Lambert suggests that the results should be seen on two dimensions – a factor representing vocabulary knowledge and a factor corresponding to cultural awareness. Lambert, W.E. 1969. Psychological studies of the interdependencies of the bilingual’s two languages. In Substance and Structure of Language: Lectures Delivered before the Linguistic Institute of the Linguistic Society of America, J. Puhvel (Ed.). Berkeley CA: University of California Press. An informal account of a series of experiments on the linguistic performance of bilinguals. (For details see other entries under Lambert.) These experiments include Lambert’s work on word associations, semantic rating scales, aphasia in bilinguals and semantic satiation. Two experiments – immediate recall of word lists and a Stroop test – are reported in detail. Lambert, W.E. & Moore, N. 1966.
Word association responses: comparisons of American and French monolinguals with Canadian monolinguals and bilinguals. Journal of Personality and Social Psychology 60: 376–83. Lambert and Moore compare the word associations produced by a group of bilingual French Canadians and two groups of monolingual Canadians with the previously published norms for American and French subjects, using a translation of the Kent-Rosanoff List. Results showed that (1) in general, English responses were more stereotyped than the French ones; (2) there was a high degree of overlap in the responses of English Canadians and Americans (78 per cent), but that the other possible pairings showed a very low degree of overlap; and (3) American and French monolingual groups were highly distinct in their responses, and that English Canadians and French Canadians were also relatively dissimilar. The bilingual group
makes different responses in each of its two languages, but these responses are quite similar to those of the appropriate monolingual control group. Lambert, W.E. & Rawlings, C. 1969. Bilingual processing of mixed language associative networks. Journal of Verbal Learning and Verbal Behavior 8: 604–609. Twenty English/French bilinguals were placed into compound and co-ordinate groups and given a key concepts test with stimuli in English, in French, or in both languages mixed. The results showed that compound bilinguals score better than coordinates on the mixed language lists, but this difference is reduced if the items from each language are blocked in sublists. Lauerbach, H. 1979. Das Wortassoziationsexperiment als Forschungsinstrument der Fremdsprachendidaktik. (Word associations as a research tool in foreign language teaching). Die Neueren Sprachen 78: 379–91. An account of a word association test using German learners of English. The responses produced by these learners are compared with published norms for English and German, and major differences noted. Lauerbach argues that the word association test is particularly good at identifying words in the learner’s interlanguage that are liable to fossilisation. Machalias, R. Semantic networks in vocabulary teaching and their application in the foreign language classroom. BABEL 26(3): 19–24. Machalias provides a brief discussion of word associations in L2 learners, and goes on to describe a dozen ways of exploiting semantic networks for vocabulary learning in classrooms. Massad, C., Yamamoto, K. & Davis, O. 1970. Stimulus modes and language media: A study of bilinguals. Psychology in the Schools 7: 38–42. Eleven English/Spanish bilinguals produced single word association responses to line-drawings and words in English and Spanish. The results were classified as ‘sense-impression responses’ or not, and the number of such responses evoked was observed. 
More sense-impression responses were evoked by words than by pictures, and more were recorded when the response language was Spanish than when it was English. These differences were not significant, however. Meara, P.M. Schizophrenic symptoms in foreign language learners. UEA Papers in Linguistics 7: 22–49. Meara draws attention to some similarities between the abnormal language behaviour of schizophrenics and the behaviour of language learners when they perform in their second language. These similarities include peculiar word associations, low type-token ratios, speech that is unpredictable and a lack of sensitivity to syntactic structure.
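One of the measures mentioned in Meara's comparison, the type-token ratio, is straightforward to compute: distinct word forms divided by total words, with low values indicating repetitive output. A minimal sketch with an invented learner sample:

```python
def type_token_ratio(tokens):
    """Type-token ratio of a word list: number of distinct word
    forms (types) divided by total number of words (tokens).
    Returns 0.0 for an empty sample."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

# Hypothetical L2 speech sample (illustrative only)
l2_sample = "the man go to the shop and the man buy the bread".split()
ttr = type_token_ratio(l2_sample)
```

The sample has 12 tokens but only 8 types, giving a ratio of 0.67; note that raw TTR falls as samples get longer, so comparisons should use samples of similar length.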
Meara, P.M. 1978. Learners’ word associations in French. The Interlanguage Studies Bulletin 3(2): 192–211. This paper discusses the associations made by a group of 75 English learners of French to the 100 words of the Kent-Rosanoff list. The responses show very low stereotypy and a large number of clang associations. Most of the responses that appear to be native-like can be accounted for in terms of translations of English primary responses. Meara, P.M. 1983. Word associations in a second language. Nottingham Linguistics Circular 11: 28–38. This article reviews a series of experimental studies of the word associations made by L2 speakers. The general findings in this area are reviewed, and some unpublished studies on response stability summarised – learners’ responses are generally less stable than those produced by L1 speakers. Meara points to a number of methodological problems in the ways word associations are usually studied, notably the use of the standard Kent-Rosanoff list. Meara, P.M. Simulating word associations in an L2: The effects of structural complexity. Language Forum 33(2): 13–31. Meara reports a series of simulations in which he examines the probability of finding at least one associated pair in sets of five randomly selected words. He concludes that the critical factor in these simulations is the total number of connections in the model. He suggests that it might be possible to estimate this figure using some simple associative tasks, and that the figure might work as a surrogate measure for L2 lexical organisation. Meara, P.M. Simulating word associations in an L2: Approaches to lexical organisation. International Journal of English Studies 7(2): 1–20. Meara describes a set of simulations which explore the way different features of lexical organisation affect the probability of finding a pair of associated words in a set of five randomly selected words.
The simulation is equivalent to giving subjects a set of five words and asking if they can identify a pair of associated words among them. The paper speculates that it might be possible to extrapolate from a simple test of this sort and derive some interesting claims about the number of links connecting words in L2 speakers’ lexicons. Meara, P.M. & Fitzpatrick, T. 2000. Lex30: An improved method of assessing productive vocabulary in an L2. System 28(1): 19–30. Meara and Fitzpatrick describe an easy-to-administer test of productive vocabulary. The test requires subjects to produce a set of word association responses to a small set of stimulus words. The stimulus words are chosen so that, with native speaker respondents, they typically generate a wide range of responses and a high
proportion of low frequency responses. They argue that a test of this sort might be able to sample non-native speakers’ productive vocabulary more effectively than some other test formats that are in current use. Meara, P.M. & Wolter, B. ** 2004. V_Links: Beyond vocabulary depth. Angles on the English Speaking World 4: 85–97. Meara and Wolter argue that the distinction that many people make between vocabulary breadth and vocabulary depth is not a productive one for vocabulary research. Depth of vocabulary can only be assessed by more and more detailed tests, and the logistics of testing implies that this work can only be done with fewer and fewer words. The paper argues that a much more productive way of looking at vocabularies would use features which are properties of the vocabulary as a whole, rather than properties of the individual words that a vocabulary is comprised of. Two such characteristics seem particularly promising: vocabulary size and vocabulary organisation. Some ways of investigating the organisation dimension are discussed in detail. Mochizuki, M. 1997. A direction vocabulary tests should follow. Reitaku Review 3: 105–119. [In Japanese.] Mochizuki reviews the main types of vocabulary test in current use, with specific reference to the measurement of vocabulary depth. He argues that the Word Association Test is a promising format, and can profitably be developed to take advantage of the positive features of the other test types. The WAT could be made more effective by the introduction of non-words as a form of control. [TK] Namei, S. 2004. Bilingual lexical development: A Persian-Swedish word association study. International Journal of Applied Linguistics 14(3): 363–388. See next entry. Namei, S. 2004. The Bilingual Lexicon from a Developmental Perspective: A Word Association Study of Persian-Swedish Bilinguals. Ph.D. dissertation, Stockholm University.
This book reports a detailed investigation of the word associations of L1 Persian speakers learning Swedish. Namei used a 100 word stimulus list, eliciting a single response for each stimulus word, and compares the results obtained from the target group with results from smaller groups of monolingual Swedish and Persian subjects. Namei shows that “phonological” responses occur in the data of both L2 learners and native speakers, and that these responses seem to arise as a function of how well subjects know the target words. Bilinguals appear to be more idiosyncratic in their responses in both languages than L1 speakers are: Namei argues that these untypical responses may be an indicator of growth in the lexicon, both in terms of breadth
and depth. Namei also claims that her data supports the idea of a syntagmatic-paradigmatic shift, but she argues that this shift is also related to how well subjects know the individual words being tested. Opoku, J. Bilingual representational systems in free recall. Psychological Reports 57: 847–855. No abstract. Orita, M. 1999. Word association patterns of Japanese novice EFL Learners: A preliminary study. Annual Review of English Learning and Teaching 4: 79–94. Orita asked 44 L1 Japanese learners of English to produce word associations to 8 English words: a, an, bike, kitchen, order, yellow, happy, look. An analysis of the responses suggests that Nouns tended to produce paradigmatic responses, but that overall syntagmatic responses were more frequent than paradigmatic ones. There were many phonological parallels between the stimulus and the responses, and a number of responses appeared to be encyclopaedic, rather than semantic. Orita, M. 2002. Word associations of Japanese EFL learners and native speakers: Shifts in response type distribution and the associative development of individual words. Annual Review of English Language Education in Japan 13: 111–120. Orita asked 295 L1 Japanese learners of English to associate to a set of 60 high frequency words. These subjects fell into four groups differing in proficiency. A group of native speakers was also studied. The responses were classified into syntagmatic responses, paradigmatic responses, phonological and other responses. Orita claims that the most advanced group showed a shift towards the pattern of responses produced by native speakers. However, this pattern was not found reliably with all the words. Orita, M. 2002. Proficiency, lexical development and the mental lexicon: investigating the response type distribution of word associations of Japanese EFL learners and native speakers. Research Reports Yatsushiro National College of Technology 24: 113–124.
Orita gave a 60 item word association test to 152 L1 Japanese learners of English. Results supported the view that more advanced subjects tend towards making predominantly paradigmatic responses. However, a detailed analysis by word showed that not all items contributed to this shift in response type. 33 words showed a syntagmatic to paradigmatic shift, but the others did not. Orita, M. 2002. Word associations of Japanese EFL Learners and native speakers: shifts in response type distribution and the associative development of individual words. Annual Review of English Language Education in Japan 13: 111–120.
Orita gave a 60 item word association test to four groups of L1 Japanese learners of English. The most advanced groups showed a shift towards paradigmatic responses, and generally responded in a more native-like fashion. However, only one third of the stimulus words showed this pattern of shift across the groups, and some words generated paradigmatic associations even in the least proficient group. Palmberg, R. 1990. Improving foreign language learners’ vocabulary skills. RELC Journal 21(1): 1–10. Palmberg briefly reviews current thinking on the development of vocabulary in an L2, and outlines a number of activities based on word associations which could be used to enhance lexical control. Piper, T.H. & Leicester, P.F. 1980. Word association behavior as an indicator of English language proficiency. ERIC Document ED 227 651. Piper and Leicester asked L1 Japanese speakers and L1 English speakers to complete a word association task in English. They claim to find significant differences between beginning ESL learners and native speakers in the way they respond to the stimuli. Intermediate level learners are different from the native speakers in the way they respond to Verbs and Adjectives, but not to Noun stimuli. They conclude that word association behaviour might be used as an indicator of general language proficiency. Politzer, R.B. ** 1978. Paradigmatic and syntagmatic associations of first year French students. In Papers on Linguistics and Child Language: Ruth Hirsch Weir Memorial Volume, V. Honsa & M. J. Hardman-de-Bautista (Eds), 203–210. Berlin: Mouton. An account of the word associations of 203 first year college students to twenty French and twenty English words. Politzer shows that the ratio of paradigmatic to syntagmatic responses is considerably higher in English than in French. This finding is related to other aspects of language proficiency, in particular, a high number of paradigmatic responses is associated with a high level of grammatical skills.
Politzer argues that drills and pattern practice may be conducive to a high level of syntagmatic responses. Pajoohesh, P. 2007. A probe into lexical depth: What is the direction of transfer for L1 literacy and L2 development? Heritage Language Learning and TESOL: Special Issue of Heritage Language Journal. This study reports on the piloting of a version of the Word Association Test especially designed for school-age children. The participants, 60 students in Grades 5 and 7 from two Toronto schools, came from very diverse language backgrounds with
a wide range of schooling experience in their home countries (L1 literacy) and Canada. Their performance is investigated with regard to the contributing factors: language proficiency (based on teacher’s ranking), first language background, and L1 literacy. Ramsey, R.A. 1981. A technique for interlingual comparison: The LEXIGRAM. TESOL Quarterly 15(1): 15–24. Ramsey provides a brief discussion of the size of the lexicon of a native speaker, and goes on to discuss how restricted word associations can be used to assess the structure of the lexicon. He illustrates the use of lexigrams – graphic representations of the strength of restricted associations made to a particular word – with a detailed analysis of responses made to the word ABORTION by native English speakers, and to the corresponding word in Spanish and Catalan by speakers of those languages. Randall, M. 1980. Word association behaviour in learners of English as a second language. Polyglot 2(2): B4–D1. 26 EFL students produced multiple associations to 50 words from the Kent-Rosanoff list, and repeated this task after nine weeks of an intensive English course. Changes in association patterns were observed. Results showed that the stereotypy scores of the group as a whole increased and there was some evidence to suggest that the learners’ responses are more native-speaker like at the end of the nine-week period than at the beginning. These changes may also be related to increasing proficiency. Read, J. 1993. The development of a new measure of L2 vocabulary knowledge. Language Testing 10(3): 355–371. Read describes a vocabulary test in which subjects are presented with a set of items based on word associations. Each item contains one target word, and eight other words of which four are related by association to the target word. Subjects’ task is to identify these four words. Read provides an IRT analysis of the test data, and reports data from subjective accounts from the test-takers.
Both suggest that the test format is a reliable and efficient way of measuring vocabulary depth. Read, J. 1995. Refining the word associates format as a measure of depth of vocabulary knowledge. New Zealand Studies in Applied Linguistics 1: 1–17. An earlier version of the paper summarised in the following entry. Read, J. Validating a test to measure depth of vocabulary knowledge. In Validation in Language Assessment, A. Kunnan (Ed.). Mahwah NJ: Lawrence Erlbaum. Read reports two experiments using his word associates test. In Experiment 1, 84 mixed L1 speakers completed this test (WA) and a definition matching test (MT) similar to the Levels Test. The correlation between WA and MT was 0.82, and
four misfitting persons were identified. In Experiment 2, 38 subjects took these two tests and an additional interview (INT). Correlations between the three test scores were: WA~INT: 0.76; MT~INT: 0.92; WA~MT: 0.85. These results are discussed in terms of Messick’s theory of test validity. Riegel, K.F. ** 1968. Some theoretical considerations of bilingual development. Psychological Bulletin 70(6): 647–670. In the first part of this paper, Riegel develops a mathematical model of the development of two languages in individual speakers. The model is a word-based one which assumes that relations between words develop at a rate that is affected by the number of words available to the network and the rate at which new words are acquired. The effects of introducing a second language at different times are studied. In the second part, he distinguishes five levels of bilingual development. Levels I and II are basic, in that the FL words in the total network are minimal; levels III and IV show the FL words beginning to form relationships among themselves, but allow only relationships of equivalence between items in the FL and the NL; at level V the words of both the FL and the NL form proper interconnections. Riegel suggests that few bilinguals actually get beyond stage IV. He further argues that some of these claims can be tested empirically using word association tests. Part three discusses the above arguments in the light of the data found in the Michigan restricted association norms. Riegel, K.F., Ramsey, R. & Riegel, R. ** A comparison of the first and second languages of American and Spanish students. Journal of Verbal Learning and Verbal Behavior 6: 536–544. 24 native English speakers and 24 native Spanish speakers performed seven restricted association tasks in both languages, using 35 words from the Kent-Rosanoff list.
The main findings show that fewer responses were made in the subjects’ L2 and that for both groups, response variability was greater in Spanish than in English. Spanish subjects used the same words more often in different tasks, and the authors interpret this as showing a lack of conceptual clarity, which they ascribe to lack of exposure to formal language training. Riegel, K.F. & Zivian, I.W.M. ** 1972. A study of inter- and intralingual associations in English and German. Language Learning 22(1): 51–63. 24 English-speaking learners of German were required to produce free word associations and eight types of restricted associations to 40 nouns. Stimuli and responses were either in English or in German. Results showed that German responses were more varied than English responses despite the fact that German was the second language of the subjects. Response variability is lower for interlingual conditions than for intralingual conditions. Intralingual responses are more varied
under restricted associations, but this does not hold for interlingual conditions. Intralingual responses were primarily paradigmatic while a higher proportion of interlingual responses were syntagmatic. Ruke-Dravina, V. 1971. Word associations in monolingual and multilingual individuals. Linguistics 74: 66–85. L1 Swedish, L1 Latvian and bilingual subjects carried out a continuous association task to four words (the Swedes in Swedish, and the other two groups in Latvian). The responses of each group are compared and contrasted, and a number of striking differences noted. The paper argues that many of these differences are determined by the structure of the two languages. There was some evidence that the bilingual group produced chains of responses that switched languages randomly. Whenever the monolingual groups switched languages, e.g., to German or French, this usually involved a translation of the preceding stimulus. Sanford, K. & Svetics, I. 1994. Word association types and language proficiency. PALM 8(2): 63–76. Sanford and Svetics asked four small groups of mixed L1 learners of English to make single associations to a list of stimulus words. This list was based on the Kent-Rosanoff list, with a few words being replaced by words likely to have culturally loaded responses. Responses for 32 of the words which generated a high level of commonality in a small group of Native Speaker subjects were selected for analysis. Responses to these words were categorised as syntagmatic responses, sound responses, semantic responses or paradigmatic responses, with a residual Don’t Know category. The authors report that there is a fairly strong correlation between group proficiency and the number of syntagmatic responses, and a negative correlation between group proficiency and sound responses. High level subjects had a higher number of semantic responses than both the lower proficiency groups and the native speakers. Don’t Knows decline with proficiency.
The authors note that these linear effects have not been recorded before, and they suggest that they might be used as the basis of an assessment of proficiency. Schmitt, N. 1998. Quantifying word association responses: What is native-like? System 26: 389–401. Schmitt describes a methodology for evaluating the word association responses of non-native speakers. He first assembled a set of 100 normative responses to each of 17 stimulus words, and ranked the responses in order of frequency. Non-native-speaker responses are scored according to a complex weighting system based on these norms. Schmitt claims that the system allows him to identify four levels of association response, with the highest two levels being classified as “native-like”.
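Schmitt describes his weighting system as complex; a deliberately simplified rank-based stand-in might look like the sketch below. The norms list and the scoring rule are invented for illustration and are not Schmitt's actual scheme.

```python
def score_response(response, norm_ranking):
    """Score one learner response against a frequency-ranked norms
    list: responses matching high-frequency norm responses score
    more points, and responses absent from the norms score 0."""
    if response not in norm_ranking:
        return 0
    rank = norm_ranking.index(response)  # 0 = most frequent norm response
    return len(norm_ranking) - rank

# Hypothetical norms for the stimulus DARK, ranked by frequency
dark_norms = ["light", "night", "black", "room", "horse"]
```

Summing such scores over all stimuli gives one number per learner, which could then be cut into bands like Schmitt's four levels of native-likeness.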
Schmitt, N. 1999. The relationship between TOEFL vocabulary items and meaning association, collocation and word class knowledge. Language Testing 16(2): 189–216. Schmitt asked 30 mixed L1 EFL learners to complete a small TOEFL vocabulary test and a series of further tests designed to measure how well they knew the target words. One of these additional tasks was a word association task, in which subjects have to generate three associations for each of the six target words. Schmitt shows that TOEFL items which generated correct responses also generated higher word association scores, but overall the level of native-like associations was low. Schmitt concludes that answering a TOEFL item correctly does not necessarily imply that subjects will associate to these words in a native-like way. TOEFL vocabulary items do not seem to be a very good indicator of associative word knowledge. Schmitt, N. & Meara, P.M. 1997. Researching vocabulary through a word knowledge framework: Word associations and verbal suffixes. Studies in Second Language Acquisition 19(1): 17–36. Schmitt and Meara asked 95 L1 Japanese subjects to complete a set of tasks with 20 English stimulus words. The tasks included receptive and productive word association tasks. All the tasks were repeated after 9 months. The results suggested that subjects’ overall vocabulary size as measured by the Levels Test increased by about 330 words. Subjects were able to generate only a small number of associations for the 20 target words, and improvement over time for both the productive and the receptive association tasks was minimal. Schoonen, R. & Verhallen, M. 1998. Kennis van woorden: De toetsing van diepe woordkennis. (Word knowledge: Testing deep word knowledge). Pedagogische Studiën 75: 153–168. Schoonen and Verhallen asked 745 Dutch children to complete a variant of Read’s Word Associates Test.
Each item contained a target word and six other words, of which three were associates of the target, and the test consisted of 30 items of this sort. Results showed that L1 Dutch and L1 Frisian subjects scored higher than subjects of Surinamese or Antillean backgrounds, and both groups outperformed L1 Turkish or L1 Arabic subjects. A detailed analysis of the test items showed a high level of test reliability. The paper argues that the test might provide a workable way to operationalise the construct of deep word knowledge. Sheng, L., McGregor, K.K. & Marian, V. 2006. Lexical-semantic organization in bilingual children: Evidence from a repeated word association task. Journal of Speech, Language, and Hearing Research 49: 572–587.
Sheng and colleagues asked 12 Mandarin-English bilinguals and 12 L1 English children to generate three association responses to a set of 36 stimulus words. The responses of the bilinguals were similar in both languages, but there was a suggestion that the bilinguals might produce more paradigmatic responses to verb stimuli than the monolingual children did. Swartz, M.L., Kostyla, S.J., Hanfling, S. & Holland, V.M. 1990. Preliminary assessment of a foreign language learning environment. Computer Assisted Language Learning 1: 51–64. This paper describes LEXNET-INSITU, a hypertext-based program designed to teach vocabulary using context and associative networks. An informal evaluation of the system is reported, and a number of suggestions for improving it are discussed. Söderman, T. 1988. Word Associations of Foreign Language Learners and Native Speakers – a shift in response type and its relevance for a theory of lexical development. MA Thesis, Åbo Akademi. Söderman collected word associations from intermediate and advanced Swedish-speaking learners of English, and analysed them in terms of the proportion of syntagmatic and paradigmatic associations they produced. She claims to find evidence in more advanced learners of a shift towards paradigmatic associates. This shift is reminiscent of a similar shift found in young children. Söderman, T. 1989. Word associations of foreign language learners and native speakers – a shift in response type and its relevance for a theory of lexical development. Scandinavian Working Papers on Bilingualism 8: 114–121. An informal report of work described more fully in Söderman 1988. Söderman, T. 1993. Word associations of foreign language learners and native speakers – different response types and their relevance to lexical development. In Problems, Process and Product in Language Learning, B. Hammarberg (Ed.). Stockholm: University of Stockholm. Two studies investigating the word associations of advanced learners of English.
Study 1 suggests that there is a tendency for lower level learners to produce fewer paradigmatic responses than more advanced learners. Study 2, however, suggests that this effect is really a property of how well subjects know the stimulus words. Words which are less well known tend to produce a lot of unusual responses, even in advanced L2 speakers. Söderman, T. 1993. Word associations of foreign language learners and native speakers: The phenomenon of a shift in response type and its relevance for lexical development. In Near-native proficiency in English, H. Ringbom (Ed.). Åbo: Åbo Akademi. See previous entry.
Chapter 8. Word associations in a second language
Sökmen, A. 1993. Word association results: a window to the lexicon of ESL students. JALT Journal 15(2): 135–150. Sökmen reports on the associations made by 198 L1 Japanese learners of English to 50 items from the Kent-Rosanoff list. The common responses are listed, and compared with the Minnesota Norms. The effects of gender, proficiency level, L1 and age are reported. Significant differences are associated with all four variables. Sökmen reports that about 50% of her responses are affective associations, rather than the more usual type of response. Taylor, I. 1971. How are words from two languages organized in bilinguals’ memory? Canadian Journal of Psychology 25: 228–240. An account of French/English bilinguals’ performance on a word association task using continuous association to eighteen English and French words. Experimental conditions were varied by allowing students to respond in only one language, to switch response language as they pleased, or by requiring them to switch language frequently as instructed. Rapid obligatory switching produced fewest associations, free switching produced performance as good as monolingual responding. In the free switching condition, the probability of changing language was relatively low. Taylor, I. 1976. Similarity between French and English words – a factor to be considered in bilingual language behaviour. Journal of Psycholinguistic Research 5(1): 85–94. A study of the word association responses made by bilinguals to words in English and French, using the continuous association method. The findings show that when an English word and its French equivalent are physically similar (e.g., animal/ animal, or comfortable/confortable) there is a tendency for similar responses to be produced in both languages. When the words are dissimilar (e.g., church/église, or sad/triste) the responses do not show the same degree of overlap. van Ginkel, C.I. & van der Linden, E.H. 1996. 
Word associations in foreign language learning and foreign language loss. In Approaches to Second Language Acquisition, K. Sajavaara & C. Fairweather (Eds), 26–33. Jyväskylä: University of Jyväskylä. The authors asked small groups of L1 French and L1 Dutch speakers to produce continuous associations to a set of five French stimulus words: maison, soleil, stylo, malade and boulangerie. The results were compared in terms of number of responses, response variability and response overlap. The authors find that L1 Dutch “forgetters” – subjects who had not used French much in the previous two years – did not differ significantly from more active learners. No effects of attrition were identified in the data.
van Hell, J.G. & de Groot, A.M.B. 1998. Conceptual representation in bilingual memory: Effects of concreteness and cognate status in word association. Bilingualism: Language and Cognition 1(3): 193–211. van Hell and de Groot asked 80 L1 Dutch learners of English to generate word associations to a set of Dutch and English stimulus words. Responses were made either in Dutch or English at T1, and in the other language at T2, one month later. A third administration repeated the first one. Results suggest that concrete words and cognates were more likely to generate equivalent responses, and this effect was most notable in Noun stimuli. Also measured was how many of the responses generated in the T1 test were repeated at T3. Identical responses were generated about 45% of the time. The paper discusses these results in terms of a distributed model of lexical representations. White, C. 1988. The role of associational patterns and semantic networks in vocabulary development. English Teaching Forum 26(4): 9–11. White briefly reviews some research on word association patterns in an L2, and argues that teaching methods that exploit word association structure may be an efficient way of teaching vocabulary. Some class exercises which use word associations are described. Wikberg, K. 1980. Lexical semantics and its application to second and foreign language vocabulary learning and teaching. In AFinLA Year Book, 1980 Papers in Language Learning and Language Acquisition, K. Sajavaara, A. Rasanen & T. Hirvonen (Eds), 119–130. Jyväskylä: AFinLA. A brief discussion of some recent work on vocabulary learning in an L2. Topics touched on include: (a) Palmer’s algorithm for presenting vocabulary; (b) a short critique of the choice of vocabulary in the Council of Europe threshold level materials; (c) lexical structure and its relation to vocabulary learning; and (d) word association studies. Wilks, C. & Meara, P.M. 2002. 
Untangling word webs: Graph theory and the notion of density in second language word association networks. Second Language Research 18(4): 303–324. Wilks and Meara discuss some early attempts to measure the density of connections in L2 lexicons. They argue from simulation data that L2 lexical networks may actually be much denser than people usually take them to be, and that a more complex interpretation of the notion of lexical density is needed. Wilks, C., Meara, P.M. & Wolter, B. 2005. A further note on simulating word association behaviour in a second language. Second Language Research 21(4): 359–372.
Wilks and colleagues describe a series of simulation studies in which they model the probability of subjects finding at least one association among a set of five randomly selected stimulus words. The paper queries some of the assumptions made in Wilks and Meara’s earlier paper, and reinterprets the data reported in that paper. The simulations suggest that modelling complex processes such as word association behaviour is not at all straightforward, and that the results can vary dramatically with different implementations of the basic procedures. Wolter, B. 2001. Comparing the L1 and L2 mental lexicon. A depth of individual word knowledge model. Studies in Second Language Acquisition 23: 41–69. Wolter gave a small group of L1 Japanese learners of English a word association test consisting of 45 words covering a range of frequency values. Their responses were classified as paradigmatic, syntagmatic, clang or other. Subjects’ knowledge of the stimulus words was also tested using Wesche and Paribhakt’s Vocabulary Knowledge Scale. Wolter compares the pattern of responding with data generated by a small group of L1 English speakers responding to the same 45 word stimulus list, and a second list of 45 words that L1 speakers are not likely to know well. He suggests that the response patterns can be largely accounted for in terms of subjects’ depth of knowledge of individual words, and suggests that the traditional distinction between syntagmatic and paradigmatic responding needs to be reassessed. Wolter, B. 2002. Assessing proficiency through word associations: is there still hope? System 30: 315–329. Wolter reviews previous attempts to use word association tests as a measure of overall proficiency in an L2, and reports a study in which 30 learners of English generated multiple responses to a set of 20 stimulus words. Scores on this test were correlated with results from a C-test. 
The correlations for both weighted and unweighted scoring methods are significant but modest (about 0.44 and 0.46 respectively). Wolter argues that further research in this area needs to screen the stimulus words more effectively, develop a principled way of scoring the responses, and be more precise about what measures of L2 proficiency are used. Wolter, B. 2006. Lexical network structures and L2 vocabulary acquisition: the role of L1 lexical/conceptual knowledge. Applied Linguistics 27: 741–747. Wolter argues that L2 learners’ lexicons are not very different from their L1 lexicons as far as paradigmatic associations are concerned. However, languages differ significantly in their syntagmatic associations, and Wolter argues that this aspect of L2 lexical acquisition may involve lexical restructuring in L2 learners. This position is at odds with the positions developed in earlier work which generally suggests that
learners’ paradigmatic word associations are the principal index of development in an L2. Yokokawa, H., Yabuuchi, S., Kadota, S., Nakanishi, Y. & Noro, T. 2002. Lexical networks in L2 mental lexicon: Evidence from a word association task for Japanese EFL learners. Language Education and Technology 39: 21–39. 407 L1 Japanese subjects were asked to generate word association responses to a set of English words. The associations that they generated were broadly similar in L1 and L2, with a tendency for English stimuli to elicit a higher proportion of syntagmatic responses. These results are discussed in terms of the “word association” model and the “concept mediation” model of word recognition in L2 lexicons. Zareva, A. 2005. Models of lexical knowledge assessment of second language learners of English at higher levels of language proficiency. System 33(4): 547–562. Zareva asked small groups of English learners to self-rate their knowledge of a set of English words, take a test of vocabulary size and generate associations to a set of target words. A set of multiple regression analyses suggests that a two factor model covering vocabulary size and self-assessment of vocabulary knowledge provides the best set of predictors for these groups. The results are discussed in terms of Henriksen’s three-dimensional model of vocabulary acquisition. Zareva, A. Structure of the second language mental lexicon: How does it compare to native speakers’ lexical organisation? Second Language Research 23(2): 123–152. Zareva asked a total of 87 subjects to complete a VKS-type test for 73 English words, and to supply up to three associations for each of these words. The responses of the two groups of L2 speakers (n=29+29) were compared with those of the 29 L1 English speakers in terms of quantitative factors – total associations, response communality, number of different responses – and qualitative features – mainly response type. 
For the former, the NS group and the advanced learners performed in a similar way, both significantly better than the weaker L2 group. There were no significant differences between the groups on the qualitative measures. Zareva, A., Schwanenflugel, P. & Nikolova, Y. 2005. Relationship between lexical competence and language proficiency: Variable sensitivity. Studies in Second Language Acquisition 27(4): 567–596. The authors asked a group of L1 English speakers and two groups of EFL learners to self-rate their familiarity with a set of 73 words, and to generate word associations to the words that they claimed to know. They then compared the self-ratings to a series of other measures: vocabulary size, word frequency, native-likeness of the word associations, within group consistency of the associations and number of associations.
Results suggest that some of these features might distinguish between different levels of proficiency – native-like commonality of responses seems to be a good candidate for this, but the complexity of the scoring system makes it difficult to interpret the data. Zareva and colleagues suggest that metacognitive awareness is not proficiency dependent. Zimmerman, R. 1986. Semantics and lexical error analysis. Englisch, Amerikanische Studien 2: 294–505. Zimmerman argues that traditional linguistic semantics does not provide a proper basis for the analysis of lexical errors. He suggests that psycholinguistic approaches, notably word associations, network analyses, tip of the tongue phenomena, and frame analysis might be more useful, and illustrates how these approaches might be used with examples from English errors produced by German speakers.
section 5
Software applications
This section contains manuals for three software applications which I hope will be of use to people doing research in the field of L2 word associations. Chapter 9 contains a manual for the Lex30 program described in Section 2. Chapter 10 contains a manual for V_Six, a development of the V_Links programs that are described in Section 3. Chapter 11 contains a manual for WA_Sorter, a small utility program that sorts and counts word association data, and presents it in a standard format. All three programs are web-based applications that do not require you to download or install anything on your own computer. They should run on any computer that is attached to the Internet, and they should run equally well on Windows, Apple and Linux machines. The latest versions of all three programs can be found at http://www.lognostics.co.uk/tools/ The programs may look slightly different from the versions described in this book, as we update them regularly in response to feedback from users. All changes are fully documented on the website.
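The manual for WA_Sorter appears in Chapter 11 and is not reproduced here, but the basic task it performs – sorting and counting word association responses and presenting them in a standard format – can be sketched in a few lines. The function name and output layout below are illustrative assumptions, not a description of WA_Sorter's actual implementation.

```python
from collections import Counter

def sort_and_count(responses):
    """Tally a list of raw association responses and return
    (response, count) pairs, most frequent first."""
    cleaned = (r.strip().lower() for r in responses)
    counts = Counter(r for r in cleaned if r)  # drop blank responses
    return counts.most_common()

# Responses collected for a single stimulus word, e.g. "dog"
data = ["cat", "Cat", "bone", "bark", "cat", "lead"]
for word, n in sort_and_count(data):
    print(f"{word:10} {n}")
```

Normalising case before counting (as above) is one design decision a real sorter has to make explicit; treating "Cat" and "cat" as distinct responses would give different communality figures.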
chapter 9
Lex30 v3.00 The manual
Introduction
Lex30 is an exploratory test designed to estimate the productive vocabulary of L2 English speakers. The test is based on work carried out by Tess Fitzpatrick. A full account of this work can be found in Fitzpatrick (2003). Assessing productive vocabulary is a very difficult task. Most researchers approach this task by getting students to produce short texts, either spoken or written, and then estimating their productive vocabulary by extrapolating from these texts. The problems with this approach are huge. Long texts might provide enough data for us to begin estimating the vocabulary size of their authors, but it is difficult to elicit long texts from learners, especially when they are low-level learners. Furthermore, the texts that are elicited in most L2 writing tasks are predominantly made up of highly frequent words, which tell us little about the extent of the testees’ vocabulary; and in any case, the vocabulary elicited is highly constrained by the topic of the text task. These three features conspire to make the most commonly used methods of estimating productive vocabulary less informative than we would like them to be. The approach we have adopted in the Lex30 test is to elicit ‘texts’ from learners which are lexically dense. We do this by using a word association task. The Lex30 test presents the test-takers with a set of 30 stimulus words, and asks them to produce associations to these stimuli – up to four responses for each word. The stimulus words are specially selected so that they elicit unusual, infrequent words in native speakers. Once we have collected these responses, the program combines them into a 120-word ‘text’, and then scores this text by looking for infrequent response words. 
Our assumption is that beginning learners will not generally produce low frequency responses in this task, and that the presence of low frequency words in a test-taker’s response set indicates that they have an extended productive vocabulary. Clearly, the Lex30 test is not a sophisticated measure of productive vocabulary. It makes fairly minimal assumptions about the way L2 learners use words. Its main advantages are practical, rather than theoretical. Unlike standard essay tasks,
where test-takers vary enormously in the quantity they write and the quality of what they produce, Lex30 is more or less standardised. The Lex30 task has a high level of face validity, and the task itself is easy to explain. You do not need to have an extensive vocabulary to complete the test, and because the stimulus words are all high frequency, common words in English, Lex30 can be used with test-takers who vary widely in the level of their English proficiency. The test is easy to administer – it typically takes 15 minutes or so – and it is scored automatically, again, a significant advantage over the more traditional essay task. More importantly, perhaps, it generates positive reactions from the test-takers – a stark contrast with the reactions we get when we ask test-takers to produce other types of written work. In spite of this simple approach, Lex30 does seem to work. It correlates moderately with our standard test of receptive vocabulary, X_Lex (Meara & Milton 2003), and it seems to discriminate between groups at different levels of proficiency. It also appears to be stable over relatively short test intervals – despite the fact that test-takers rarely produce identical responses in successive tests, they do tend to generate similar scores over short time intervals. All these features suggest that Lex30 may be a useful way of sampling productive vocabulary, at least in low stakes situations where high levels of accuracy are not called for. Users should be aware that Lex30 v3.00 is not a definitive test. The current version is part of our on-going research into vocabulary measures, and this particular test is very exploratory and experimental. However, since we released our first reports, (Meara & Fitzpatrick 2000) Lex30 has generated a lot of interest in the research community, and we know that earlier versions of the test format are being used in a number of large scale research projects. 
We are therefore releasing this version of Lex30, in the hope that the new standard format will be of use to researchers who can test it out in situations that are very different from the ones we have access to. The exploratory nature of this version means that data generated by the test should be treated with appropriate caution.
Changes from earlier versions
Lex30 v3.00 includes a number of changes from previous versions of Lex30. The main change is that this version of the test is designed as a web-based test, whereas previous versions ran on stand-alone computers. We have introduced this change as a response to Microsoft’s continual upgrading of its Windows operating system. Lex30 v3.00 should be usable on any computer connected to the Internet, including Macs and Linux machines, and we hope that this will introduce a level of stability which was missing from the earlier versions. As part of this shift, we have now automated the scoring system, so that the program automatically scores the
data that is presented to it. Scoring no longer requires a separate program, and this considerably streamlines the way Lex30 works. Two other, rather more important changes have also been introduced. The most important of these changes is the way that Lex30 handles non-responses. Previous versions of Lex30 allowed test-takers to leave blanks in the response set, and ignored these items during scoring. In v3.00, when the test-taker leaves a blank in the response sheet, the program now adds a null-response to the data in order to make up the missing data. In this way, the program requires test-takers to produce four responses for each stimulus. Null responses count as non-scoring words. The effect of this change is that the test-takers’ response texts are all standardised for length at 120 responses. The second change introduced in v3.00 is that we have replaced Nation’s word lists (Nation 1986) with the JACET 8000 list (Ishikawa, Uemura, Kaneda, Shimizu, Sugimori, Tono, Mochizuki & Murata 2003). The primary reason for doing this was that most of our evaluations of Lex30 used L1 Japanese learners of English, for whom the JACET list seemed more appropriate. However, there are a number of differences between the JACET lists and Nation’s list, and we think that the JACET list works better for Lex30. There is a case for using other word lists with Lex30, and we are currently looking at the implications of using a strict, non-lemmatised word list like the lists based on the British National Corpus. The main advantage of this is that it would greatly increase the level of objectivity in the scoring, and deal with some anomalies which arise when other word lists are used. For the moment, however, we intend to work with the JACET list, but revisions of the Lex30 dictionary will be implemented from time to time, and these will be reflected in the program’s version numbers. A list of the Lex30 words currently used by the program will be found at the end of this manual.
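The padding and scoring behaviour described above – four response slots per stimulus, null-responses filling any blanks, one point per response outside the first 1,000 words – can be sketched as follows. This is a minimal illustration, not the program's actual code: the function name is mine, and HIGH_FREQ is a toy stand-in for the first 1,000 words of the JACET list, which is of course far larger.

```python
# A minimal sketch of the v3.00 scoring logic. HIGH_FREQ is a toy
# stand-in for the first 1,000 words of the JACET 8000 list.
HIGH_FREQ = {"friend", "house", "water", "man", "day", "attack"}

def lex30_score(responses_by_stimulus):
    """Pad each stimulus's responses to four slots with null
    responses, then award one point per response that falls
    outside the high-frequency list. Nulls score nothing."""
    text = []
    for responses in responses_by_stimulus:
        text.extend((responses + [""] * 4)[:4])  # pad to exactly 4 slots
    # With 30 stimuli this standardises every 'text' at 120 responses
    assert len(text) == 4 * len(responses_by_stimulus)
    return sum(1 for w in text if w and w.lower() not in HIGH_FREQ)

# Two stimuli; only "exile", "defeat" and "army" fall outside
# the toy high-frequency list, so the score is 3.
print(lex30_score([["exile", "defeat", "friend"], ["army"]]))  # → 3
```

The sketch makes the point of the null-response change visible: blanks are padded in rather than skipped, so every test-taker is scored over the same number of response slots.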
Installing Lex30 v3.00
Lex30 v3.00 does not need to be installed on your computer, as it runs over the internet. Make sure that your computer is online, then open your browser, and point it to: http://www.lognostics.co.uk/tools/lex30.htm. This will take you straight to the Lex30 front page, which looks like Figure 1.
Using Lex30 v3.00
The Lex30 opening screen asks you to enter the testee’s name and their email address. Both these fields can be left blank.
Figure 1. Lex30 The opening screen.
The start screen gives a single example of the sort of item which is tested in Lex30. Full instructions for testees, in a range of different languages, can be downloaded by clicking on the icon in the top left hand corner of the screen. Once you are ready to begin the test, click on the CONTINUE button in the bottom left hand corner of the screen. This takes you to the Lex30 main page, which looks like Figure 2. The main page of Lex30 v3.00 consists of a single web page containing 30 stimulus words. Each stimulus word is followed by four text boxes. Test-takers should write in each of the four text boxes a word which the stimulus word makes them think of. Multi-word responses are acceptable. It is also acceptable to provide the same response to more than one stimulus word. Boxes can be left blank if the stimulus words only evoke one or two response words. Use the TAB button, or the mouse, to move from one response box to the next. It takes about 15 minutes to complete the test. Unlike earlier versions of the Lex30 test, Version 3.00 is not timed, but you should encourage test-takers to work as quickly as possible, and data produced by test-takers who take a very long time on the test should be treated as unreliable.
Figure 2. The Lex30 main page.
When the form has been completed, click on the CONTINUE button in the bottom left hand corner of the screen. When you do this, Lex30 will score your responses automatically. Lex30 comes with a small in-built dictionary of about 3000 words, which will normally allow the program to recognise most of the responses generated by L2 speakers. Occasionally, however, you will find that a test-taker makes a response that the Lex30 dictionary does not recognise. These words are all reported in the next screen, the confirmation page, which can be seen in Figure 3. In this Figure, Napoleon Bonaparte has completed the Lex30 test. However, the program has failed to recognise 10 of the responses he made. These are listed on the screen. Each of these responses needs to be evaluated by a teacher. Lex30 assumes by default that all the words it does not recognise are scoring words, but this judgement can be overridden if necessary. In Figure 3, for instance, Josephine, Paris and Corsica are proper names, and should be designated as non-scoring words. Exile and defeat are both low frequency words, and they count towards the score. Armie and freind are misspellings of real words, but friend is a high frequency word, and so does not score. Army is a low-frequency word, and does score. 1812 is a numeral, and does not score. The tricky item here is ouverture. It might be a mis-spelling of overture (as in the 1812 overture), and if we were sure
Figure 3. Lex30 Confirmation Page.
of this, then we might want to award a point to this word. On the other hand, it might simply be an example of an L1 word being used as a response. In this case, it would not be a scoring word. The process of overriding the default scoring generated by Lex30 is discussed in more detail in the next section. For all non-scoring words, click on the NO button. The other words can be left as they are. When the confirmation process is complete, click the CONTINUE button at the bottom left corner of the screen. This moves LEX30 on to its report screen, which is shown in Figure 4. This screen simply reports the number of scoring items recorded by the program. In this case, Napoleon Bonaparte scored a total of 10 points. This data can be printed out by using your browser’s print function.
Interpreting the Lex30 scores
Lex30 scores can in theory vary from 0 to 120. In practice, scores tend to be considerably lower than the maximum score. A good native speaker score is about 60 points. Note that Lex30 does not attempt to provide an accurate measure of the total productive vocabulary that the test-takers have at their disposal. It produces a
Figure 4. Lex30 Report screen.
score which we think might be related to this total, but it should be treated with appropriate caution. The scores are probably reliable enough to allow for comparisons between groups.
Exiting Lex30
To exit Lex30, close your browser. Alternatively, you can set up another test by clicking on the link in the bottom left hand corner of the screen.
Notes for researchers
1. Overriding the Lex30 scoring
Although the scoring system used by Lex30 is largely automated, you will find that some test-takers produce a small number of responses which are not recognised by the program. ‘A small number’ here means a dozen or fewer. The dictionary used by Lex30 has been extended to include the most common responses that test-takers make to the Lex30 stimuli, and it is unusual for test-takers to produce responses that we have not anticipated. If you find that your students consistently produce responses that are not in the Lex30 dictionary, then contact us.
Overriding the Lex30 scoring is something that needs to be done with caution. Generally speaking, proper names and their derivatives should not count as scoring responses. Numerals should also be excluded from the list of scoring words. The main problem occurs with responses that are mis-spelled, and in this case, the tester will need to make a decision about whether to allow these responses to count or not. Obviously, allowing mis-spellings to count will raise the test-takers’ scores, while a stricter approach will lower them. For most purposes, it probably does not matter which approach you choose, as long as you are consistent. More problematic are responses which are genuine words that Lex30 does not recognise. In these cases, the tester has to make a judgement call as to whether the response is a scoring word or not. The principle to use here is that ordinary, frequent words do not score. Lex30 awards one point to every response word which does not appear in the first 1,000 words of English, as defined by the JACET list. A copy of this list is included at the end of this manual, and you will find it useful to familiarise yourself with the words it contains. Broadly speaking, words which are not found in this list should be treated as scoring items, and for the sake of consistency, it is probably not advisable to make adjustments to the list except when there is a very pressing case for doing so. You will also need to decide how you are going to handle multi-word responses. Lex30 will generally not recognise these responses, which fortunately tend not to appear very often. The best approach is to simplify these items and score only the least frequent of the words they contain. For example, if a test-taker responds to the stimulus word attack with death or glory, glory would count as a scoring word in its own right.
2. Data collection
Version 3.00 of Lex30 does not keep a record of the data it collects, and unlike v2, it does not update its dictionary the more it is used. This avoids some security issues with web-based tests, and it prevents people from sabotaging the program. If you need to record the data collected by your test-takers, then the simplest method is probably to print out the completed test page before submitting it for evaluation. You can do this by using your browser’s print function. You can also use your browser’s print function to print out the list of words that Lex30 has failed to recognise.
3. Paper and Pencil versions of Lex30
You can produce a paper and pencil version of Lex30 by printing out the Lex30 main page. The data generated in this format will need to be processed manually. This process can be tedious if you are handling large amounts of data. In cases like this, it is worthwhile using a voice recognition programme such as Naturally Speaking to enter the data. This will considerably reduce the time you need to process a batch of data in pencil and paper format.
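The multi-word rule described in note 1 – simplify a multi-word response and score only its least frequent word – could be implemented as below. The frequency ranks are invented purely for illustration; a real implementation would look words up in the JACET list.

```python
# Hedged sketch of the multi-word rule from note 1: reduce a
# multi-word response to its least frequent word before scoring.
# The frequency ranks below are invented for illustration only.
FREQ_RANK = {"death": 800, "or": 10, "glory": 4500}

def reduce_multiword(response):
    """Return the least frequent word of a multi-word response
    (highest frequency rank); unknown words rank as rarest, and
    single-word responses pass through unchanged."""
    words = response.split()
    return max(words, key=lambda w: FREQ_RANK.get(w.lower(), float("inf")))

print(reduce_multiword("death or glory"))  # → glory
```

With these toy ranks the function reproduces the manual's own example: the response death or glory is reduced to glory, which is then scored in its own right.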
4. Parallel versions of Lex30
Parallel versions of Lex30 are in preparation.
5. Using Lex30 with your own stimulus lists
It is possible to use Lex30 with stimulus word lists other than the one provided with the program. In order to do this, you will need some knowledge of HTML programming, and you should not attempt this unless you know what you are doing. Use your browser to access the Lex30 source code. The way you do this will depend on the browser that you use. In Internet Explorer, the source code looks something like this: LEX30 V3.00b
... Save this code to your own computer. Next, find the section of the code labelled #stimulus word list, which should look like this: #stimulus word list ... The words at the end of each line – attack, board, close, and so on – are the stimulus words which are displayed by the Lex30 program. If you change these words, then the program will display a different set of stimuli. Once you have made any changes you require, then save the file again. Finally, point your browser to the new file, and the Lex30 start page will display. You should then be able to access the Lex30 scoring program in the usual way.
This facility should be used very sparingly, and you should make sure that you understand the principles which underlie the selection of Lex30 stimulus words before you attempt it. Lex30 results obtained with an altered stimulus word list will not necessarily have the same properties as scores from the official program. Please contact the Lex30 team if you are planning to work with alternative word lists.
6. Feedback and Suggestions
If you need to refer to Lex30 in your own bibliographies, use these data: PM Meara and T Fitzpatrick (2008). Lex30 v3.00. Swansea: lognostics. The Lex30 team welcomes suggestions, comments and feedback on this program. You can email comments to Paul Meara at [email protected]. We always welcome your ideas for improvements. Please let us know if you use Lex30 in your research work – it helps us keep track of how the program is being used, and lets us contact you when we bring out an update. We need to know if the program is responding in the way we think it should, and the only way we can do this is if you tell us.
Waiver
Lex30 is made available freely to bona fide researchers, in exchange for evaluations and comments. Lex30 is distributed as part of an ongoing research and development program at the University of Wales Swansea. The authors accept no liabilities of any kind arising from the use of this program. The development team is not responsible for any claims arising out of your use, or misuse, of the program. By using this program, you are agreeing to these terms.
The JACET LISTS 1K: these words are recognised by Lex30, but do not score a ability able about about above above accept access according account achieve across act act action activity actually add addition admit advantage advice affair affect after after again against age ago agree agreement air all all allow almost along already also although always among amount an analysis and animal announce another answer answer any anyone anything anyway appear application apply approach appropriate area argue argument arm army around around arrive art as as as ask aspect association assume at attempt attention attitude authority available avoid award aware away baby back back bad bank base basic basis be bear because become bed before before begin behaviour behind believe benefit better between beyond big bill bit black blood board body book both both box boy break bring brother build building business but buy by call campaign can capital car care carry case catch cause cell central centre century certain certainly chance change change chapter character charge child choice choose church circumstance city claim claim class clear clearly client close close club colour come commission committee common community company compare competition computer concern concerned condition conference consider contain continue contract control control cos cost cost could council country couple course court cover create culture cup current customer cut dark data date daughter day dead deal deal death decide decision defence degree demand department depend describe design design despite detail determine develop development die difference different difficult difficulty direction director discover discuss discussion disease division do doctor dog door doubt down draw drive drop due during duty each early early easy eat economic economy education effect effort either election element else encourage end end energy enjoy enough ensure enter environment especially establish even evening event ever 
every everyone everything evidence exactly example exist expect experience explain express eye face face fact factor fail fall family far far father feature feel feeling few few field fight figure fill film final finally financial find fine finish fire firm floor fly follow following food foot for force force foreign forget form form former forward free friend from front full function fund further future game garden general general generally get girl give glass go goal good good government great ground group grow growth
had hair half hand happen happy hard hard has have he head health hear heart heavy help help her her here herself high him himself his history hit hold home home hope horse hospital hotel hour house how however human husband I idea identify if image important improve in in include including income increase increase indeed indicate individual individual industrial industry information instance instead institution intend interest international into introduce investment involve issue it item its itself job join just keep kill kind king know knowledge labour land language large last late later law lead leader learn least leave left leg legal less less let letter level library lie life light like like likely line list listen little little live local long long look look lose loss lot love love low machine main maintain major majority make man manage management manager many market material matter may may maybe me mean means measure meet meeting member memory mention method might mile military mind minister minute miss model modern moment money month more more morning most most mother move movement much much music must my myself name national natural nature near nearly necessary need need never new news next next nice night no no no nor normal not note nothing now number obtain obviously occur of off off offer office officer often oil okay old on on once once one only only open open operate operation opportunity or order organisation original other other other our out outside over over own page paper parent part particular particularly party pass patient pattern pay people per percent performance perhaps period person personal pick picture piece place place plan plan plant play player please point point police policy political poor popular population position possible pound power practice prepare present present president press pressure prevent previous price prime principle private probably problem procedure process produce product production professional profit programme 
project property proposal prove provide provision public publish pull pupil purpose put quality question quickly quite raise range rate rather reach read ready real really reason receive recent recently record record red reduce refer reflect refuse region relate relation relationship remain remember remove replace report report represent require research resource respect response responsibility rest result return reveal right right right rise risk road role room round round royal rule run
sale same save say scheme school science sea season seat secretary section sector security see seek seem sell send sense series serious serve service set set several shall share share she shop short should show show side sign significant similar simple simply since since single sit site situation size skill small smile so social society some someone something sometimes son soon sorry sort sound source space speak special specific spend staff stage stand standard start state state statement station stay step still stone stop story street strong structure student study style subject success successful such suddenly suffer suggest summer support support suppose sure surface system table take talk task tax teach teacher team technique technology tell tend term test than thank that that the their them themselves then theory there there therefore these they thing think this those though though thought through throughout throw thus time title to to to today together too top top total towards town trade train training treat treatment tree true try turn type under understand union unit university unless until up up upon us use use used user usually value variety various version very view village visit voice wait walk wall want war watch water way we wear week well well what whatever when when where where whether which while white who whole whole whom whose why wide wife will win window wish with within without woman wonder word work work worker world would write wrong yeah year yes yesterday yet you young your yourself 2K: these words are recognised by Lex30 and score 1 point absence absolutely academic accident accompany account achievement acid acquire active actual additional address address administration adopt adult advance advise afford afraid afternoon afterwards agency agent ahead aid aim aim aircraft alone alone along alright alternative alternative amongst ancient annual anybody apart apparent apparently appeal appeal appearance appoint appointment approach 
approve arise arrange arrangement article artist assembly assess assessment asset associate assumption atmosphere attach attack attack attempt attend attract attractive audience author average award aye background bag balance ball band bar base battle beat beautiful bedroom before beginning belief belong below below beneath beside best bind bird birth block bloody blow blue boat bone border bottle bottom brain branch breath bridge brief bright broad budget burn bus busy cabinet call candidate capable capacity card care career careful carefully cash cat category cause chain chair chairman challenge channel characteristic charge cheap
check chemical chief circle citizen civil clean clear climb close closely clothes coal code coffee cold colleague collect collection college combination combine comment comment commercial commit commitment commode communication comparison complete complete completely complex component concentrate concentration concept concern conclude conclusion conduct confidence confirm conflict congress connection consequence conservative considerable consideration consist constant construction consumer contact contact content context contrast contribute contribution convention conversation copy corner corporate correct count county cover creation credit crime criminal crisis criterion critical criticism cross crowd cry cultural currently curriculum cut damage danger dangerous date debate debt decade declare deep deep defendant define definition deliver demand democratic demonstrate deny deputy derive description desire desk destroy detailed device dinner direct direct directly disappear discipline display display distance distinction distribution district divide document domestic double down drawing dream dress dress drink drink drive driver drug dry ear earn earth easily east edge editor educational effective effectively egg elderly elsewhere emerge emphasis employ employee employer employment empty enable enemy engine engineering enough enterprise entire entirely entitle entry environmental equal equally equipment error escape essential establishment estate estimate eventually everybody examination examine excellent except exchange executive exercise exercise exhibition existence existing expectation expenditure expense expensive experience experiment expert explanation explore expression extend extent external extra extremely facility factory failure fair fairly faith fall familiar famous farm farmer fashion fast favour fear fear fee female file finance finding finger first fish fit fix flat flight flow flower focus football for forest formal foundation freedom frequently 
fresh front fruit fuel fully funny future gain gas gate gather generate generation gentleman god gold grant grant green grey growing guest guide gun half hall hand handle hang hardly hate head heat hell hence hide high highly hill his historical hole holiday hope hot household housing huge human hurt ignore illustrate imagine immediate immediately impact implication imply importance impose impossible impression improvement incident increased increasingly independent index influence influence inform initial initiative injury inside inside insist institute instruction instrument insurance intention interested interesting internal interpretation interview introduction investigate investigation invite iron island issue
joint journey judge judge jump justice key key kid kitchen knee lack lady largely last late latter laugh launch lawyer lay lead leadership leading leaf league lean legislation length liability liberal lift light limit limit limited link link lip literature little living loan location lodging lord lovely lunch magazine mainly male male manner map mark mark market marriage married marry mass master match match matter meal meaning meanwhile measure mechanism media medical membership mental merely message metal middle milk mind mine ministry mistake module motion motor mountain mouth move murder museum name narrow nation necessarily neck negotiation neighbour neither network nevertheless newspaper nobody nod noise none no-one normally north northern nose note notice notice notion nuclear nurse object objective observation observe obvious occasion odd offence offer official official onto opinion opposition option order ordinary organise organization origin otherwise ought ourselves outcome output outside overall own owner package pain paint painting pair panel park parliament partly partner passage past past past path pay payment peace pension perfect perform permanent persuade phase phone photograph physical planning plastic plate play pleasure plenty plus pocket politics pomegranate pool positive possibility possibly post potential potential powerful practical prefer presence present press pretty previously primary priority prison prisoner program progress promise promote proper properly proportion propos prospect protect protection provided pub public publication push quarter question quick quiet race radio railway rain rapidly rare reaction reader reading realise reality realize reasonable recall recognise recognition recognize recommend recover reduction reference reform regard regional regular regulation reject relative relatively release release relevant relief religion religious rely remind repeat reply representation representative request requirement respond 
responsible rest restaurant result retain return revenue review revolution rich rid ring ring rise river rock roll roof route row run rural safe safety sample satisfy scale scene scientific scientis score screen search search second secondary secure select selection senior sentence separate separate sequence seriously servant session settle settlement severe sex sexual shake shape sheet ship shoe shoot shot shoulder shout shut sight sign signal significance silence
sing sir sister skin sky sleep slightly slip slow slowly smile soft software soil soldier solicitor solution somebody somewhat somewhere song sound south southern speaker species speech speed spirit sport spot spread spring standard star start status steal step stick stock store straight strange strategy strength strike strike strongly studio study stuff substantial succeed sufficient suggestion suitable sum sun supply supply surely surprise surround survey survive switch talk tall tape target tea teaching tear technical telephone television temperature terms terrible test text thanks theatre theme thin threat threaten through ticket tiny tomorrow tone tonight tool tooth total totally touch touch tour track tradition traditional traffic train transfer transfer transport travel treaty trend trial trip troop trouble trust truth turn twice typical unable under understanding undertake unemployment unfortunately united unlikely until upper urban used useful usual variation vary vast vehicle very via victim victory video violence vision visit visitor vital volume vote vote wage walk warm warn wash wave weak weapon weather weekend weight welcome welfare west western whereas while whilst widely wild will wind wine wing winner winter withdraw wonderful wood working works worry worth writer writing yard youth Instruction for test takers: English: In this test, you will see a list of 30 English words. Each word will make you think of several other words in English. Write these words in the boxes alongside each word. EXAMPLE
ANIMAL
elephant
tiger
farm
wild
Chapter 10
V_Six v1.00 The manual

Introduction

V_Six is an exploratory test designed to estimate how interconnected the high frequency words in an L2 lexicon are. The test is based on work originally carried out by Brent Wolter and Clarissa Wilks, described in more detail in Section Three of this book, and in Wilks, Meara & Wolter (2005). It will be obvious from the discussion in Section Three that finding practical ways of assessing the inter-connectedness of the words in a person's mental lexicon is not at all straightforward. And even when you have a format which achieves a reasonable degree of face-validity, there are considerable problems in interpreting the data which the test delivers. V_Six is no exception to this. However, our view is that the current state of research into L2 lexical networks and depth of vocabulary knowledge is at something of an impasse, and that the research methods currently in use do not really allow us to investigate the way networks grow and develop. In these circumstances there is a case for developing new experimental tools, in the hope that they might contribute to breaking the logjam. We hope that this version of our V_Links test will be more accessible – and less daunting – than the original version, and that this new format will allow researchers to test some of the ideas that underpin it in situations that are very different from the ones we have access to. The exploratory nature of V_Six means that data generated by the test should be treated with appropriate caution, however.
Changes from earlier versions of V_Links

V_Six v1.00 looks very different from the V_Links programs described in Chapter 6. The main reason for this is that V_Six is designed as a web-based test, whereas previous versions ran on stand-alone computers. V_Six runs over the internet, and unlike V_Links, it does not need to be downloaded to your computer and installed as a free-standing program. We introduced this change in response to Microsoft's continual upgrading of its Windows operating system.
V_Six should be usable on any computer connected to the Internet, including Apple and Linux machines, and we hope that this will introduce a level of stability which was missing from some of our earlier work. However, the move to web-based delivery turned out not to be straightforward. Web-based programs are a lot less flexible than stand-alone programs, and for this reason the V_Six interface is programmed rather differently from the original V_Links programs described in Chapter 6. Whereas V_Links presented users with sets of 10 words, and asked them to find any connections among the words, V_Six uses smaller stimulus sets, and is more specific about the links it seeks to elicit. These compromises are not entirely satisfactory. The visual appearance of the test makes it slightly less intuitive for users than V_Links was, and we have had to dispense with some of the interesting variations that V_Links was able to explore. V_Six does not allow testers to set a time limit on the test, for example. To a large extent, these changes were dictated by the limitations of the HTML language that underlies web pages. I could have avoided these limitations by using additional plug-ins, but I decided in the end that simplicity and transportability should be given more weight than flashy design. A further change is that whereas V_Links allowed testees to indicate any connections among a set of randomly selected words, V_Six uses a much more constrained set of stimulus items, and each of the stimulus words is repeated a number of times. The rationale for this change is explained in the notes for researchers section below.
Installing and using V_Six

Installing V_Six v1.00

V_Six v1.00 does not need to be installed on your computer, as it runs over the internet. Make sure that your computer is online, then open your browser, and point it to: http://www.lognostics.co.uk/tools/V_Six/ Note that this address is case sensitive, and the last element is V_Six, with an underscore. This will take you straight to the V_Six front page, which looks like Figure 1 below.

Using V_Six v1.00

The V_Six opening screen asks you to enter the testee's name and their email address. Both these fields can be left blank. The start screen gives a single example of the sort of item which is tested in V_Six. Full instructions for testees, in a range of different languages, can be downloaded by clicking on the icon in the top left hand corner of the screen.
Figure 1. V_Six: the opening screen.
Once you are ready to begin the test, click on the CONTINUE button in the bottom left hand corner of the screen. This takes you to the V_Six main page, which looks like Figure 2 below. The main page of V_Six v1.00 consists of a single web page containing 50 stimulus sets. Each stimulus set consists of two boxes, one coloured yellow, the other coloured blue. Each box contains three words. There is also a button labelled No Link alongside each set of boxes. For each item, test-takers are asked if they can find a word in the yellow box which has an association link with one of the words in the blue box. If they find a pair of words that are linked in this way, they signal this by clicking on the buttons alongside the two words. If they do not find a link, they should click on the button marked No Link. It takes 15 to 20 minutes for test-takers to complete the test. Unlike earlier versions of the V_Links test, V_Six v1.00 is not timed, but you should encourage test-takers to work as quickly as possible. The general rule is that if test-takers cannot find a pair of linked words immediately, they should not strain to find one at all costs. Native speakers of English will normally identify an obvious association for about two thirds of the items tested. Data produced by test-takers who take a very long time on the test should be treated as unreliable.
Figure 2. The V_Six main page.
When the form has been completed, click on the CONTINUE button in the bottom left hand corner of the screen. When you do this, V_Six will score your responses automatically. V_Six compares the responses each test-taker makes with a collection of responses made by a group of native speakers of English. The program reports two scores for each test-taker, as shown in Figure 3. The upper score is the number of items for which the test-taker reported finding a pair of connected words. The lower score is the total number of items found, minus those cases where native speakers do not normally report an association. In Figure 3, Napoleon Bonaparte scored a total of 15 points, but only 10 of those answers were recognised as valid by the program. This data can be printed out by using the browser's print function.

Interpreting the V_Six scores

V_Six scores can in theory vary from 0 to 50. In practice, scores tend to be considerably lower than the maximum score. A good native speaker score is about 30 points. For a fuller discussion of what these scores mean, see the next section.
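The relationship between the two scores can be stated more concretely. The sketch below is a minimal illustration of the scoring rule as described above, not the actual V_Six code; the data structures (a `responses` dictionary mapping item numbers to a reported word pair, or `None` for a No Link click, and an `approved` dictionary of native-speaker pairs per item) are assumptions made for the example.

```python
def score_v_six(responses, approved):
    """Return (total, valid): the number of items where a pair was
    reported, and the subset of those whose reported pair appears in
    the approved native-speaker list."""
    found = {item: pair for item, pair in responses.items()
             if pair is not None}
    total = len(found)
    valid = sum(1 for item, pair in found.items()
                if pair in approved.get(item, set()))
    return total, valid

# Example: three items; the test-taker clicked No Link on item 2, and
# the pair reported for item 3 is not one native speakers produced.
responses = {1: ("start", "stop"), 2: None, 3: ("eye", "time")}
approved = {1: {("start", "stop"), ("start", "end")},
            3: {("eye", "watch")}}
total, valid = score_v_six(responses, approved)   # total = 2, valid = 1
```

On this reading, a large gap between `total` and `valid` corresponds to the discrepancy the manual warns about in its discussion of the report screen.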
Figure 3. V_Six Report screen.
Exiting V_Six

To exit V_Six, close your browser. Alternatively, you can set up another test by clicking on the link in the bottom left hand corner of the screen.
Notes for researchers

The V_Six stimulus set

My original intention for this program had been to use purely randomised sets of stimulus words based on different frequency bands, as we had done in the original V_Links test (Meara & Wolter 2004). However, this approach turned out to be unsatisfactory. Specifically, I had enormous difficulty with randomly generated stimulus sets based on a 1K frequency list, which consistently failed to produce items containing convincingly associated pairs. To throw some light on this problem, I carried out a series of studies in which I randomly selected pairs of words from large sets of words based on the JACET lists (Ishikawa et al. 2003). For each pair of words, I made a judgement about whether there was a strong association between the words, using an intuitive, but fairly strict, criterion: given two words, A and B, was word A likely to appear in a list of associations made to word B, or vice versa? I did this for pairs of words taken from each of the thousand word
bands listed in the JACET lists. I also made up pairs of words where the two words came from different frequency bands. In this way, I was able to work out the probability that a pair of words would form an associated pair. This work is reported in more detail in Fitzpatrick & Meara (Forthcoming). Surprisingly, this probability turned out to be quite low – about 2% for the high frequency bands, declining to 1% with lower frequency words. This figure was a lot lower than I had expected, and it has some serious implications for the design of experiments which ask test-takers to identify associated words. Basically, if the probability of finding a pair of associated words when you select words at random is around 1%, then you would expect to find only one such pair in a list of a hundred stimulus pairs. This figure is far too low to generate a viable test: a test that was capable of distinguishing reliably between native speakers and learners would require hundreds of items, and this is logistically impractical. Similar logic ruled out other formats as well. For instance, we could have used a format where we presented a stimulus word together with a set of randomly selected words, and asked whether the test-taker could identify a link between the stimulus word and any of the other words, like this:

BACK open pick book paper wipe blue warm green boat bread

Clearly, this format is more likely to produce an associated pair than an item which consists of just two words, where the test-taker has to provide a YES or a NO answer. However, even with this more generous format, we still need unrealistically large numbers of words to produce items which regularly generate connected pairs. The probability of finding an associated pair in these item types depends on the number of words in the response set. If the probability of a random pair of words being associated is .02, and you have 10 words, as in the example above, then the probability of finding a match is .2, or 20%.
This means that a one hundred item test would still only generate 20 items containing an associated pair. The obvious solution to this problem is to allow test takers to mark any association they find in a random set of words. For example, given the set

open pick book paper wipe blue warm green boat bread

there are 45 possible pairs of words: open~pick, open~book, open~paper … boat~bread. And if there is a 2% chance of finding an associated pair when we select words at random, then we would expect a pair of connected words to appear relatively often in stimuli of this type. Basically, we would expect an item in this format to be almost certain to contain at least one associated pair.
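The arithmetic behind these two formats can be made explicit. The sketch below reproduces the estimates in the text, assuming (as the text does) that each randomly chosen pair of words has an independent 2% chance of being an associated pair. Note that the text's figures are additive approximations (number of pairs × 0.02); the exact complement-rule values under strict independence come out somewhat lower.

```python
p = 0.02   # assumed chance that a random pair of words is associated

# Format 1: one stimulus word against ten candidate responses -> 10 pairs.
pairs_one_vs_ten = 10
additive_one_vs_ten = pairs_one_vs_ten * p           # the text's 0.2 (20%)
exact_one_vs_ten = 1 - (1 - p) ** pairs_one_vs_ten   # ~0.18 under independence

# Format 2: any pair within a set of ten words -> C(10, 2) = 45 pairs.
pairs_within_ten = 10 * 9 // 2                       # 45 possible pairs
expected_pairs = pairs_within_ten * p                # ~0.9 pairs expected per item
exact_within_ten = 1 - (1 - p) ** pairs_within_ten   # ~0.60 chance of at least one
```

The expected number of associated pairs per ten-word item is thus close to one, which is the sense in which such items reliably contain connected words.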
This is quite close to the sort of results we had been getting with our original V_Links program. Unfortunately, despite a number of different attempts, I was not able to find a satisfactory way of displaying a set of 10 words and allowing test takers to choose two of them within the constraints imposed by the HTML format. An alternative approach in which I provided test-takers with a list of ten words and asked them to indicate if they were unable to find a connected pair in the set was also tested. Unfortunately, this turned out to be a very counterintuitive task – even with some training, test-takers found it difficult to comply with the instructions. The format adopted in V_Six is therefore something of a compromise. Assuming that there is a 2% chance of finding a connection between two randomly selected words, then the probability of finding an associated pair of words in this format is 18%. (There is a 6% chance that word A in the first box is connected to one of the three words in the second box. Similarly there is a 6% chance that word B is connected to one of these words, and likewise Word C.) This is a considerable improvement on the odds of finding a connected pair when two words are chosen at random, but it is still not really good enough to give us a fully viable test format. In order to solve this problem, I adopted a further compromise, and experimented with a number of stimulus sets which contained rather more associations than you would get if you selected words at random. It proved difficult to develop a convincing set of criteria for these alternative stimulus sets. For example, a set of highly imageable words (taken from Cortese & Fuggett 2004) initially looked promising, but it was difficult to relate these words to the frequency lists that are used in preparing English language textbooks, and many of the individual items turned out to be words that were not normally known by the learners we were testing. 
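The 18% figure for the V_Six format quoted above follows the same additive logic over the nine possible cross-box pairs (three yellow words × three blue words). A small sketch, again assuming an independent 2% association chance per pair:

```python
p = 0.02                      # assumed per-pair association probability
yellow_words, blue_words = 3, 3
cross_pairs = yellow_words * blue_words   # 9 possible yellow~blue pairs
additive = cross_pairs * p                # the text's 0.18 (three lots of 6%)
exact = 1 - (1 - p) ** cross_pairs        # ~0.166 under strict independence
```

Either way of computing it, the figure falls short of a workable hit rate, which is why the stimulus set described next was deliberately enriched with associations.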
The V_Six v1.00 stimulus set consists of 100 words selected from the first 1K level in the JACET lists. The words are divided into two sets of 50 words, which are listed in Table 1. Each word in the first set of fifty words is connected to at least two of the words in the other set of fifty words, and vice versa. Thus, for example, start is linked to stop and end, while give is linked to order and take. However, these links are not the only ones to be found among the selected words, and in practice the stimulus set contains many more examples of associations. True, for example, seems to be associated with life, story, lie, love, friend, and fact. In Figure 4, I have plotted all the connections that I was able to identify linking words in set one with words in set two. The figure suggests that there are about 250 associative links in this set of words. A set of 100 words allows for (100*99)/2 = 4950 pairs of words, of which we would expect about 2% to be proper associations – i.e. we would expect about 100 of these random pairings to
Table 1. The V_Six v1.00 stimulus set

Set one: start begin sudden police order take kill drop eye hour look short leave support game lose way death white fall pay night side window open read letter heart work easy plan play part break king village shop buy spend keep new man girl boy teacher face true tell back line

Set two: stop end thought officer give time dead catch glass watch long stay behind team win find life black water rise day dark door close book write love heavy hard simple produce act piece rule country town sell money save change old woman friend school head fact story lie front finish
be associated. However, by putting the words into two distinct sets, and counting only the associations that run between the sets, we can significantly reduce the number of possible pairs. The number of possible links between the two 50-word sets is 50*50 = 2500. Given that we have identified 250 association links, this means that the probability of finding a link between two randomly selected words, one from each sublist, is around 0.1, or 10%. This stimulus set seems to be considerably more inter-connected than we would expect a completely randomised set of words to be. For the V_Six format, the probability of finding an associated pair in two sets of three words is about 90%.

Checking the responses

V_Six collects the responses generated by a test-taker, and checks them against a list of "approved" responses for each item. The "approved" responses are a compilation of the answers provided by a small group of L1 English speakers. Thirty-two of the stimulus items seem to contain at least one clearly associated word pair,
Figure 4. The association structure of the V_Six stimulus set.
while 18 of the items do not. These are items 8, 15, 16, 17, 18, 27, 28, 30, 33, 34, 35, 36, 37, 39, 40, 49, 50. The second of the scores reported by V_Six records how many of the responses reported by the test-taker are included in this "approved" list. Test-takers whose scores show a very large discrepancy between their total responses and their "approved" responses will need to be treated with caution.

Interpreting the V_Six data

V_Six produces scores which vary between 0 and 50, though in practice the upper limit is rather lower than this. Previous versions of the test were able to distinguish reliably between native speakers and L2 learners, but at the time of writing we have not tested this new format for reliability. I think it might be possible to use the raw scores on the V_Six test to make an estimate of the overall number of connections to be found within the 1K word list for a particular speaker (cf. Meara 2007). The mathematics of this is not straightforward, and it depends on a number of assumptions about the distribution of typical response patterns in tasks of this sort. The critical datum here is the number of items for which the test-takers report that they cannot find a pair of connected words. We can use this figure to estimate the number of times they would be able to report one, two, three, or more responses, and from this it is possible to make
a rough estimate of the overall pattern of linkages in this part of their vocabulary. This work is currently rather speculative, and so I will not report it in detail here. However, readers should be aware that the V_Six scores are considerably richer than they look at first sight.

Data collection

Version 1.00 of V_Six does not keep a public record of the data it collects, and it is not possible for users to change the stimulus sets or add responses to the "approved" response list. This avoids some security issues with web-based tests, and it prevents people from sabotaging the program. If you need to record the data collected by your test-takers, then the simplest method is probably to print out the completed test page before submitting it for evaluation. You can do this by using your browser's print function. You can also produce a paper and pencil version of V_Six by printing out the V_Six main page. The data generated in this format will need to be processed manually.

Parallel versions of V_Six

Parallel versions of V_Six are in preparation.
Feedback and suggestions

If you need to refer to V_Six in your own bibliographies, use these data: P.M. Meara & O. Abboud. 2008. V_Six v1.00. Swansea: lognostics. The V_Six team welcomes suggestions, comments and feedback on this program. You can email comments to Paul Meara at [email protected]. We always welcome your ideas for improvements. Please let us know if you use V_Six in your research work – it helps us keep track of how the program is being used, and lets us contact you when we bring out an update. We need to know if the program is responding in the way we think it should, and the only way we can do this is if you tell us.

Waiver

V_Six is made available freely to bona fide researchers, in exchange for evaluations and comments. V_Six is distributed as part of an ongoing research and development program at Swansea University. The authors accept no liabilities of any kind arising from the use of this program. The development team is not responsible for any claims arising out of your use, or misuse of the program. By using this program, you are agreeing to these terms.
Chapter 10. V_Six v1.00 The manual
The approved association list

This table lists the “approved” associations which are recognised by V_Six v1.00. This list underlies the network shown in Figure 4. It is likely that further research with this format will involve alterations and adaptations to this list, and to the list of stimuli. Any changes of this sort will be documented on the web-site. Comments on this list are welcome.

start begin sudden police order take kill drop eye hour look short leave support game lose way death white fall pay night side window open read letter heart work easy plan play
stop end day finish stop end end thought change thought officer stop officer give give time life time dead dead catch water friend give catch glass black head time glass watch day watch long behind find love hard simple old time long stay life day book change head story stay behind love country town money school behind team win friend stop end catch watch team win rule change time win find life money friend head give long behind find life water door change front dead catch watch life black door black water dead behind rise love heavy rise day book money time long day dark school team dark door front glass door close end stay water door close book thought book write write love stop dead win black love heavy hard change head stop team find life day heavy hard produce win life hard simple piece money find life day produce town stop time dead write hard act
part break king village shop buy spend keep new man girl boy teacher face true tell back line
time stay hard act piece time glass win water day piece rule country save head life country town school town sell close book book town sell time money save give time save change old life day love town money change old love old woman country woman friend school old old friend school school head book life book save friend fact front life love friend fact lie time story lie stop give stay water door head story lie front stop time long stay life water write hard rule head story front finish
Instructions for test-takers

English: In this test there are 50 questions. Each question contains 3 words in a yellow box, and 3 words in a blue box. If you can see a word in the yellow box that is connected to a word in the blue box, then click on the two words. If you can't find any connected words, click on No Link. Example:
In this example, you might find a link between elephant and mouse. When you have finished the test click the continue button.
chapter 11
WA_Sorter
The manual

Introduction

One of the main practical problems which arises in the context of word association research is the sheer quantity of data that you have to deal with. If you ask a hundred people to respond to each of the one hundred words in the Kent-Rosanoff list (Kent & Rosanoff 1910), for example, then you end up with 10,000 data points. This is an enormous quantity of data to handle manually, and the only practical way of dealing with data like this is to process it automatically. The obvious way of doing this is to use a computer to sort and process the data for you. It is relatively straightforward to use a spreadsheet to process word association data, but this way of doing things is error-prone and incredibly tedious. The program described in this chapter is a relatively simple web application that removes some of the drudgery of processing word association data.
WA_Sorter

WA_Sorter is a program that sorts and collates word association data into a standard report format.
Data preparation

You will need to do some editing work on your data before you can use WA_Sorter to process it. WA_Sorter works with sets of 10 stimulus words at a time. If your data set is larger than this, then you will need to break it up into smaller blocks. You should edit your data so that it looks like this:

PM09b-1,bear,elephant,frog,goat,hedgehog,kangaroo,lion,monkey,ostrich,zebra,
This first line of data consists of a set of 10 stimulus words, preceded by a label which describes the data being processed. Note that each item in this line is separated by a comma. We will refer to this line as the stimulus line. The rest of your data should also be edited so that it looks like this:

BonNa-1,river,grey,croak,butt,spikes,hop,roar,nuts,bury head,stripes,
DanGe-1,king,Africa,French,horn,prickle,Australia,Africa,coconut tree,feather,stripe,
RobMa-1,gruff,grey,pond,mountain,field,Australia,tiger,nuts,feather,stripe,

Here you have one line of data for each experimental subject, and each line contains the responses made to the stimulus words in the stimulus line. Each line contains 10 responses, preceded by a code which identifies the experimental subject and the stimulus set. In the example above, BonNa-1 is the code for Napoleon Bonaparte, stimulus set 1. Napoleon produced the following responses: BEAR~river, ELEPHANT~grey, FROG~croak, GOAT~butt, HEDGEHOG~spikes, KANGAROO~hop, LION~roar, MONKEY~nuts, OSTRICH~bury head, and ZEBRA~stripes. Note that each response is followed by a comma, including the last one, and there are no spaces between the response words. If Napoleon had failed to make a response for one of the words, KANGAROO, for instance, then his data line would read:

BonNa-1,river,grey,croak,butt,spikes,@,roar,nuts,bury head,stripes,

where @ is conventionally the symbol for a non-response. You can use another symbol if you wish, but you need to be consistent. You may also wish to develop standard symbols for illegible responses, and for other responses that require special treatment. You will also need to decide what to do with spelling mistakes, L1 responses to L2 words, grammatically marked words, multi-word responses and so on. Again, it doesn't much matter what you decide here, as long as you are consistent in your treatment of these items.
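A few lines of code make the format above concrete. The sketch below is a hypothetical illustration, not part of WA_Sorter itself: it splits one data line into a subject code and its ten responses, keeping @ as a placeholder for a non-response.

```python
# Hypothetical parser for the data line format described above.
# Every field ends with a comma, so a well-formed line yields one
# empty trailing field, which we discard.

def parse_line(line):
    """Split 'ID,resp1,...,resp10,' into (subject_id, responses)."""
    fields = line.rstrip("\n").split(",")
    if fields and fields[-1] == "":
        fields = fields[:-1]   # drop the empty field after the final comma
    return fields[0], fields[1:]

sid, responses = parse_line(
    "BonNa-1,river,grey,croak,butt,spikes,@,roar,nuts,bury head,stripes,")
print(sid)             # BonNa-1
print(len(responses))  # 10
print(responses[5])    # @   (Napoleon's non-response to KANGAROO)
```

Note that multi-word responses such as "bury head" survive intact, because only commas, never spaces, separate the fields.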
It might seem counter-intuitive to spend a lot of time re-editing your data in this way, but my own experience is that it is a complete waste of time to process large quantities of word association data by hand. Unless you are very focussed, you are very likely to make mistakes in the data processing, and it is almost impossible to rectify these mistakes after the event. Working closely with the data as I have suggested here is considerably less error prone, and has the additional advantage that it lets you get to know the data intimately. If you are dealing with very large amounts of data, then you might find it useful to invest in a program which will allow you to record your data by reading it aloud into a microphone, rather than typing it into a word processor directly. This
eliminates simple typing errors, and allows you to enter data three or four times quicker than you can type it. The Dragon NaturallySpeaking program can transcribe single words with about 99% accuracy.

Note that WA_Sorter does not carry out any checks to ensure that you have put your data into the right format, or that you have the right number of words in each line. If you miss a word out by accident, WA_Sorter doesn't know this, and will give you results that don't make sense. It is worth spending time making sure that you get the data right when you are preparing it.
Data processing

Once you have your data in this standard format, then you can use WA_Sorter to process it for you. Open the WA_Sorter program at the site below:

http://www.lognostics.co.uk/tools/WA_Sorter/

This address is case sensitive, and the last element is WA_Sorter with an underline. Make sure that your computer is connected to the internet, then click on the link. This should take you to the WA_Sorter home page, which looks something like Figure 1 on the next page. What you actually see will depend on which browser you are using – WA_Sorter looks slightly different when you run it using Firefox, or on a Linux machine.

The WA_Sorter interface is extremely simple. All you need to do is to copy your stimulus line into the box marked paste your stimulus words here. Next, paste your data into the large box labelled paste your responses here. When you have done this, click on the button labelled continue. WA_Sorter will sort out your data so that all the responses for each of the stimulus words are listed together in order of frequency. How long this takes will depend on the size of your data set, and the speed of your internet connection. WA_Sorter will typically process a set of 100*10 responses in about 5 seconds.

The output from WA_Sorter is shown in Figure 2 on the next page. You can print this data out by using your browser's print function, or you can save the data to a file by copying the data and pasting it into a word processor. Figure 2 lists each of the stimulus words, followed by the responses generated by our subject sample in order of frequency. Thus Kangaroo elicited two different responses: Australia twice, and hop once. Obviously, this illustration is a much smaller data set than you would normally get in a word association experiment. A group of 100 subjects would usually be expected to produce 20 or 30 different responses to a single stimulus word, rather than the three we have shown here.
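The collation step itself is simple to express. The sketch below is a hypothetical reimplementation of the report logic, not WA_Sorter's own code: it groups the responses in each column and lists them by descending frequency, as the report page does.

```python
# Hypothetical sketch of the frequency report that WA_Sorter produces.
from collections import Counter

stimuli = ["kangaroo", "lion"]        # a real stimulus set would have 10
responses = [                         # one row of responses per subject
    ["hop",       "roar"],
    ["Australia", "tiger"],
    ["Australia", "roar"],
]

def collate(stimuli, responses):
    """For each stimulus, count its column of responses and list
    (response, frequency) pairs with the most frequent first."""
    report = {}
    for col, stimulus in enumerate(stimuli):
        counts = Counter(row[col] for row in responses)
        report[stimulus] = counts.most_common()
    return report

print(collate(stimuli, responses)["kangaroo"])  # [('Australia', 2), ('hop', 1)]
```

This reproduces the kangaroo example from the text: Australia twice, hop once, ordered by frequency.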
Figure 3 shows data from a larger set of 20 experimental subjects.
Figure 1. The WA_Sorter Home Page.
Figure 2. The WA_Sorter Report Page.
Figure 3. Part of a larger data set.
Common errors

Make sure that each data line contains 10 words and an ID code.
Make sure that each data point is followed by a comma, including the last one.
Make sure that your data doesn't include any blank lines.
Make sure that your data doesn't include trailing blanks at the end of a line.
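Since WA_Sorter itself performs no validation, it can be worth running checks like these before pasting data in. The function below is a hypothetical pre-flight check mirroring the list above; it is not part of WA_Sorter.

```python
# Hypothetical validation of data lines before handing them to WA_Sorter,
# covering the four common errors listed above.

def check_lines(data_lines, n_responses=10):
    problems = []
    for lineno, line in enumerate(data_lines, start=1):
        if not line.strip():
            problems.append(f"line {lineno}: blank line")
            continue
        if line != line.rstrip():
            problems.append(f"line {lineno}: trailing blanks")
        if not line.rstrip().endswith(","):
            problems.append(f"line {lineno}: last field not followed by a comma")
        fields = line.strip().rstrip(",").split(",")
        if len(fields) != n_responses + 1:    # ID code plus 10 responses
            problems.append(
                f"line {lineno}: expected {n_responses + 1} fields, got {len(fields)}")
    return problems

good = "BonNa-1,river,grey,croak,butt,spikes,@,roar,nuts,bury head,stripes,"
print(check_lines([good]))                        # []
print(check_lines(["RobMa-1,gruff,grey,pond,"]))  # flags the short line
```

Running a check like this on every block before submission catches the silent failures described earlier, where a missing word shifts every later response onto the wrong stimulus.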
Feedback and suggestions

If you need to refer to WA_Sorter in your own bibliographies, use these data: P.M. Meara. 2008. WA_Sorter v1.00. Swansea: lognostics. We welcome suggestions, comments and feedback on this program, and requests for other utility programs that will facilitate word association work. You can email comments to Paul Meara at [email protected]. Please let us know if you use WA_Sorter in your research work – it helps us keep track of how the program is being used, and lets us contact you when we bring out
an update. We need to know if the program is responding in the way we think it should, and the only way we can do this is if you tell us.
Waiver

WA_Sorter is made available freely to bona fide researchers, in exchange for evaluations and comments. WA_Sorter is distributed as part of an ongoing research and development program at Swansea University. The authors accept no liabilities of any kind arising from the use of this program. The development team is not responsible for any claims arising out of your use, or misuse of the program. By using this program, you are agreeing to these terms.
References

af Trampe, P. 1983. Foreign language vocabulary learning – a criterion of learning achievement. In: Psycholinguistics and Foreign Language Learning, H. Ringbom (Ed.), 1–7. Åbo: Åbo Akademi.
Aitchison, J. 1987. Words in the Mind: An Introduction to the Mental Lexicon. Oxford: Blackwell.
Anderson, R.C. & Freebody, P. 1981. Vocabulary knowledge. In: Comprehension and Teaching: Research Reviews, J. Guthrie (Ed.), 77–117. Newark DE: International Reading Association.
Baba, K. 2003. Test review: Lex30. Language Testing Update 32: 68–71.
Bachman, L.F. 1990. Fundamental Considerations in Language Testing. Oxford: OUP.
Bauer, L. & Nation, I.S.P. 1993. Word families. International Journal of Lexicography 6: 253–279.
Beck, J. 1981. New vocabulary and the associations it provokes. Polyglot 3(3): C7–F14.
Blum-Kulka, S. 1981. Learning to use words: Acquiring semantic competence in a second language. In: Hebrew Teaching and Applied Linguistics, M. Nahir (Ed.). New York NY: University Press of America.
Bousfield, W.A. 1953. The occurrence of clustering in the recall of randomly arranged associates. Journal of General Psychology 49: 229–240.
Brown, R. & Berko, J. 1960. Word associations and the acquisition of grammar. Child Development 31: 1–14.
Canale, M. & Swain, M. 1980. Theoretical bases of communicative approaches to language teaching and testing. Applied Linguistics 1(1): 1–47.
Chomsky, N. 1959. Review of B.F. Skinner, Verbal Behavior (New York NY: Appleton-Century-Crofts, 1957). Language 35: 26–58.
Cortese, M.J. & Fugett, A. 2004. Imageability ratings for 3,000 monosyllabic words. Behavior Research Methods, Instruments, and Computers 36: 384–387.
Crothers, E. & Suppes, P. 1967. Experiments in Second-Language Learning. New York NY: Academic Press.
Davies, A., Criper, C. & Howatt, A. 1984. Interlanguage. Edinburgh: EUP.
Davis, B.J. & Wertheimer, M. 1967. Some determinants of associations to French and English words. Journal of Verbal Learning and Verbal Behavior 6: 574–581.
Deese, J. 1965. The Structure of Associations in Language and Thought. Baltimore MD: Johns Hopkins Press.
Eaton, S. 1961. An English-French-German-Spanish Frequency Dictionary. New York NY: Dover.
Engber, C.A. 1995. The relationship of lexical proficiency to the quality of ESL composition. Journal of Second Language Writing 4: 139–155.
Ervin, S. 1961. Changes with age in the verbal determinants of word association. American Journal of Psychology 74: 361–372.
Faerch, C., Haastrup, K. & Phillipson, R. 1984. Learner Language and Language Learning. Clevedon: Multilingual Matters.
Fellbaum, C. 1998. WordNet: An Electronic Lexical Database. Cambridge MA: The MIT Press.
Ferrer-i-Cancho, R. & Solé, R.V. 2001. The small world of human language. Proceedings of the Royal Society of London, Series B, Biological Sciences 268(1482): 2261–2265.
Fitzpatrick, T. 2000. Using word association techniques to measure productive vocabulary in a second language. Language Testing Update 27: 64–69.
Fitzpatrick, T. 2003. Eliciting and Measuring Productive Vocabulary Using Word Association Techniques and Frequency Bands. Ph.D. dissertation, University of Wales Swansea.
Fitzpatrick, T. & Meara, P.M. Forthcoming. L2 Vocabulary Acquisition: Learning from Case-studies. Clevedon: Multilingual Matters.
Gougenheim, G., Michéa, R., Rivenc, P. & Sauvageot, A. 1956. L'élaboration du français élémentaire. Paris: Didier.
Griswold, R.E., Poage, J.F. & Polonsky, I.P. 1971. The SNOBOL4 Programming Language, 2nd edn. Englewood Cliffs NJ: Prentice-Hall.
Haastrup, K. & Henriksen, B. 2000. Vocabulary acquisition: Acquiring depth of knowledge through network building. International Journal of Applied Linguistics 10(2): 61–80.
Harary, F. 1969. Graph Theory. Reading MA: Addison-Wesley.
Heatley, A. & Nation, I.S.P. 1998. VocabProfile and RANGE. Wellington: School of Linguistics and Applied Language Studies, Victoria University.
Hockett, C.F. 1958. A Course in Modern Linguistics. New York NY: Macmillan.
Huckin, T. & Coady, J. 1999. Incidental vocabulary acquisition in a second language: A review. Studies in Second Language Acquisition 21: 181–193.
Hughes, J. 1981. Stability in the Word Associations of Non-native Speakers. MA project, Birkbeck College, London.
Hulstijn, J. 1992. Retention of inferred and given word meanings: Experiments in incidental vocabulary learning. In: Vocabulary and Applied Linguistics, P. Arnaud & H. Béjoint (Eds), 113–125. London: Macmillan.
Hunt, A. & Beglar, D. 2005. A framework for developing EFL reading vocabulary. Reading in a Foreign Language 17(1): 23–59.
Ishikawa, S., Uemura, T., Kaneda, M., Shimizu, S., Sugimori, N., Tono, Y., Mochizuki, M. & Murata, M. 2003. JACET 8000: JACET List of 8000 Basic Words. Tokyo: JACET.
JACET Basic Words Revision Committee (Eds). 2003. JACET List of 8000 Words (JACET 8000). Tokyo: JACET.
Jenkins, J. & Palermo, D. 1964. A note on scoring word association tests. Journal of Verbal Learning and Verbal Behavior 3: 171–175.
Jiménez Catalán, R.M. & Moreno Espinosa, S. 2005. Using Lex30 to measure the L2 productive vocabulary of Spanish primary learners of EFL. Vigo International Journal of Applied Linguistics 2: 27–44.
Kent, G.H. & Rosanoff, A.J. 1910. A study of association in insanity. American Journal of Insanity 67: 37–96 & 317–390.
Kiss, G.R., Armstrong, C. & Milroy, R. 1993. An Associative Thesaurus of English. Wakefield: EP Microfilms.
Kiss, G., Armstrong, C., Milroy, R. & Piper, J. (Eds). 1973. An Associative Thesaurus of English and its Computer Analysis. Edinburgh: EUP.
Kruse, H., Pankhurst, J. & Sharwood Smith, M. 1987. A multiple word association probe in second language acquisition research. Studies in Second Language Acquisition 9(2): 141–154.
Lambert, W.E. 1955. Measurement of the linguistic dominance of bilinguals. Journal of Abnormal and Social Psychology 50: 197–200.
Lambert, W.E. 1956. A multiple word association probe in second language acquisition research. Studies in Second Language Acquisition 9: 141–154.
Landauer, T.K. & Dumais, S.T. 1997. A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction and representation of knowledge. Psychological Review 104(2): 211–240.
Laufer, B. 1995. Beyond 2000: A measure of productive lexicon in second language. In: The Current State of Interlanguage, L. Eubank, L. Selinker & M. Sharwood Smith (Eds), 265–272. Amsterdam: John Benjamins.
Laufer, B. 1998. The development of passive and active vocabulary in a second language: Same or different? Applied Linguistics 19: 255–271.
Laufer, B. & Nation, I.S.P. 1995. Vocabulary size and use: Lexical richness in L2 written production. Applied Linguistics 16(3): 307–322.
Laufer, B. & Nation, I.S.P. 1999. A vocabulary size test of controlled productive ability. Language Testing 16(1): 33–51.
Lesser, R. 1974. Word association and availability of response in an aphasic subject. British Journal of Psychiatry 125: 355–367.
Lightbown, P.M., Meara, P.M. & Halter, R. 1998. Contrasting patterns in classroom lexical environments. In: Perspectives on Foreign and Second Language Learning, D. Albrechtsen, B. Henriksen, I. Mees & E. Poulsen (Eds), 221–238. Odense: Odense University Press.
Lotka, A.J. 1926. Statistics – the frequency distribution of scientific productivity. Journal of the Washington Academy of Science 16: 317–325.
Madden, J.F. 1980. Developing pupils' vocabulary skills. Guidelines 3: 111–117.
Madsen, H.S. 1983. Techniques in Testing. Oxford: OUP.
McCarthy, M. 1990. Vocabulary. Oxford: OUP.
McNeill, A. 1996. Vocabulary knowledge profiles: Evidence from Chinese-speaking ESL teachers. Hong Kong Journal of Applied Linguistics 1: 39–64.
Meara, P.M. 1978. Learners' word associations in French. Interlanguage Studies Bulletin 3(2): 192–211.
Meara, P.M. 1981. Vocabulary acquisition: A neglected aspect of language learning. Language Teaching and Linguistics Abstracts 14: 221–246.
Meara, P.M. 1984. The study of lexis in interlanguage. In: Interlanguage, A. Davies, C. Criper & A. Howatt (Eds). Edinburgh: EUP.
Meara, P.M. 1988. Learning words in an L1 and an L2. Polyglot 9(3): D1–E14.
Meara, P.M. 1990. A note on passive vocabulary. Second Language Research 6: 150–154.
Meara, P.M. 1993. Vocabulary in a second language, Vol. 3. Reading in a Foreign Language 9.
Meara, P.M. 1996. The dimensions of lexical competence. In: Performance and Competence in Second Language Acquisition, G. Brown, K. Malmkjaer & J. Williams (Eds). Cambridge: CUP.
Meara, P.M. 2001. The mathematics of vocabularies. In: Language, Learning and Literature: Studies Presented to Håkan Ringbom, M. Gill, A.W. Johnson, L. Koski, R. Sell & B. Wårvik (Eds), 151–166. Åbo: Åbo Akademi.
Meara, P.M. 2004. Modelling vocabulary loss. Applied Linguistics 25(2): 137–155.
Meara, P.M. 2005. Designing vocabulary tests for English, Spanish and other languages. In: The Dynamics of Language Use: Functional and Contrastive Perspectives, C. Butler, M.A. Gómez González & S. Doval Suárez (Eds), 271–285. Amsterdam: John Benjamins.
Meara, P.M. 2007. Simulating word associations in an L2: Approaches to lexical organisation. International Journal of English Studies 7(2): 1–20.
Meara, P.M. 2007a. Growing a vocabulary. In: EUROSLA Yearbook 7, L. Roberts, A. Gürel, S. Tatar & L. Martı (Eds), 49–65. Amsterdam: John Benjamins.
Meara, P.M. 2007b. Simulating word associations in an L2: The effects of structural complexity. Language Forum 33(2): 13–31.
Meara, P.M. & Bell, H. 2001. P_Lex: A simple and effective way of describing the lexical characteristics of short L2 texts. Prospect 16(3): 5–24.
Meara, P.M. & Buxton, B. 1987. An alternative to multiple choice vocabulary tests. Language Testing 4(2): 142–154.
Meara, P. & Fitzpatrick, T. 2000. Lex30: An improved method of assessing productive vocabulary in an L2. System 28: 19–30.
Meara, P. & Fitzpatrick, T. 2004. Lex30 v2.0. Swansea: Lognostics.
Meara, P.M. & Ingle, S. 1986. The formal representation of words in an L2 speaker's lexicon. Second Language Research 2: 160–171.
Meara, P. & Jones, G. 1988. Vocabulary size as a placement indicator. In: Applied Linguistics in Society, P. Grunwell (Ed.), 80–87. London: CILT.
Meara, P. & Jones, G. 1990. Eurocentres Vocabulary Size Test 10Ka. Zurich: Eurocentres Learning Services.
Meara, P.M. & Milton, J.L. 2003. X_Lex: The Swansea Vocabulary Levels Test. Newbury: Express.
Meara, P.M. & Wolter, B. 2004. V_Links: Beyond vocabulary depth. Angles on the English Speaking World 4: 85–97.
Melka, F.J. 1997. Receptive vs productive aspects of vocabulary. In: Vocabulary: Description, Acquisition and Pedagogy, N. Schmitt & M. McCarthy (Eds), 84–102. Cambridge: CUP.
Melka Teichroew, F.J. 1982. Receptive vs. productive vocabulary: A survey. Interlanguage Studies Bulletin 6(2): 5–33.
Milgram, S. 1967. The small world problem. Psychology Today 1(1): 60–67.
Miller, K.M. 1970. Free association responses of English and Australian students to 100 words from the Kent-Rosanoff word association test. In: Norms of Word Association, L. Postman & G. Keppel (Eds). New York NY: Academic Press.
Moreno Espinosa, S. & Jiménez Catalán, R. 2004. Assessing L2 young learners' vocabulary: Which test should researchers choose? Paper delivered at BAAL/CUP Workshop: Vocabulary Knowledge and Use: Measurements and Applications, UWE, Bristol.
Morgan, B.Q. & Oberdeck, L.M. 1930. Active and passive vocabulary. In: Studies in Modern Language Teaching, E.W. Bagster-Collins (Ed.), 213–221. New York NY: Macmillan.
Morrison, R. 1981. Word Association Patterns in a Group of Bilingual Children. MA project, Birkbeck College, London.
Nagy, W. & Anderson, R.C. 1984. How many words are there in printed school English? Reading Research Quarterly 19(3): 305–330.
Nagy, W. & Herman, P. 1985. Incidental vs instructional approaches to increasing reading vocabulary. Educational Perspectives 23: 16–21.
Nation, I.S.P. 1983. Testing and teaching vocabulary. Guidelines 5(1): 12–25.
Nation, I.S.P. (Ed.). 1984. Vocabulary Lists: Words, Affixes and Stems. English Language Institute, Victoria University of Wellington, Occasional Paper 12.
Nation, I.S.P. 1988. Teaching and Learning Vocabulary. Wellington: Victoria University.
Nation, I.S.P. 1990. Teaching and Learning Vocabulary. Boston MA: Heinle and Heinle.
Nation, I.S.P. 2001. Learning Vocabulary in Another Language. Cambridge: CUP.
Naves, T. & Miralpeix, I. 2002. Short-term effects of age and exposure on writing development. In: Fifty Years of English Studies in Spain (1952–2002), I. Palacios Martínez, M. López Couso, P. Fra López & E. Seoane Posse (Eds). Santiago de Compostela: ANDEAN.
Nunan, D. 1995. Language Teaching Methodology. Hemel Hempstead: Prentice Hall.
Palermo, D. & Jenkins, J. 1964. Word Association Norms. Minneapolis MN: University of Minnesota Press.
Palmberg, R. 1987. Patterns of vocabulary development in foreign language learners. Studies in Second Language Acquisition 9: 201–220.
Politzer, R.B. 1978. Paradigmatic and syntagmatic associations of first year French students. In: Papers on Linguistics and Child Language: Ruth Hirsch Weir Memorial Volume, V. Honsa & M.J. Hardman-de-Bautista (Eds), 203–210. Berlin: Mouton.
Postman, L. & Keppel, G. (Eds). 1970. Norms of Word Association. New York NY: Academic Press.
Randall, M. 1980. Word association behaviour in learners of English as a second language. Polyglot 2(2): B4–D1.
Rapaport, D., Gill, M.M. & Schafer, R. 1968. Diagnostic Psychological Testing, revised edn. New York NY: International Universities Press.
Read, J. 1995. Refining the word associates format as a measure of depth of vocabulary knowledge. New Zealand Studies in Applied Linguistics 1: 1–17.
Read, J. 2000. Assessing Vocabulary. Cambridge: CUP.
Richards, J.C. 1976. The role of vocabulary teaching. TESOL Quarterly 10: 77–89.
Riegel, K.F. & Zivian, I.W.M. 1972. A study of inter- and intralingual associations in English and German. Language Learning 22(1): 51–63.
Rimmer, W. 2000. What is it to test a word? Language Testing Update 28: 25–27.
Rosenzweig, M.R. 1957. Etudes sur l'association des mots. Année Psychologique 57: 23–32.
Rosenzweig, M.R. 1961. Comparisons among word-association responses in English, French, German and Italian. American Journal of Psychology 74: 347–360.
Rosenzweig, M.R. 1970. International Kent-Rosanoff word association norms emphasizing those of French male and female students and French workmen. In: Norms of Word Association, L. Postman & G. Keppel (Eds). New York NY: Academic Press.
Ruke-Dravina, V. 1971. Word associations in monolingual and multilingual individuals. Linguistics 74: 66–85.
Schmitt, N. 1994. Vocabulary testing: Questions for test development with six examples of tests of vocabulary size and depth. Thai TESOL Bulletin 6(2): 9–16.
Schmitt, N. 2000. Vocabulary in Language Teaching. Cambridge: CUP.
Schmitt, N. & Meara, P.M. 1997. Researching vocabulary through a word knowledge framework: Word associations and verbal suffixes. Studies in Second Language Acquisition 19(1): 17–36.
Schmitt, N., Schmitt, D. & Clapham, C. 2001. Developing and exploring the behaviour of two new versions of the Vocabulary Levels Test. Language Testing 18(1): 55–89.
Selinker, L. 1984. The current state of IL studies: An attempted critical summary. In: Interlanguage, A. Davies, C. Criper & A. Howatt (Eds). Edinburgh: EUP.
Singleton, D. 1999. Exploring the Second Language Mental Lexicon. Cambridge: CUP.
Swan, M. & Walter, C. 1990. The New Cambridge English Course. Cambridge: CUP.
Szalay, L. & Deese, J. 1978. Subjective Meaning and Culture: An Assessment through Word Associations. Hillsdale NJ: Lawrence Erlbaum Associates.
Söderman, T. Word associations of foreign language learners and native speakers – a shift in response type and its relevance for a theory of lexical development. Scandinavian Working Papers on Bilingualism 8: 114–121.
Tréville, M.C. 1988. Faut-il enseigner le vocabulaire dans la langue seconde? In: L'enseignement des langues secondes aux adultes: recherches et pratiques, R. LeBlanc, R. Compain, L. Duquette & H. Séguin (Eds). Ottawa: Presses de l'Université d'Ottawa.
Vives Boix, G. 1995. The Development of a Measure of Lexical Organisation: The Association Vocabulary Test. Ph.D. dissertation, University of Wales Swansea.
Waring, R. 1999. The Measurement of Receptive and Productive Vocabulary. Ph.D. dissertation, University of Wales Swansea.
Watts, D.J. & Strogatz, S.H. 1998. Collective dynamics of 'small-world' networks. Nature 393: 440–442.
Wesche, M. & Paribakht, T.S. 1996. Assessing vocabulary knowledge: Depth vs. breadth. Canadian Modern Language Review 53(1): 13–40.
West, M. 1953. A General Service List of English Words. London: Longman Green and Co.
Wilks, C. 1999. Untangling Word Webs: Graph Theory Approaches to L2 Lexicons. Swansea: University of Wales.
Wilks, C. & Meara, P.M. 2002. Untangling word webs: Graph theory and the notion of density in second language word association networks. Second Language Research 18(4): 303–324.
Wilks, C., Meara, P.M. & Wolter, B. 2005. A further note on simulating word association behaviour in a second language. Second Language Research 21(4): 359–372.
Zareva, A. 2005. Models of lexical knowledge assessment of second language learners of English at higher levels of language proficiency. System 33: 547–562.
Index
A Aamiry, A. 104 A3VT Test 111 Abboud, O. 156 Academic Word List 106 af Trampe, P. 73, 165 afferent connections 61 Aitchison, J. 49, 76, 86, 165 Amer, A.A.M. 101 Anderson, R.C. 34, 70, 165, 168 Aphek, E, 104 Appel, R. 101 Arabic 37, 101, 104, 121 Arkwright, T. 101 Armstrong, C. 89, 166 association strength 80 Associates Test 102, 109, 118, 121 assonance 22 attrition 59, 123 B Baba, K. 31, 45, 165 Bachman, L. 46, 49, 51, 53 backwards associations xiii Bagger Nissen, H. 101 Barfield, A. 102 Bauer, L. 37, 44, 165 Beauvillain, C. 108 Beck, J. 27, 165 Beglar, D. 98, 166 Beheydt, L. 102 Bell, H. 31 Berko, J. xi, 165 bibliometrics xiii Blum-Kulka, S. 73, 165 Bogaards, P. 108 Bol, E. 103 Bousfield, W.A. xi, 165 breadth 62, 73, 74, 76, 109, 115 British National Corpus 133 Brown, R. xi, 165
Bulgarian 107 Buxton, B. 29, 168 C C-Test 125 Canada 118 Canale, M. x, 165 Carpay, J.A.M. 103 Carter, R. 103 Catalan 118 Champagnol, R. 103 Chinese 54 Chomsky, N. x, 165 clang associations 1, 5, 7, 12, 13, 22, 27, 114, 125 Clapham, C. 74, 169 co-ordinate bilinguals 101, 107 Coady, M. 91, 166 cognates 13, 124 Cohen, A. 104 collocations 102 commonality 101, 126 communicative competence x compound bilinguals 101, 107 concept mediation 126 continuous associations 105, 110, 120, 123 continuum 60, 61 controlled productive vocabulary test 34, 53, 58, 106 Cooper, R. 105 Corder, S.P. ix core vocabulary 62, 81, 103 Cortese, M.J. 153, 165 Crable, E. 104 Criper, C. ix, 165 Crothers, E. x, 165 Council of Europe 124 D Dalrymple-Alford, E. 124 Danish 102, 110 Davies, A. ix, 165
Davis, B.J. 97, 104, 165 Davis, O. 113 de Groot, A.M.B. 105, 124 de Saussure, F. 21 de Wolf, T. 108 Deese, J. xi, 165, 169 deep word knowledge 108, 121 density 124 depth 62, 73, 74, 75, 76, 98, 109, 115, 117, 118, 125 derivational affixes 44 diameter of a graph 66 dictionary design 103, 109 dimensions 29, 62, 81, 112, 115, 126 Dumais, S.T. xv, 166 Dutch 101, 109, 111, 121, 123, 124 E Eaton, H. 69, 165 Edinburgh Associative Thesaurus 36, 89 Engber, C.A. 165 Erdmenger, M. 105 errors 27, 28, 127 Ervin, S. xi, 165 Euralex 102 F Faerch, C. 60, 165 Fellbaum, C. xv, 165 Ferrer-i-Cancho, R. xv, 62, 165 Finnish 26 Fishman, J. 105 Fitzpatrick, T. xvi, 53, 55, 106, 131, 132, 165, 166, 168 Fleming, G. 106 fluency 81, 104 forgetting 123 form class 112 fossilisation ix, 113 français fondamental 7, 18, 24, 85
Franceschina, E.S.N. 107 free productive vocabulary test 34 Freebody, P. 34, 165 French xi, xvi, 1, 7, 8, 9, 11, 12, 27, 85, 90, 97, 101, 103, 104, 105, 107, 108, 109, 112, 113, 114, 117, 120, 123 Friesian 121 Fugett, A. 153, 165 G Gekoski, W. 107 gender 123 Gerganov, E. 107 German xi, 1, 98, 103, 105, 109, 111, 113, 119, 120, 127 Gill, M.M. xi, 169 Gordon, A.M. 110 Gougenheim, G. 7, 18, 166 Grabois, H. 107, 108 Grainger, S. 108 grammar translation 103 graph 66 graph theory xiv, 59, 65, 70, 124 Greek 110 Greidanus, T. 108, 109 Griswold, R.E. 21, 166 growth 59 H Haastrup, K. 73, 165, 166 habit strength 109 Hammerly, H. 109 Hanfling, H. 122 Harary, F. xv, 166 Heatley, A. 166 Hebrew 104 Henriksen, B. 73, 101, 110, 166 Herman, P. 91, 168 Heuer, H. 109 Hinofotis, F.B. 110 Hockett, C. x, 166 Holland, V.M. 122 homonyms 54 Horowitz, L.M. 110 Howatt, A. ix, 165 Huckin, T. 91, 166 Hughes, J. 26, 166 Hulstijn, J. 92, 166
Hunt, A. 98, 166 I Icelandic 37, 51 Ife, A. 110 imageability 153 incidental vocabulary acquisition 91 inflectional suffixes 44 interlanguage ix, 113 Ingle, S. 1, 168 IRT analysis 118 Ishikawa, S. 133, 151, 166 Ito, M. 111 J JACET 8000 57, 133, 141, 151, 153 Japanese 81, 111, 116, 117, 121, 122, 125, 126, 133 Jenkins, D. xi, xii, 166, 168 Jiménez Catalán, R.M. 31, 45, 166, 168 Jones, G. 34, 37, 168 K Kadota, S. 126 Kaneda, M. 133, 166 Kent, J-P. 111 Kent-Rosanoff list xii, xvii, 5, 7, 8, 24, 25, 112, 114, 118, 119, 120, 122, 159 Keppel, G. xii, 1, 24, 38, 88, 169 Kiss, G. 36, 89, 166 knowing a word 19, 56, 73 Kolers, P.A. 111 Kostyla, S.J. 122 Kruse, H. xii, 111, 166 Kudo, Y. 112 L Lambert, W.E. xi, 97, 101, 112, 113, 166 Landauer, T.K. xv, 166 language loss 123 Latvian 23, 120 Lauerbach, H. 113 Laufer, B. 31, 33, 34, 35, 40, 53, 58, 167 Leicester, P.F. 117 lemmatisation 44 Lesser, R. xi, 167
Lightbown, P.M. 33, 167 list learning 103 loan words 80 Lotka, A.J. xiii, 167 Lotka’s Law xiii M Machalias, R. 113 Madden, J.F. 73, 167 Madsen, H.S. 46, 167 malapropisms 28 Mandarin 54, 122 Marian, V. 121 Massad, C. 114 McCarthy, M. 76, 167 McGregor, K.K. 121 McNeill, D. 73, 167 Melka Teichroew, F.J. 30, 60, 168 Messick, S. 119 Michéa, R. 166 Michigan restricted association norms 119 Microsoft 132, 147 Milgram, S. 65, 68, 69, 168 Miller, G.A. xv Miller, K.M. 168 Milroy, R. 89, 166 Milton, J.L. 74, 132, 168 Minnesota norms 123 Miralpeix, I. 31, 168 mnemonic associations 104 Mochizuki, M. 115, 133, 166 Moore, N. 112 Moreno Espinosa, S. 31, 45, 166, 168 Morgan, B.Q. 31, 168 Morrison, R. 26, 168 multiple associations xii, 111, 118 multiple choice tests 67, 102 Murata, M. 133, 166 N Nagy, W. 70, 91, 168 Nakanishi, Y. 126 Nakata, K. 111 Namei, S. 107, 115, 116 Nation, I.S.P. x, 31, 33, 34, 35, 36, 37, 38, 49, 54, 56, 66, 73, 74, 98, 133, 165, 167, 168 Naturally Speaking 138, 161 Naves, T. 31, 168
Neven, A. 102 Nienhuis, L. 108, 109 Nikolova, Y. 126 noise 19 noisy data 92 non-linear relationships 81 norms of association 24, 88, 113 Noro, T. 126 numerals 138 Nunan, D. 33, 168 O Oberdeck, L.M. 31, 168 one-off research xiv, 94 Opoku, J. 116 Orita, M. 116, 117 Osgood, C.E. 102 P Palermo, D. xi, xii, 166, 168 Palmberg, R. 35, 60, 117, 168 Pankhurst, J. xii, 111, 166 paradigmatic associations xi, xiv, 1, 5, 8, 13, 21, 102, 109, 116, 117, 120, 122, 125, 126 paired associates 110 Pajoohesh, P. 117 Paribakht, T.S. 73, 75, 125, 170 part of speech 105 passive vocabulary 29 periods abroad 92, 110 peripheral vocabulary 25 Persian 115 Phillipson, R. 165 phonological associations 102, 115 Piper, T.H. 87, 117, 166 Poage, J.E. 2, 166 Politzer, R. xi, 97, 98, 117, 169 Polonsky 2, 166 Postman, L.A. xii, 1, 24, 38, 88, 169 primary responses 9, 36 Productive Levels Test 54, 55, 57 productive vocabulary xvi, 30, 33, 40, 58, 106 proper names 138 Q Qian, D.D. 102
R Ramsey, R.A. 118, 119 Randall, M. 26, 118, 169 Rapaport, D. xi, 169 Rawlings, C. 101, 113 Read, J. 74, 102, 109, 118, 121, 169 receptive vocabulary 29, 30 response commonality 101, 126 restricted associations 110, 118, 119 rhyme 22 Richards, J.C. 73, 169 Riegel, K.F. xi, 1, 77, 98, 119, 169 Riegel, R. 119 Rimmer, W. 45, 169 Rivenc, P. 166 Romanche 111 Rosenzweig, M.R. xii, 7, 11, 13, 14, 169 Ruke-Dravina, V. 23, 120, 169 S Sanford, K. 120 Sauvageot, A. 166 Schafer, R. xi, 169 schizophrenia 113 Schmitt, D. 74, 169 Schmitt, N. x, 74, 76, 120, 121, 169 Schoonen, R. 102, 121 Schwanenflugel, P. 126 secondary responses 9 self-rating 73, 105, 126 Selinker, L. ix, 169 semantic differential 102 semantic satiation 112 Sharwood Smith, M. xii, 111, 166 Sheng, L. 121, 122 Shimizu, S. 133, 166 simulations 63, 86, 87, 90, 91, 92, 93, 94, 95, 98, 114, 124, 125 single store hypothesis 104 single subject studies 31 Singleton, D. x, 169 Skinner, B.F. x, 165 slips of the tongue 28 small world networks 62 SNOBOL 2
Solé, R.V. xv, 62, 165 Spanish xi, 70, 98, 105, 106, 107, 109, 111, 113, 118, 119 spew test 35 stability xi, 25, 114 stereotypy xi, 112, 114, 118 Strogatz, S. 62, 170 Stroop Test 112 Suppes, P. x, 165 Svetics, I. 120 Swain, M. x, 165 Swan, M. 33, 165, 169 Swartz, M.L. 122 Sweden 59 Swedish 23, 115, 120, 122 Sugimori, N. 133, 166 synonyms 54, 104 syntagmatic associations xi, xiv, 1, 5, 7, 13, 21, 102, 106, 116, 120, 122, 125, 126 syntagmatic-paradigmatic shift 102, 116, 117, 122 syntax 18 Szalay, L. xi, 108, 169 Söderman, T. xi, 122, 169 Sökmen, A. 123 T Taseva-Rangelova, K. 107 Taylor, I. 123 tertiary responses 9 Thai 111 think aloud methods 67 Threshold Level 124 tip of the tongue phenomenon 127 TOEFL 121 Tono, Y. 133, 166 traffic flow models 70 translation test 53, 55 Tréville, M-C. 60, 169 Turkish 101, 121 Type-Token Ratio 113 U Uemura, T. 133, 166 University Word List 35 V valency 66, 68, 70 van der Linden, E. 108, 123 van Ginkel, C.I. 123
van Hell, J.G. 124 Verbal Behaviour x, 165 verbal learning x Verhallen, M. 121 VKS 125, 126 Viau, A. 101 Vives Boix, G. xiii, 110, 169 VocabProfile 37 vocabulary accessibility 29 vocabulary knowledge framework 73, 74 Vocabulary Knowledge Scale 125, 126 Vocabulary Levels Test 74, 102, 121 vocabulary organisation 29 vocabulary size 29, 62, 68, 75, 76, 126 Vygotsky, L. 107
W Walter, C. 33, 169 Waring, R. 35, 169 Watts, D.J. 62, 170 weighting systems xii, 120, 125 Wertheimer, M. 97, 105, 165 Wesche, M. 73, 75, 125, 170 West, M. 18, 170 White, C. 124 Wikberg, K. 124 Wilks, C. xvi, 63, 77, 78, 85, 86, 90, 98, 124, 125, 147, 170 Windows Operating System 132, 147 Wolter, B. xvi, 86, 88, 89, 98, 115, 124, 125, 147, 151, 168, 170
word class 102 word recognition tests WordNet xv X X_Lex 132 Y Yabuuchi, S. 126 Yamamoto, K. 114 Yes/No vocabulary tests 29, 30, 37, 38, 39, 40, 62, 74, 102 Yokokawa, H. 126 Z Zareva, A. 29, 126, 127, 170 Zimmerman, R. 127 Zivian, I.W.M. 1, 77, 119, 169
In the series Language Learning & Language Teaching the following titles have been published thus far or are scheduled for publication: 25 Abraham, Lee B. and Lawrence Williams (eds.): Electronic Discourse in Language Learning and Language Teaching. 2009. x, 346 pp. 24 Meara, Paul: Connected Words. Word associations and second language vocabulary acquisition. 2009. xvii, 174 pp. 23 Philp, Jenefer, Rhonda Oliver and Alison Mackey (eds.): Second Language Acquisition and the Younger Learner. Child's play? 2008. viii, 334 pp. 22 East, Martin: Dictionary Use in Foreign Language Writing Exams. Impact and implications. 2008. xiii, 228 pp. 21 Ayoun, Dalila (ed.): Studies in French Applied Linguistics. 2008. xiii, 400 pp. 20 Dalton-Puffer, Christiane: Discourse in Content and Language Integrated Learning (CLIL) Classrooms. 2007. xii, 330 pp. 19 Randall, Mick: Memory, Psychology and Second Language Learning. 2007. x, 220 pp. 18 Lyster, Roy: Learning and Teaching Languages Through Content. A counterbalanced approach. 2007. xii, 173 pp. 17 Bohn, Ocke-Schwen and Murray J. Munro (eds.): Language Experience in Second Language Speech Learning. In honor of James Emil Flege. 2007. xvii, 406 pp. 16 Ayoun, Dalila (ed.): French Applied Linguistics. 2007. xvi, 560 pp. 15 Cumming, Alister (ed.): Goals for Academic Writing. ESL students and their instructors. 2006. xii, 204 pp. 14 Hubbard, Philip and Mike Levy (eds.): Teacher Education in CALL. 2006. xii, 354 pp. 13 Norris, John M. and Lourdes Ortega (eds.): Synthesizing Research on Language Learning and Teaching. 2006. xiv, 350 pp. 12 Chalhoub-Deville, Micheline, Carol A. Chapelle and Patricia A. Duff (eds.): Inference and Generalizability in Applied Linguistics. Multiple perspectives. 2006. vi, 248 pp. 11 Ellis, Rod (ed.): Planning and Task Performance in a Second Language. 2005. viii, 313 pp. 10 Bogaards, Paul and Batia Laufer (eds.): Vocabulary in a Second Language. Selection, acquisition, and testing. 2004. xiv, 234 pp. 
9 Schmitt, Norbert (ed.): Formulaic Sequences. Acquisition, processing and use. 2004. x, 304 pp. 8 Jordan, Geoff: Theory Construction in Second Language Acquisition. 2004. xviii, 295 pp. 7 Chapelle, Carol A.: English Language Learning and Technology. Lectures on applied linguistics in the age of information and communication technology. 2003. xvi, 213 pp. 6 Granger, Sylviane, Joseph Hung and Stephanie Petch-Tyson (eds.): Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. 2002. x, 246 pp. 5 Gass, Susan M., Kathleen Bardovi-Harlig, Sally Sieloff Magnan and Joel Walz (eds.): Pedagogical Norms for Second and Foreign Language Learning and Teaching. Studies in honour of Albert Valdman. 2002. vi, 305 pp. 4 Trappes-Lomax, Hugh and Gibson Ferguson (eds.): Language in Language Teacher Education. 2002. vi, 258 pp. 3 Porte, Graeme Keith: Appraising Research in Second Language Learning. A practical approach to critical analysis of quantitative research. 2002. xx, 268 pp. 2 Robinson, Peter (ed.): Individual Differences and Instructed Language Learning. 2002. xii, 387 pp. 1 Chun, Dorothy M.: Discourse Intonation in L2. From theory and research to practice. 2002. xviii, 285 pp. (incl. CD-rom).