Cognition, Vol. 5, No. 3

Cognition, 5 (1977) 189-214. © Elsevier Sequoia S.A., Lausanne. Printed in the Netherlands

Procedural Semantics*

PHILIP N. JOHNSON-LAIRD
University of Sussex

Abstract

The aim of this paper is to present an outline of a theory of semantics based on the analogy between natural and computer programming languages. A unified model of the comprehension and production of sentences is described in order to illustrate the central “compile and execute” metaphor underlying procedural semantics. The role of general knowledge within the lexicon, and the mechanism mediating selectional restrictions, are re-analyzed in the light of the procedural theory.

*The ideas in this paper were developed in collaboration with George A. Miller to whom I owe a very considerable intellectual debt. Professor Miller also delivered an earlier version of the paper to the Conference on Philosophy and Psychology at Cornell University, 2 April, 1976. Tom Bever and Jacques Mehler were kind enough to suggest that the paper be put into a form suitable for publication. I am indebted to Stephen Isard and Mark Steedman for many helpful comments, ideas, and criticisms, though of course they should not be blamed for my mistakes. The earlier version of the paper was also revised in the light of some of Jerry Fodor’s pungent remarks in his unpublished critique, “Tom Swift and His Procedural Grandmother”. The research was in part supported by a grant from the Social Science Research Council (Great Britain) to the Laboratory of Experimental Psychology, Sussex University.

“Procedural semantics” is an expression that gained currency first in the discussion of computer programming languages like Fortran and Algol. These artificial languages, which are used to communicate programs of instructions to computers, have both a syntax and a semantics. Their syntax consists of rules for writing well-formed programs that a computer can interpret and execute. Their semantics consists of the procedures that the computer is instructed to execute. If, for example, a programming language permits an instruction like: x and J 3, it might mean that the computer is to add the values of x and y, and to print the result.

There are usually two steps involved in running a program of instructions written in some high-level programming language. The first step is to compile the program, which consists of translating it into the operational code of the particular machine to be used. Compilation depends heavily on the syntax of the programming language; getting all the commas and parentheses right is a notoriously important and often frustrating task for people just learning to write programs. The output of the compilation will be a compiled program, coded in a language that the machine can recognize and execute. The second step is to take this compiled program, along with the data it is to operate on, and to run it. If all goes well, the output is the result that the programmer wanted to obtain.

How a programmer determines that the output really is what he wanted to compute is the problem of procedural semantics. For simple procedures that are well understood, like the operations of arithmetic or Boolean algebra, the semantics is relatively straightforward; a programmer can determine whether his program means what he wanted it to mean by simply checking that the machine gives the same results that he obtains when he does a few sample computations himself. In large and elaborate programs, however, it is often difficult to determine whether the result of the programmed computation is actually what the programmer intended to compute. How to prove that a program does what the programmer claims it does is a difficult question and nothing more will be said about it here. The purpose of mentioning it at all is simply to indicate what “procedural semantics” means in this domain of discourse: procedural semantics deals with the meaning of procedures that computers are told to execute.

This discussion moves a step closer to our present concerns when we consider how computers might be used to perform various operations as a consequence of natural language inputs - messages coded, not in Fortran or Algol, but in English or Russian. It is natural to carry over the “compile and execute” strategy to natural language processing.
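
The two steps can be made concrete with a toy language. The following Python sketch is an illustration only and is not drawn from Fortran, Algol, or any real compiler: the one-instruction source language ("add x y"), the stack-machine operations (LOAD, ADD, PRINT), and the function names compile_program and execute are all assumptions introduced here.

```python
def compile_program(source):
    """Step 1: translate a source line such as 'add x y' into machine-level steps."""
    tokens = source.split()
    if len(tokens) != 3 or tokens[0] != "add":
        # Compilation is the stage at which syntax errors are caught.
        raise SyntaxError(f"cannot compile: {source!r}")
    _, left, right = tokens
    return [("LOAD", left), ("LOAD", right), ("ADD",), ("PRINT",)]


def execute(code, data):
    """Step 2: run the compiled code on the data it is to operate on."""
    stack, output = [], []
    for instruction in code:
        op = instruction[0]
        if op == "LOAD":
            stack.append(data[instruction[1]])
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "PRINT":
            output.append(stack[-1])
    return output


compiled = compile_program("add x y")
result = execute(compiled, {"x": 2, "y": 3})
print(result)  # prints [5]
```

In this sketch a malformed instruction is rejected at compilation, before any data are consulted, whereas the values of x and y matter only at execution. The same two-step scheme is what is carried over to natural language.
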
For example, if a program is to accept questions and to provide answers written in English, the first step would be to translate the English question into a program for computing an answer or for finding the answer in some prestored data base. That is to say, the first step is to compile the question, which entails treating English questions in the same way Fortran or Algol statements would be treated. The second step is to run the compiled program in order to obtain the information required for an answer, and the final step is to formulate it in an English phrase or sentence that the machine can type out.

Obviously, natural languages like English were not designed as programming languages; to treat them as such requires considerable ingenuity on the part of a programmer. The compilation must be guided by natural language syntax, which is considerably more complicated than the syntax of artificial programming languages, and an important component of the system must be a parsing mechanism sensitive to the grammatical structure of sentences. This part of the system might be called procedural syntax, and it has been conceived in many different ways. There are parsers that try to build up analyses from the smallest constituents to the largest ones. There are parsers that operate more predictively and look at each point in a sentence for the sort of constituents they expect to find (as we shall illustrate below). Some parsers build up a chart of the different possible alternative syntactic analyses (Kay, 1975), whereas others check the semantic coherence of a proposed analysis before accepting it (Winograd, 1972). There are parsers that pursue syntactic and semantic analyses in parallel and exchange information between them either very freely in a heterarchical fashion (Woods and Makhoul, 1973) or through an extremely restricted communication channel (Reddy et al., 1973). There are even systems that attempt to go straight to a semantic interpretation and then check it against the syntax of the sentence (Schank, 1972). Such a wealth of possibilities should suffice to indicate that, in the realm of natural language processing, “procedural syntax” and “procedural semantics” have been given a variety of different interpretations.

Psychologists are interested not in how a computer might compile and execute sentences, but in how people process them - in the cognitive operations that are performed in producing and understanding utterances. It is sometimes claimed that the work on the computer processing of sentences is unlikely to contribute to the scientific understanding of language because it lacks a concern for the principles underlying its organization and learning. There is some truth in this charge, but it may be mistaken to argue that technology cannot contribute to science or that a study is worthless because it does not elucidate some problem assumed to be central to an area. In fact, computer studies can be a useful source of hypotheses about human linguistic processing and have played a crucial part in discovering the ubiquitous role of inference in comprehension.
There are a wide variety of computer systems, but they all pay allegiance in their different ways to the “compile and execute” strategy. The application of this approach to human sentence processing was first clearly formulated by Davies and Isard (1972), who pointed out that compiling and executing correspond rather naturally to stages in a person’s comprehension of an utterance. The first step that a person must perform is to compile the sentence - to translate it into a program in his or her internal mental language. This is generally an automatic and involuntary process for anyone who knows the language well. If someone says, Pass the salt or When did you last see your father? or I’m leaving for Chicago tonight, the listener usually compiles a corresponding program. Of course, an assertion may be treated merely as data rather than an actual program - computer programs often require data to operate on. Indeed, an assertion may not be compiled into an executable program at all (see Miller and Johnson-Laird, 1976, Sec. 3.5.7). However, it is convenient to treat the sentence as compiled into a program if it is to be verified, or if information from it is to be added to memory. The nature of the program will naturally depend on the evaluation of the sentence by higher-order procedures.

Once a program is compiled, the question arises as to whether the listener should run it. Should he pass the salt? Should he tell the speaker when he last saw his father? Should he add to his store of knowledge about the speaker that he is leaving for Chicago? Choosing whether or not to execute a program is a complicated business, usually under voluntary control and often dependent on complex cognitive skills.

The advantages of this approach for the purpose of psychology are reasonably obvious. Psychologists are interested in how language is used to communicate: to make statements, to ask questions and to answer them, to make requests, and even to express invocations and imprecations. It is a virtue of the procedural approach that it places these diverse speech acts on an equal footing and provides a theoretical language for formulating hypotheses about the mental processes involved.

In order to provide a glimpse of procedural semantics in action it will be necessary to explore some of these hypotheses in detail, but first we will make a brief digression to consider the relation of this approach to the more logical approach to semantics that begins with Frege (1892) and extends down to the present day in the work of many philosophers and logicians. At first glance the two approaches may seem totally unrelated, but further consideration reveals some interesting similarities. Take, for example, the logical distinction between extensions and intensions. In model-theoretic semantics an intension is a function from possible worlds to truth values, and an extension is the truth value for a particular world. In procedural semantics there is a similar distinction between a procedure and the result of executing it.
Thus we might speak of the intension of a program as the procedure that is executed when the program is run, and of the extension of a program as the result that the program returns when it has been executed.

This parallel also serves to highlight an advantage of the procedural approach for the purposes of psychological theory. Bertrand Russell once remarked that the essential business of language is to assert or deny facts about the world, a claim that reflects a logician’s emphasis on truth and falsity. This bias is certainly not without its philosophical critics; it may simply be a historical accident arising from the fact that the first great achievement in model theory was Tarski’s (1936) recursive definition of truth. But whereas model-theoretic semantics regards the extension of a sentence as its truth value, procedural semantics admits a much wider range of extensions. A truth value is but one of the possible results of executing a program; others include answers to questions, compliance with requests, additions to knowledge, modification of plans, and so on. These various consequences of language use are all of interest to psychologists, who gain little insight into their problems from a semantic theory that contemplates nothing but the truth value of sentences that appear magically out of a social vacuum.

A related aspect of the logical approach to language is also worthy of comment. The tradition has been to treat semantics as a branch of mathematics, and the analysis of natural language as involving elaborate meta-mathematical concepts. A psychologist must respect the rigor that such an approach provides, but it does have the effect of removing semantics beyond the reach of empirical test. Since the psychologist is interested in the particular system of cognitive operations that people employ, and since it is an empirical task to discover what that system is, model-theoretic approaches offer only indirect solutions to his problem.

Perhaps the chief advantage of a procedural approach is that the “compile and execute” strategy forces the theorist to consider processes as well as structures. We can illustrate this point by considering a particular procedural model that was developed as a result of a growing dissatisfaction with theories based on linguistic theory. According to transformational grammar, the surface structure of a sentence is formally derived from an underlying level of representation, or “deep structure”, in which its fundamental grammatical relations are made explicit. The derivation is by way of grammatical transformations that permute constituents, delete them, and so on. However, grammatical transformations do not correspond to psychological processes, and it turns out that there is little or no unequivocal evidence that a representation of deep structure plays any part in understanding or speaking a sentence.
Effects attributed to its role in comprehension may equally well be attributed to meaning (see Johnson-Laird, 1970; Fodor, Bever, and Garrett, 1974, p. 270). Its status in speaking is even more problematical because sentences are likely to be spoken in the order in which they are planned, whereas the order of constituents in deep structure is for many sentences markedly different to their surface order. Moreover, as Fodor et al. (1974, pp. 393 - 7) point out, it is not obvious what psychological processes could lead from meaning to deep structure, or from deep structure to surface structure. It might be argued that the concept of deep structure should accordingly be abandoned by psycholinguists. In our view, the real problem is to reconcile its linguistic necessity with the exigencies of human information processing, and a way to do so has gradually emerged in the development of a procedural theory of comprehension (Miller and Johnson-Laird, 1976; Johnson-Laird, 1977).


We will describe a model of part of this theory that was implemented in a computer program by Mark Steedman - a process that led to the introduction of some new ideas (see Steedman and Johnson-Laird, 1977). The program answers questions about a simple one-dimensional universe of discourse that consists in ‘particles’ moving to and fro, colliding with each other, and so on. It comprehends and produces sentences using a novel device, a semantic transition network (STN), which, while sensitive to deep structure relations, does not set up an explicit representation of them: in understanding a sentence, it goes directly from the sentence to its meaning, and in producing a sentence, it goes directly from meaning to the sentence. The STN is a familiar parsing device, a recursive augmented transition network, modified so that it builds up, not a syntactic representation of a sentence, but its semantic representation. Once this modification is made, it is a simple matter to use the device to produce sentences.

A simple augmented transition network for parsing declarative sentences is illustrated in Figures 1 and 2 (see Thorne, Bratley, and Dewar, 1968; Woods, 1970; Wanner and Maratsos, 1975, for further accounts from the standpoints of linguistics, artificial intelligence, and psycholinguistics, respectively). The network in Figure 1 is for parsing declarative sentences, and the network in Figure 2 is a special subcomponent for parsing the noun phrases within them. The numbers in the nodes play no part in the actual process, but are merely simple mnemonics. The system works by making transitions from node to node: a transition can be made only if the item currently being parsed satisfies the test specified above the arc. When a transition is made, the action specified beneath the arc is carried out. It will be noted that some arcs have no test associated with them - their action is carried out without parsing a word, and some arcs have only the action of jumping directly to the next node. A computer program that implements such a system needs to keep track of where in the network it is currently operating in order to ensure that it makes appropriate jumps from one component to another. It also needs a lexicon in which the syntactic categories of words are identified. If the system is unable to make any further transition but has not come to the end of a sentence, then it halts: the sentence is, as far as it is concerned, an ungrammatical one. Of course, some blockages may arise simply as a result of taking the wrong arc out of a node from which there are several. A variety of strategies can be implemented to try alternative routes through the network.

Figure 1. Sentence network. [Arc labels: Find Noun phrase (Label it Subject); Verb: check it agrees in number (Label it Verb); For transitive verb: Find Noun phrase; For intransitive verb: (Jump directly); (Assemble analysis of sentence).]

Figure 2. Noun phrase network. [Arc labels: Article, or (Jump directly); Adjective (Label it Modifier); Noun (Label it Noun); (Assemble analysis of Noun phrase and send it to Sentence network).]

The way in which the augmented transition network parses a sentence is illustrated in Table 1. Although in this case the output of the device is a sort of surface structure decorated with terms denoting deep structure relations, an augmented transition network can equally well be devised to build up a more orthodox representation of deep structure. The STN builds up a direct semantic interpretation of a sentence.
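
The transition-network procedure just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program of Thorne et al. or of Woods: the toy lexicon, the set of transitive verbs, and the function names parse_noun_phrase and parse_sentence are hypothetical, number agreement is omitted, and because each word in the toy lexicon has a single category the sketch never needs to back up and try an alternative route.

```python
# Hypothetical toy lexicon: word -> syntactic category.
LEXICON = {
    "leslie": "Noun", "cake": "Noun", "a": "Article", "the": "Article",
    "big": "Adjective", "brought": "Verb", "slept": "Verb",
}
TRANSITIVE = {"brought"}  # assumed subcategorization information


def parse_noun_phrase(words, i):
    """Noun phrase network: optional Article, any Adjectives, then a Noun."""
    analysis = []
    if i < len(words) and LEXICON.get(words[i]) == "Article":
        analysis.append(("Article", words[i]))
        i += 1                      # otherwise: jump directly to the next node
    while i < len(words) and LEXICON.get(words[i]) == "Adjective":
        analysis.append(("Modifier", words[i]))
        i += 1                      # the Adjective arc returns to the same node
    if i < len(words) and LEXICON.get(words[i]) == "Noun":
        analysis.append(("Noun", words[i]))
        return ("NP", analysis), i + 1
    return None, i                  # blocked: no arc can be taken


def parse_sentence(text):
    """Sentence network: Subject NP, Verb, then an Object NP if the verb is transitive."""
    words = text.lower().split()
    subject, i = parse_noun_phrase(words, 0)
    if subject is None or i >= len(words) or LEXICON.get(words[i]) != "Verb":
        return None
    verb = words[i]
    i += 1
    obj = None
    if verb in TRANSITIVE:
        obj, i = parse_noun_phrase(words, i)
        if obj is None:
            return None
    if i != len(words):             # halted before the end: ungrammatical
        return None
    return {"Subject": subject, "Verb": verb, "Object": obj}


analysis = parse_sentence("Leslie brought a cake")
print(analysis)
```

A string for which no transition is available, such as "brought a cake", returns None, which corresponds to the network halting before the end of the sentence.
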

Table 1. An illustrative example of a parsing carried out by the augmented transition network depicted in Figures 1 and 2.

Input sentence: “Leslie brought a cake”

Active node   Test                 Action carried out
S1            Find a Noun phrase   Jump to Noun phrase network.
NP1           Article              Leslie fails test; try another arc.
NP1           -                    Jump directly to next node.
NP2           Adjective            Leslie fails test; try another arc.
NP2           Noun                 Label Leslie as a Noun, i.e., Noun: Leslie.
NP end        -                    (NP: Noun: Leslie). Jump to Sentence network.
S1            -                    Subject = (NP: Noun: Leslie).
S2            Verb                 Verb = brought
S3            Find a Noun phrase   Jump to Noun phrase network.
NP1           Article              Article: a
NP2           Adjective            Cake fails test; try another arc.
NP2           Noun                 Noun: cake.
NP end        -                    (NP: Article: a Noun: cake). Jump to Sentence network.
S3            -                    Object = (NP: Article: a Noun: cake)
S end         -                    (Sentence Subject = (NP: Noun: Leslie) Verb = brought Object = (NP: Article: a Noun: cake))

In order to describe the operation of the STN, we must first give an account of the way in which information about the universe of discourse is represented in the model. A typical brief history of the universe is represented by the following set of assertions in a data-base:

Object Y is at location A at time 0: [Y AT A 0]

Object X is at location B at time 0: [X AT B 0]

Object Z is at location C at time 0: [Z AT C 0]

Event E1 consists in Y moving from location A to location B from time 1 to time 2: [E1 Y MOVE FROM A TO B START 1 FINISH 2]

Event E2 consists in X moving from location B to location C from time 2 to time 3: [E2 X MOVE FROM B TO C START 2 FINISH 3]

Event E3 consists in Z moving from location C to location D from time 3 to time 4: [E3 Z MOVE FROM C TO D START 3 FINISH 4]

This data-base is implemented using the higher-order programming language, PICO-PLANNER (Anderson, 1972). PLANNER languages are extremely useful for manipulating sets of facts and for drawing conclusions from them (see Hewitt, 1971; Winograd, 1972). Their most important feature for our purposes is that they allow a goal to be specified to find some fact in the data-base, which they then seek to satisfy either directly by finding the fact or else indirectly by deducing it from other facts in the data-base. They are accordingly indifferent as to whether information is specified in terms of semantic primitives or in terms of more complex elements of meaning. This feature reflects our intuition that the decomposition of the meanings of words into their primitive semantic constituents is not an invariable prerequisite for the comprehension of sentences.
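
The idea of a goal that is satisfied either directly or by deduction can be illustrated without PLANNER itself. The Python sketch below is not PICO-PLANNER and does not reproduce its syntax; the tuple encoding of the assertions, the pattern variables beginning with "?", and in particular the rule that a moving object hits whatever occupies its destination when it arrives are assumptions made for the sake of the example.

```python
FACTS = [
    ("Y", "AT", "A", 0),
    ("X", "AT", "B", 0),
    ("Z", "AT", "C", 0),
    # (event, MOVE, object, from, to, start, finish)
    ("E1", "MOVE", "Y", "A", "B", 1, 2),
    ("E2", "MOVE", "X", "B", "C", 2, 3),
    ("E3", "MOVE", "Z", "C", "D", 3, 4),
]


def match(pattern, fact):
    """Return variable bindings if fact fits pattern, else None ('?name' is a variable)."""
    if len(pattern) != len(fact):
        return None
    bindings = {}
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return bindings


def location_at(obj, time):
    """Deduce where obj is at a given time from its starting place and its moves."""
    place = None
    for fact in FACTS:
        if match((obj, "AT", "?place", 0), fact) is not None:
            place = fact[2]
        moved = match(("?e", "MOVE", obj, "?from", "?to", "?start", "?finish"), fact)
        if moved is not None and moved["?finish"] <= time:
            place = moved["?to"]
    return place


def deduced_hits():
    """Assumed rule: a moving object hits whatever occupies its destination on arrival."""
    objects = [f[0] for f in FACTS if f[1] == "AT"]
    for fact in FACTS:
        if fact[1] != "MOVE":
            continue
        event, _, mover, _, destination, _, finish = fact
        for other in objects:
            if other != mover and location_at(other, finish) == destination:
                yield (event, "HIT", mover, other, destination)


def find(goal):
    """Satisfy a goal directly from a stored fact, or else indirectly by deduction."""
    for fact in FACTS:
        bindings = match(goal, fact)
        if bindings is not None:
            return bindings
    for fact in deduced_hits():
        bindings = match(goal, fact)
        if bindings is not None:
            return bindings
    return None


print(find(("Z", "AT", "?place", 0)))             # found directly
print(find(("?event", "HIT", "X", "Z", "C")))     # deduced from the MOVE and AT facts
print(find(("?event", "HIT", "Y", "Z", "?place")))  # fails: no such hit can be deduced
```

The caller of find does not need to know whether a fact was stored or deduced, which is the sense in which such a system is indifferent to the level at which information is specified.
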

Figure 3. The syntactic tests used by the STN in the program.

Figure 3 shows the extremely simple syntactic tests used by the STN implemented in the program. As the STN analyses a sentence, it compiles a series of semantic goals, and each goal is immediately executed with respect to the information about the history of the universe. An example of the process is summarized in Table 2. Here all the goals are satisfied, and since the program is restricted to yes/no questions, it responds, “yes”. If one of the goals should fail, the program constructs a ‘helpful’ answer. There are two such sorts of answer, depending on the nature of the goal that fails.

Table 2. An illustrative example of the comprehension of a question by the STN depicted in Figure 3.

Input question: “Did X hit Z at (location) C?”

Active node   Test             Action carried out
S1            Auxiliary verb   Jump to next node.
S2            Noun             Store X for use in constructing semantic goal.
S3            Active verb      Compile semantic goal: (EVENT HIT SUBJ X)
                               Execution of goal yields the assertion: [E2 HIT SUBJ X]
S4            Noun             Compile semantic goal: (EVENT HIT OBJ Z)
                               Execution of goal yields: [E2 HIT OBJ Z]
S5            Noun             At fails test; try another arc.
S5            -                Jump to next node.
S6            Preposition      Store at for use in constructing semantic goal.
S6            Noun             C is stored as location name in lexicon.
                               Compile semantic goal: (EVENT HIT AT C)
                               Execution of goal yields: [E2 HIT AT C]
                               All goals satisfied: respond, “Yes”.
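
The interleaving of compilation and execution shown in Table 2 can be sketched as follows. The sketch is hedged in several ways: the assertions about hitting events are simply listed rather than deduced from the data-base, only active questions of the form used in Table 2 are handled, the state names follow the nodes of Table 2 only loosely, and the names ASSERTIONS, LEXICON, execute, and comprehend are hypothetical.

```python
# Assumed, pre-computed assertions about hitting events: (event, verb, role, value).
ASSERTIONS = {
    ("E1", "HIT", "SUBJ", "Y"), ("E1", "HIT", "OBJ", "X"), ("E1", "HIT", "AT", "B"),
    ("E2", "HIT", "SUBJ", "X"), ("E2", "HIT", "OBJ", "Z"), ("E2", "HIT", "AT", "C"),
}
LEXICON = {"DID": "Aux", "HIT": "Verb", "AT": "Prep",
           "X": "Noun", "Y": "Noun", "Z": "Noun",
           "A": "Noun", "B": "Noun", "C": "Noun"}


def execute(goal, events):
    """Run one goal (VERB, ROLE, VALUE) and keep only events consistent with earlier goals."""
    verb, role, value = goal
    found = {e for (e, v, r, x) in ASSERTIONS if (v, r, x) == (verb, role, value)}
    return found if events is None else found & events


def comprehend(question):
    """Compile a semantic goal as soon as its words arrive, and execute it at once."""
    words = question.rstrip("?").upper().split()
    state, subject, verb, prep = "S1", None, None, None
    events, trace = None, []        # events is None until the first goal has run
    for word in words:
        category = LEXICON.get(word)
        goal = None
        if state == "S1" and category == "Aux":
            state = "S2"
        elif state == "S2" and category == "Noun":
            subject, state = word, "S3"             # stored for the next goal
        elif state == "S3" and category == "Verb":
            verb, state = word, "S4"
            goal = (verb, "SUBJ", subject)
        elif state == "S4" and category == "Noun":
            goal, state = (verb, "OBJ", word), "S5"
        elif state == "S5" and category == "Prep":
            prep, state = word, "S6"                # stored for the next goal
        elif state == "S6" and category == "Noun":
            goal, state = (verb, prep, word), "END"
        else:
            raise ValueError(f"no arc for {word!r} at {state}")
        if goal is not None:
            events = execute(goal, events)
            trace.append((goal, sorted(events)))
            if not events:
                return "No", trace                  # the last goal in the trace failed
    return "Yes", trace


answer, trace = comprehend("Did X hit Z at C?")
print(answer)
for goal, events in trace:
    print(goal, events)
```

When a goal fails, the trace identifies which constituent of the question was responsible, which is the information a more informative answer would need.
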


A failure may occur in a goal that corresponds to “given” information in the question, that is to say, the questioner has taken something for granted that is in fact false. For example, the question:

Did Y hit Z at C?

takes for granted that Y hit Z, and would give rise to the answer:

No, Y did not hit Z, Y hit X.

The information for this answer is obtained by noting that a goal corresponding to a “given” constituent has failed and then determining what entity Y actually hit. A failure may also occur in that part of the question that corresponds to “new” information. For example, the question:

Was Z hit by X at A?

would give rise to the answer:

No, Z was hit by X at C.

The information underlying this answer is obtained by generating a new goal to find the location of the event. In reality, the division of sentences into “given” and “new” information is a complicated business involving surface structure, intonation contour, and context. The program takes a very simple view of the matter: the distinction is recognized primarily to illustrate the feasibility of a procedural approach to it, and to explore the use of an STN in producing sentences of different sorts.

The conventional theory of the production of sentences suggests that a speaker decides on what he wants to say, and then decides on how and in what order he wants to say it. This view is certainly implicit in the notion of mapping meaning onto deep structure, and then deep structure onto surface structure. Obviously, a speaker can make a cold-blooded decision about the order in which he wishes to express his thoughts, but such decisions seem to be relatively rare. An alternative hypothesis is that the order in which the content of a message is constructed determines the actual surface order of the sentence. This is the principle by which the program operates: it is able to do so only because an explicit representation of deep structure plays no part in its operation. The STN constructs its answer from the set of assertions that are produced in evaluating the question.
The essential modification required to make it produce sentences is to swop round the tests and actions on each arc so that it now tests for certain sorts of semantic content and acts by producing words in an appropriate order. The process is illustrated in Table 3.

Table 3. An illustrative example of the production of a ‘helpful’ answer by the STN depicted in Figure 3.

The evaluation of the goals set up by the input question, “Was Z hit by X at A?”, gave rise to the following series of assertions: [E2 HIT OBJ Z] [E2 HIT SUBJ X] [E2 HIT AT C], as a result of an initial failure. These are the input to the STN, the initial failure having triggered the response, “No”.

Active node   Test                      Action carried out
S1            Auxiliary verb            Test fails; try another arc.
S1            -                         Jump to next node.
S2            [EVENT VERB ... VAR]      Print value of variable, VAR: Z
S3            [EVENT VERB SUBJ VAR]     Test fails; try another arc.
S3            [EVENT VERB OBJ VAR]      Print passive VERB: was hit
S5            [EVENT VERB SUBJ VAR]     Print by + VAR: by X
S6            [EVENT PREP VAR]          Print PREP + VAR: at C

The complete answer is: No, Z was hit by X at C.
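
The production process traced in Table 3 can be sketched by making each step test for a kind of semantic content and act by emitting words. This is an illustrative sketch only: the tuple encoding of assertions, the small tables of verb forms, and the function name produce are assumptions, and the sketch handles just the simple active and passive patterns discussed here.

```python
PASSIVE = {"HIT": "was hit"}   # hypothetical lexicon of verb forms
ACTIVE = {"HIT": "hit"}


def produce(assertions):
    """Emit words in the order the assertions arrive; the first role fixes the voice."""
    first_role = assertions[0][2]
    if first_role not in ("SUBJ", "OBJ"):
        raise ValueError("the first assertion must name the subject or the object")
    passive = first_role == "OBJ"
    words = []
    for i, (event, verb, role, value) in enumerate(assertions):
        if i == 0:
            words.append(value)                       # print value of variable
            words.append(PASSIVE[verb] if passive else ACTIVE[verb])
        elif role == "SUBJ":
            words.append("by " + value)               # passive by-phrase
        elif role == "OBJ":
            words.append(value)                       # direct object of an active verb
        else:
            words.append(role.lower() + " " + value)  # PREP + VAR
    return " ".join(words)


assertions = [("E2", "HIT", "OBJ", "Z"), ("E2", "HIT", "SUBJ", "X"), ("E2", "HIT", "AT", "C")]
print("No, " + produce(assertions) + ".")  # prints: No, Z was hit by X at C.
```

If the same assertions arrive with the SUBJ assertion first, the sketch returns "X hit Z at C", so the order in which the content is constructed determines the surface order of the sentence without any intermediate representation of deep structure.
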

The appeal of a theory to psychologists is likely to depend on its predictive power. We shall mention just one such aspect of the present model. The STN’s use of semantic cues allows its syntactic component to be considerably simplified. A conventional transformational grammar distinguishes the underlying structure of such sentences as:

The car was pushed by the police station

and:

The car was pushed by the driver.

The first example involves a locative adverbial, whereas the second example in its more salient interpretation involves a passive by-phrase. As Figure 3 shows, the syntactic component of the STN analyses:

X was pushed by B

where “B” denotes a location, in an identical fashion to:

X was pushed by Y

where “Y” denotes an entity. The two sentences are distinguished in setting up their semantic goals by taking into account knowledge about “B” and “Y”. In essence, the program utilizes “selectional restrictions” to aid its interpretation of sentences: it appreciates that a location such as B cannot denote the subject of a pushing, whereas an entity such as Y can play this role.

This conception of a simplified syntactic process is compatible with some recently discovered facts about grammatical transformations within the cycle (e.g. passive, dative movement). Such transformations are structure-preserving, i.e., they do not move constituents to positions that cannot be generated by the rules specifying deep structures (Emonds, 1976). Thus, for example, the passive by-phrase is not a novel structure: it is also specified by the deep structure rules generating locative and other adverbials. The cyclical transformations are also lexically dependent, i.e., their applicability depends on the presence of the appropriate lexical items. Thus, for example, the passive can be applied when the main verb is pay, but it cannot be applied when the main verb is cost. As a result of such observations, Bresnan (1976) has argued that cyclical transformations should be replaced by lexical redundancy rules - a proposal which, as she acknowledges, is particularly compatible with an ATN parsing model. In fact, it would seem to be still more compatible with an STN designed to use its semantic knowledge in analyzing structures with more than one syntactic role.

The use of semantic cues in this way is contrary to the hypothesis that syntactic and semantic processes are autonomous and do not interact (Forster and Olbrei, 1973; Forster, 1976; Garrett, 1976). Recent experimental evidence suggests that syntactic and semantic variables do interact in the way predicted by the model (Steedman and Johnson-Laird, 1977).

Anyone who has written a computer program knows that there are many occasions when it cannot initially be compiled.
Programmers, of course, make mistakes in syntax. Speakers, too, make grammatical mistakes, yet listeners are generally able to understand what they are saying. This observation can only be explained on the assumption that the natural-language compiler is an extremely resourceful device - indeed, much more so than an STN, which is at best a simplified, but perhaps instructive, model of only a small part of linguistic performance. After this glimpse of a specific model, let us turn to some more general issues which are treated in other procedural theories. One such crucial phenomenon is that the same sentence can express different propositions on different occasions of use. The procedural semanticist accordingly recognizes that the same sentence can be compiled into different programs depending on the linguistic and situational contexts in which it is uttered. Text and context often modulate the meaning of sentences in similar ways, but we will consider them separately.


First, the circumstances of an utterance. There are, indeed, numerous linguistic expressions whose interpretation depends on a knowledge of when the sentence they occur in was uttered, the participants in the discourse, and what was going on in the real world. Philosophers call such expressions indexical; linguists call them deictic. Among the more obvious deictic elements are tense, personal pronouns, and certain locative expressions that depend on a speaker’s point of view. The truth or falsity of a sentence such as You are standing in front of a rock obviously depends on the time at which it is uttered, to whom it is addressed, and the relative positions of speaker, addressee, and rock at the time of reference. Even the interpretation of nouns, verbs, and other major categories of words can be influenced by circumstance. What you understand by the assertion I’ve added a new element to the group may well depend on whether the speaker was John Lennon or John von Neumann.

Next, consider linguistic context. It is bound to create problems for any theory that does not look beyond the bounds of sentences. Here, for example, is a simple narrative:

When John went to New York, he saw the Empire State Building. He saw Central Park and he visited the Guggenheim Museum.

The point to bear in mind is the role of tense. The clause he saw the Empire State Building is in the past tense. One function of the past tense is to indicate that the time of an event is in the past with respect to the time of utterance, but not just at any such time: it demands a definite reference time with which it can identify the time of the event. In the example, the reference time is given by the subordinate clause: it is when John went to New York. A whole chain of sentences can be tied to a given reference time, and contextual effects can accordingly spread like a benign infection from one sentence to the next, and so on.
However, a quarantine zone can be established in various ways, as Longuet-Higgins (1972) has pointed out: by changing the tense of the verb or by any other device that introduces a new reference time. For example, if the narrative continues: When he goes to New York next time, or He will go to New York in October, or He has fond memories of the city, then the listener grasps at once that the reference time has changed. Notice that exactly the same sorts of contextual effects are created when the discourse is hypothetical: If John went to New York, he saw the Empire State Building. He saw Central Park and he visited the Guggenheim Museum. Or when the discourse is plainly counterfactual: If John had gone to New York, he would have seen the Empire State Building. He would have seen Central Park and he would have visited the Guggenheim Museum. In both these examples each subsequent clause refers back to the hypothetical reference time established in the first clause.
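
The spread of a reference time from clause to clause, and its interruption by a change of tense, can be caricatured in a few lines of Python. The sketch is a deliberately crude assumption-laden model, not an account of Longuet-Higgins's proposal: clauses are supplied already tagged with a tense, a clause may explicitly introduce a reference time, and any change of tense is taken to start a new one. The names NARRATIVE and assign_reference_times are hypothetical.

```python
# Each clause: (text, tense, reference time it introduces, or None).
NARRATIVE = [
    ("When John went to New York", "past", "John's trip to New York"),
    ("he saw the Empire State Building", "past", None),
    ("He saw Central Park", "past", None),
    ("he visited the Guggenheim Museum", "past", None),
    ("He has fond memories of the city", "present", None),
]


def assign_reference_times(clauses):
    """Attach each clause to the reference time currently in force."""
    results = []
    current, current_tense = None, None
    for text, tense, introduces in clauses:
        if introduces is not None:
            # A when-clause (or similar device) sets up a new reference time.
            current, current_tense = introduces, tense
        elif tense != current_tense:
            # A change of tense breaks the chain: the "quarantine zone".
            current, current_tense = f"new reference time ({tense})", tense
        results.append((text, current))
    return results


results = assign_reference_times(NARRATIVE)
for text, reference in results:
    print(f"{text!r} -> {reference}")
```

The three simple past clauses inherit the reference time established by the when-clause, whereas the final present-tense clause does not.
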

We have described contextual effects as a benign infection. This, of course, is only partly true. An inappropriate reliance on context can have a disastrous effect. You may recollect the White Rabbit’s evidence in Alice in Wonderland with its complete failure to establish antecedents for its pronouns:

They told me you had been to her,
And mentioned me to him. . .
I gave her one, they gave him two,
You gave us three or more;
They all returned from him to you,
Though they were mine before . . .

and so on, quite incomprehensibly. Yet, strangely, sometimes when the most irreparable damage seems to have been done, with whole appendages of a sentence dropping off, context can restore the missing information. For example, an utterance such as:

Did Fred? seems rather odd in isolation. But provided it is preceded by an appropriate context, the missing appendage may be readily regenerated: Did Charlie pass the exam? Yes. Did Fred? Such are some of the problems of deixis and context. What are their solutions? A solution sometimes suggested is to replace any element that depends on deixis or context with an equivalent element that does not. This stratagem turns out to be extraordinarily difficult to accomplish except in the case of such eternal verities as Two plus two equals four where it is not necessary anyway. It typically takes a sentence like He lived there at that time and yields such sentences as: Christopher Marlowe, the English poet and dramatist, resides in the Italian Town of Padova on October 15th 1582, and so on to any degree of detail that might be considered necessary to isolate the sentence from the perils of context and deixis. This example fails at once, however, because the date October 15th 1582 does not refer to a unique day in recorded history. At that time both the Julian and the Gregorian calendars were in operation in different places and were ten days out of phase. Thus, even the interpretation of dates can be deictic. If there is ever any form of inter-stellar communication, this problem will become particularly troublesome.
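The ten-day discrepancy can be checked with a short calculation. The sketch below uses the standard integer formulas for converting a calendar date to a Julian Day Number (a continuous count of days); the function names are illustrative and are not drawn from the paper.

```python
from typing import Tuple


def _shifted(year: int, month: int) -> Tuple[int, int]:
    """Shift the year so that it starts in March, which simplifies leap days."""
    a = (14 - month) // 12
    return year + 4800 - a, month + 12 * a - 3


def jdn_from_gregorian(year: int, month: int, day: int) -> int:
    """Julian Day Number of a date read in the Gregorian calendar."""
    y, m = _shifted(year, month)
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - y // 100 + y // 400 - 32045


def jdn_from_julian(year: int, month: int, day: int) -> int:
    """Julian Day Number of a date read in the Julian calendar."""
    y, m = _shifted(year, month)
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


gregorian_day = jdn_from_gregorian(1582, 10, 15)
julian_day = jdn_from_julian(1582, 10, 15)
offset = julian_day - gregorian_day  # days separating the two readings
```

Under these formulas the written date October 15th 1582 picks out two different days, ten days apart, depending on which calendar the writer was using.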

Model-theoretic semantics, whatever its virtues for handling points of reference and circumstances of use, is not the most natural way for a psychologist to pursue the epidemiology of contextual effects. A more plausible method is to keep a record of reference time, participants in the discourse, recently mentioned individuals and events, and so forth, and to ensure that this record is kept up to date. Theories of this sort, couched in procedural terms, have been advocated by various theorists (see, for example, Longuet-Higgins, 1972; Isard, 1975).

We have mentioned that the compiler must have access to a lexicon, but we have yet to consider what information the lexicon should contain, or how it should be organized. A proponent of model-theoretic semantics has argued that “we should not expect a semantic theory to furnish an account of how any two expressions belonging to the same syntactic category differ in meaning” (Thomason, 1974). This statement may be an exaggeration, but it is unlikely that a model theoretician will be concerned with the difference in meaning between, say, table and chair, or between move and push. Such matters are very much the business of psychologists, and we have recently advanced some detailed arguments in favour of a procedural approach to them (Miller and Johnson-Laird, 1976). We took the view that a lexical concept interrelates a word, rules governing its syntactic behaviour, and a schema. A schema is made up from both functional and perceptual information, and may well include information that has no direct perceptual consequences. Moreover, lexical concepts are interrelated to one another. They are organized into semantic fields that have a conceptual core which reflects a deeper conceptualization of the world and integrates the different concepts within the semantic field. One purpose of such an organization is to create a taxonomy that enables entities within the field to be correctly categorized and readily named.
Consider how we might represent the meaning of a simple word like chair in order to fulfill such demands. The simplest possible way would be to include a list of exemplars of all possible chairs, or rather to specify a predicate that would do so. A logician postulates the existence of such predicates by fiat; a psychologist must specify the details of a procedure that will square with what is known about the processes of perceptual identification. It is feasible that one recognizes a chair of a familiar shape by matching a mental representation of a three-dimensional prototype against the visual image, perhaps in the manner described by Marr and Nishihara (1976). However, “chairness” is not a simple property. Some chairs are more prototypical than others; a simplistic hypothesis would fail to do justice to the patent absurdity of some of Carelman’s (1969) more exotic creations. A procedural definition of chair, then, is likely to involve a moderately complicated schema, which will include information about the function of chairs, if only to make it possible to identify novel designs outside the perceptual repertoire of the observer. One can easily imagine some such routine that would test whether the entity in question is a stable manmade object having as a proper part a surface intended to support someone sitting on it and as a proper part another surface intended to provide a rest for the person’s back.

Of the many problems raised by this approach to the lexicon, just two will be considered here: the problem of knowledge, and the problem of selectional restrictions. It turns out that these two problems are interrelated and both hinge on the concept of possibility (see Miller, In press). In specifying the function of some artefact such as a chair, it is evident that one passes from the world of what is actual to the world of what is possible. An object serves the function of a chair because, among other things, it is possible to sit on it and to rest against its back. How, then, are we to accommodate the concept of possibility? A model-theoretician postulates a set of possible worlds (of which the actual world is a member) and specifies an accessibility relation between them. It then becomes feasible to carry out a semantic analysis of the modal notions that occur in natural language. Thus, to borrow an example from David Lewis (1973), the counterfactual sentence, If kangaroos had no tails, they would topple over means roughly: in any possible world in which kangaroos have no tails, and which resembles the actual world as much as kangaroos having no tails permits it to, kangaroos topple over. The statement of such truth-conditions is an exercise very remote from the way human beings actually evaluate counterfactuals. Once again, the logician is postulating the existence of a function by fiat, and it is the psychologist’s task to specify the details of its operation.
One obvious constraint is that with the exception of a few highly restricted domains such as tic-tac-toe no human being is capable of considering, let alone evaluating, anything more than a very restricted subset of possible alternatives to a given state of affairs. One constraint on the subset to be considered is obviously an individual’s general knowledge relevant to the given state of affairs. Another constraint is the nature of the heuristic that enables certain inferences to be drawn from a combination of general knowledge and specific circumstances. Such a heuristic is necessary if only because logic cannot determine which inference should be drawn, but only whether or not a given inference is valid. What we really need is an account of the mental processes - the heuristics - that allow inferences about whether something is possible. Let us consider a specific example: suppose someone asks you, Is it possible for you to touch me? One part of your general knowledge that is relevant might be
expressed by the generalization: if you want to touch someone, stand next to them and raise your arm in such a way that it makes contact with their body. Another relevant item of information will surely be that you are able to raise your arms in the appropriate manner. These two items of knowledge could be represented in memory in the following way:

[LIFTARM(EGO)]

which is a simple assertion that you are able to lift your arm in the appropriate way, and by the following PLANNER-like procedure:

(GOAL(TOUCH(X,Y))
GOAL(NEXT(X,Y)) AND GOAL(LIFTARM(X)))

This says roughly that if the goal is for X to touch Y, then there are two subgoals to be achieved: first, ensure that X is next to Y, and then ensure that X lifts an arm in an appropriate way. The ordering of these goals is important: it is no use raising your arm if you are not next to Y. Let us suppose that you have appraised the situation and that you are next to the speaker. This fact will be represented by the following item in your updatable record of circumstances - we can ignore here the apparatus for dealing with time:

[NEXT(EGO,SPEAKER)]
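The assertions and the goal procedure above can be mimicked in a few lines of Python. This is a minimal sketch and not PLANNER itself: the names `knowledge`, `circumstances`, `procedures`, `match`, `substitute` and `derivable` are assumptions introduced for illustration, and pattern matching and backtracking are reduced to the bare minimum. A goal counts as derivable if it is asserted outright, or if a matching procedure exists whose subgoals are all derivable in the stated order.

```python
from typing import Dict, List, Optional, Set, Tuple

Fact = Tuple[str, ...]  # e.g. ("NEXT", "EGO", "SPEAKER")

VARIABLES = {"X", "Y"}  # symbols treated as pattern variables

# General knowledge: assertions that hold regardless of the situation.
knowledge: Set[Fact] = {("LIFTARM", "EGO")}

# Updatable record of circumstances (time is ignored, as in the text).
circumstances: Set[Fact] = {("NEXT", "EGO", "SPEAKER")}

# Goal procedures: a goal pattern mapped to an ordered list of subgoals.
procedures: Dict[Fact, List[Fact]] = {
    ("TOUCH", "X", "Y"): [("NEXT", "X", "Y"), ("LIFTARM", "X")],
}


def match(pattern: Fact, goal: Fact) -> Optional[Dict[str, str]]:
    """Return variable bindings if the pattern matches the goal, else None."""
    if len(pattern) != len(goal) or pattern[0] != goal[0]:
        return None
    bindings: Dict[str, str] = {}
    for p, g in zip(pattern[1:], goal[1:]):
        if p in VARIABLES:
            if bindings.setdefault(p, g) != g:
                return None
        elif p != g:
            return None
    return bindings


def substitute(term: Fact, bindings: Dict[str, str]) -> Fact:
    """Replace variables in a term by their bound values."""
    return tuple(bindings.get(symbol, symbol) for symbol in term)


def derivable(goal: Fact) -> bool:
    """True if the goal is asserted or all subgoals of a matching procedure hold."""
    if goal in knowledge or goal in circumstances:
        return True
    for pattern, subgoals in procedures.items():
        bindings = match(pattern, goal)
        # all() checks the subgoals left to right and stops at the first failure,
        # so NEXT is tested before LIFTARM.
        if bindings is not None and all(
            derivable(substitute(sub, bindings)) for sub in subgoals
        ):
            return True
    return False
```

With the record of circumstances as given, `derivable(("TOUCH", "EGO", "SPEAKER"))` succeeds; removing the NEXT assertion from `circumstances` makes the first subgoal fail, so the arm-lifting subgoal is never consulted.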

The speaker’s question was: Is it possible for you to touch me? which might be compiled as the following goal:

POSSIBLE(TOUCH(EGO,SPEAKER))

When this program is executed, the instruction POSSIBLE would elicit a procedure that would try everything it could in order to derive the goal, and, in particular, it would search all the GOAL-procedures in the knowledge base for ones that match the pattern of the desired goal. In this way it would discover the procedure we defined earlier, and the values EGO and SPEAKER would be assigned to the local variables in that procedure. This would create the subgoal: FIND(NEXT(EGO,SPEAKER)). The required assertion is in the updatable record of circumstances. Similarly, the second subgoal is satisfied by the assertion in the knowledge base that you are able to lift your arms. Hence, the test is satisfied and you would be able to answer “Yes”. In some situations Is it possible for you to touch me? would be understood, not as a yes/no question, but as an indirect request for the addressee to touch the speaker. Such subtleties can be handled expeditiously by procedural systems but to develop that aspect of the theory would require too long a digression.

There is a price to be paid for representing knowledge directly in a procedural form: the system demands a separate statement of each different inferential use of any given item of general knowledge (Winograd, 1975). For example, the same item of general knowledge that we have already utilized could also be used to infer that someone must be next to you because they just touched you. But, for this inference we would need to add a new procedure:

(GOAL(NEXT(X,Y))
GOAL(TOUCH(X,Y)) OR GOAL(NEAR(X,Y) AND NOT(BETWEEN(X,Z,Y))) OR . . .)

In order to avoid such redundancy, it is necessary to postulate three levels of representation. At the bottom level are items of general knowledge expressed as assertions. At the top level are powerful procedures that take assertions as arguments and construct specific procedures out of them. In this way, it would be feasible to express the generalization if X is next to Y and X raises an arm appropriately then X touches Y just once in the knowledge base, and top-level procedures would construct specific procedures out of it. As always, it is easy to draw up the menu, but rather difficult to concoct the recipes; yet it is comforting that this same division between assertions and procedures seems to be necessary in specifying information in the lexicon (Miller and Johnson-Laird, 1976).

Despite our ignorance of how such a system should be implemented, it is feasible to specify a general formulation of a procedural approach to possibility (see Miller, In press): F is possible in the circumstances C if there is a set of procedures K corresponding to a body of organized knowledge such that, if the circumstances C are taken in conjunction with K, F is derived. There are a number of pertinent psychological observations to be made about such a formulation.
First, a clear distinction must be maintained between general knowledge and information about transient circumstances. This distinction has already been implicitly established in the decision to keep an updatable record of the circumstances of utterances. Second, a clear distinction must be made between the derivation of F and its actual execution. It is one thing to say that one can touch someone, and quite another to do so. The imperative “Touch me!” would compile the program: ACHIEVE(TOUCH(EGO,SPEAKER)). If you decide to comply, then running this
program would convert subgoals into states of affairs to be achieved. Three sets of circumstances are relevant to the construction of this program: (a) if you are already touching the speaker, the request was probably not felicitous - a higher-order procedure might lead to the response I already am touching you; (b) if you are not touching the speaker but are next to him, the program involves nothing more than simply lifting your arm, as already described; and (c) if you are not near enough to reach him, the subgoal: ACHIEVE(NEXT(EGO,SPEAKER)) must be evaluated, and your store of knowledge would be consulted for information about procedures required to achieve this goal. If the circumstances are not suitable for a direct execution of a procedure to move closer, then a further sub-goal might be created to modify those circumstances. If this exploration of possibilities eventually leads to a compound procedure all of whose components are known to be possible, you can conclude that it is possible to comply. Let us refer to such a possible procedure as a ‘plan’ (Miller, Galanter, and Pribram, 1960). Whether or not you wish to comply - whether you decide to execute the plan - would depend on other circumstances extraneous to your comprehension of the command. The point, however, is that the process of determining whether or not something is possible corresponds to the development of a plan for achieving it. Conversely, if you can construct a plan for achieving something, you will believe that its achievement is possible.

This formulation is not completely adequate as a description of how possible and its cognates are used in ordinary speech. One can certainly imagine a situation in which a person is unable to construct an appropriate plan for achieving a particular goal - either he lacks the information or fails to make the right inference - yet nothing in his knowledge base leads him to conclude that it would be impossible.
In such situations people are usually inclined to say that the goal is possible or, if they are being cautious, that it may be possible - they do not know any reason why it is not possible. Moreover, one can also imagine a situation in which it is necessary to achieve the goal, in the sense that the person cannot find any circumstances in which it could be avoided. In ordinary language it is not customary to say that something which is unavoidable is merely possible - it sounds odd to say you can obey the laws of gravity when you must obey them. These refinements do not pose real difficulties for procedural proposals, although they do lead to some complications in the formulation of possible: if a goal cannot be avoided under any circumstances, it is not possible, but necessary; if a goal cannot be achieved under any circumstances, it is impossible; if a goal can be achieved under some circumstances but not under others, it is possible; and if no circumstances are known under which it would be impossible, it is possibly possible.
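The four-way classification just given can be written out as a small decision procedure. The sketch below is one possible reading, not the author's formulation: the `Verdict` labels and the function `modal_status` are hypothetical, each verdict stands for the outcome of trying to plan for the goal in one set of circumstances the person has considered, and two cases the text leaves open are resolved by labelled assumptions.

```python
from enum import Enum
from typing import Iterable


class Verdict(Enum):
    ACHIEVED = "a plan was found"
    BLOCKED = "known to be unachievable"
    UNKNOWN = "no plan found, but no known obstacle"


def modal_status(verdicts: Iterable[Verdict]) -> str:
    """Classify a goal from the verdicts reached in each circumstance considered."""
    seen = set(verdicts)
    if not seen:
        raise ValueError("at least one set of circumstances must be considered")
    if seen == {Verdict.ACHIEVED}:
        return "necessary"          # cannot be avoided under any circumstances
    if seen == {Verdict.BLOCKED}:
        return "impossible"         # cannot be achieved under any circumstances
    if Verdict.ACHIEVED in seen:
        # Assumption: a mix of ACHIEVED and UNKNOWN also counts as plainly possible.
        return "possible"
    if Verdict.BLOCKED not in seen:
        return "possibly possible"  # no known circumstances rule the goal out
    # Assumption: a mix of BLOCKED and UNKNOWN is not covered by the text.
    return "undetermined"
```

For example, `modal_status([Verdict.ACHIEVED, Verdict.BLOCKED])` returns "possible", whereas a person who has found neither a plan nor an obstacle obtains "possibly possible".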

This phrasing is deliberately intended to parallel the kind of formulation a psychologically motivated possible-worlds theorist might adopt, but with circumstances-under-which-a-person-knows-what-to-do substituted for possible worlds. The purpose here is not to refute any logical formulation, but rather to propose some plausible psychological mechanisms whereby the obvious logical relations might be realized. If one imagines that a person who has found his goal unattainable under the existing circumstances continues to search through his stored knowledge for circumstances under which it would be attainable, the output of a successful search would be a set of circumstances under which he would know how to proceed, and that output sets a new subgoal. The ontological commitment of the procedural approach is more conservative, and room is left for a person to fail to formulate a plan for achieving his goal. This latter point is of considerable importance, since human failures in solving complex problems are notorious and must be accounted for in any plausible psychological theory. Several sources of such failures are apparent in the procedural formulation. An individual may lack the relevant general knowledge. He may misperceive or be misinformed about the actual circumstances. Most important of all, he may have the relevant information but be unable to derive a plan. The available heuristics that enable human beings to draw conclusions from premises are not such as to guarantee that derivations will always be found even when they exist. There are a number of classic experimental demonstrations of subjects’ inability to grasp what is possible. Maier (1931) showed, for instance, that people are particularly inept in appreciating that an object with one well known function - a pair of pliers, say - might be used for an entirely different purpose - as the bob on a pendulum.
The evaluation of possibilities plays as important a role in procedural semantics as does the evaluation of truth. When sentences are compiled as programs, the obvious question to ask is whether those programs are executable - whether it is possible to carry them out. It is natural, therefore, that questions of possibility should arise at many points in the system. We have already indicated that the intension of the word chair must include a statement of function along with a description of perceptual properties; even at this level it is necessary to evaluate whether the function is possible with a candidate instance as well as whether the perceptual description is true of it. These lexical questions of possibility are not limited to nouns, of course. They also arise with many verbs. Consider, for example, a verb like lift which, in one of its uses, means that the logical subject of the verb does something that causes the logical object of the verb to travel upward. A causal relation between the entities denoted
by the subject and object is a critical component of this meaning of lift, and the evaluation of such causal relations raises more questions of possibility. If what the subject did caused the object to travel upward, then it must have been the case that, if the subject had not acted as it did, the object would not have travelled upward, other things remaining the same. This latter paraphrase is, of course, a counterfactual, and counterfactuals live in the realm of possibility. If one encounters the sentence The mouse lifted the elephant, considerations of possibility immediately assign it to a realm of toy animals or animated cartoons. But how do such considerations operate? Let us take the following sentence as an example: The Smiths saw the Rocky Mountains flying to California. The problem is to explain how one knows immediately that it is not the Rockies that were flying to California. The fact that mountains do not fly is so obvious that you may feel that there could not be any difficulty here. But if you go through the standard account of how the sentence is to be disentangled, you will see that there really is a problem.

The standard theory that we owe to Katz and Fodor (1963) would suggest that the verb to fly imposes a semantic restriction on its subject argument, that is to say, it takes only certain sorts of subjects. Such rules are generally called ‘selectional restrictions’. A transparent example of a selectional restriction is the verb to love, which plainly demands that its logical subject denote, at the very least, an animate being. We want to place an analogous restriction on the underlying subject of fly. What should it be? We could start with a simple list: birds can fly, planes can fly, bats, wasps, bees, flies, kites, rockets, locusts, flying fish, and so on, can all fly. But then, of course, plates can fly through the air, or any other sort of object a person chooses to throw.
It appears that there is a distinction to be drawn between those objects that can fly of their own volition or are self-propelled and those that are propelled by other forces. This distinction is all very well, but it does not explain how we grasp that mountains do not fly. A mountain is not a bird or a self-propelled vehicle, but how do we categorize the class of entities that can be thrown, carried, or projected through the air? We might start marking all relevant items in the lexicon, but this strategy does not work because it depends on factual knowledge: a plane can fly, but a wrecked plane cannot. And what about people - can people fly? Well, it depends what is meant: they cannot fly by flapping their arms (outside fairy stories, that is), but of course it was the Smiths who were flying to California. An alternative answer, perhaps, is that we simply have as a fact in our knowledge base that mountains do not fly. The trouble with this approach is that it seems to require enormous amounts of negative information to be laid down in memory: walls don’t fly, lawns don’t fly, and
so on. Multiply this example by all other properties and relations for all selectional restrictions and the system becomes unthinkable. Negative information is probably not stored in the lexicon except for a few obvious correctives to misconception such as that spiders are not insects, or whales are not fish. It seems to be a mistake to concentrate on mountains, so let us reconsider flying. Its meaning can be decomposed into the following components:

FLY(X): (THROUGH(TRAVEL))(X,AIR)

The critical concept is plainly TRAVEL, since mountains cannot swim, either. What we really need to evaluate is the expression: POSSIBLE(TRAVEL(MOUNTAINS)). Thus, once again, possibility enters the semantic system. There are probably a number of items of general knowledge that will suffice to determine that mountains cannot fly: a mountain is a natural elevation of the earth’s surface having considerable mass (American Heritage Dictionary); massive parts of the earth’s surface are fixed relative to other such parts except in the case of earthquakes; if X is a fixed part of Y then X travels when Y travels, but X does not travel with respect to Y. This analysis must be on the right lines if only because it proves the opposite of what we want: mountains can fly, because they fly through space as part of the earth. The inferential system must establish that, if X is flying to California, then X is traveling with respect to a fixed part of the earth’s surface; consequently, X cannot be a fixed part of the earth and therefore (barring earthquakes) X cannot be the Rocky Mountains. Notice that if the circumstances C were to involve massive earthquakes, or other similar world-shattering or mountain-unfixing events, they could modulate the evaluation of the function and allow the other interpretation of the participle.

It may not be readily apparent that this device of introducing possibility into the assignment of values to a verb’s variables has a profound effect on the status of selectional restrictions. An extreme statement of this effect would be to say that selectional restrictions become totally unnecessary in a way that meets Savin’s (1973) criticisms of them as theoretical entities. Savin pointed out that selectional restrictions seem to be arbitrary excrescences tagged on to the semantic representations of lexical items. They are also fixed and determinate.
That is to say, if they indicate that there is something anomalous about A chair loved a table, special measures will have to be taken to guard against the same evaluation of the perfectly sensible question: Could a chair love a table? Their determinate nature also insulates them from the effects of circumstance, which makes it difficult to explain why, in a context where the dish ran away with the spoon, there may be nothing
anomalous about a chair falling in love with a table. The inferential mechanism that has been introduced seems to eliminate the need for selectional restrictions: their apparent effects would arise naturally from the semantic decomposition of verbs. Thus, for example, if one sense of the verb to lift is to do something that causes some other thing to travel upward, then a putative subject is tested to determine whether it could do something, and a putative object is tested to determine whether it could travel upward. Hence, the selection of arguments for a verb is a direct consequence of the components of its meaning. Likewise, the process is not fixed and determinate but dependent on the circumstances of an utterance.

The total elimination of selectional restrictions, however, is too extreme. It would be very inefficient to keep having to make the same inferences over and over again. A more sensible arrangement would be to keep a record of the classes of noun phrase that are invariably accepted as the values of a verb’s arguments. In this way a child might come to learn genuine selectional restrictions and to appreciate that ordinarily, for example, love demands an animate subject and hence that there is something very odd about a sentence like Sincerity loves Richard Nixon. Thus, the present theory might well be construed not so much as replacing the conventional account of selectional restrictions as providing an explanation of how they could be learned in the first place, and how it is that people can cope when a sentence or a circumstance is sufficiently unconventional to fall outside the scope of what has been learnt. One other cautionary note should be sounded: there may well be certain restrictions on the arguments of verbs that are stylistic rather than semantic, and so could not be inferred in terms of possibilities based on general knowledge.

So much for our brief glimpse into the operation of a procedural semantics.
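To make the preceding account of selectional restrictions concrete, here is a minimal sketch under several loud assumptions: the knowledge sets, the circumstance labels, and the functions `can_travel`, `can_do_something` and `acceptable` are all invented for illustration. For simplicity the record stores individual fillers, whereas the text speaks of classes of noun phrase. Each verb sense lists the tests that its decomposition imposes on its arguments, and fillers accepted in ordinary circumstances are recorded so that the inference need not be repeated.

```python
from typing import Callable, Dict, Set, Tuple

Circumstances = Set[str]

# Hypothetical fragments of general knowledge.
FIXED_PARTS_OF_EARTH: Set[str] = {"Rocky Mountains"}
CAN_ACT: Set[str] = {"Smiths", "mouse"}


def can_travel(entity: str, circumstances: Circumstances) -> bool:
    """A fixed part of the earth cannot travel relative to it, barring upheavals."""
    if entity in FIXED_PARTS_OF_EARTH:
        return "massive earthquake" in circumstances
    return True


def can_do_something(entity: str, circumstances: Circumstances) -> bool:
    """Agents can act; in a fairy-story context anything can."""
    return entity in CAN_ACT or "fairy story" in circumstances


# Tests imposed on each argument by the components of a verb's meaning.
DECOMPOSITION: Dict[str, Dict[str, Callable[[str, Circumstances], bool]]] = {
    "fly": {"subject": can_travel},
    "lift": {"subject": can_do_something, "object": can_travel},
}

# Record of fillers already accepted in ordinary circumstances.
accepted: Set[Tuple[str, str, str]] = set()


def acceptable(verb: str, role: str, entity: str, circumstances: Circumstances) -> bool:
    """Test a putative argument, consulting the learned record first."""
    key = (verb, role, entity)
    if key in accepted:
        return True
    ok = DECOMPOSITION[verb][role](entity, circumstances)
    if ok and not circumstances:
        # Only verdicts reached without special circumstances are recorded.
        accepted.add(key)
    return ok
```

In this toy version the Smiths pass the test for the subject of fly and are recorded, while the Rocky Mountains pass only when the circumstances include a massive earthquake, and that exceptional verdict is not recorded.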
The emphasis has been on the process of compiling because it raises the question of a procedural analysis of possibility - an approach that should provide an instructive contrast with other modes of semantic analysis. The contrast between model-theoretic semantics and procedural semantics in many ways resembles the contrast within Artificial Intelligence between those who favour a purely declarative knowledge base and those who favour a purely procedural knowledge base. The literature contains extreme examples of the declarative approach. Thus, McCarthy and Hayes (1969) argued at one time for the representation of all knowledge by a set of assertions. These assertions might be mobilized and utilized in problem-solving by some powerful, uniform, proof procedure. At the other extreme, Hewitt (cited by Winograd, 1975) has recently argued for an almost complete representation of knowledge in terms of procedures (or Actors as he now calls them). Neither of these two extremes seems to us particularly plausible as a
basis for psychological theorizing. Hence, we have attempted to argue for a declarative knowledge base coupled to procedures that can convert its constituents into procedures. One indirect consequence of such an approach is that it becomes possible to accommodate an idea that a number of theorists have begun to urge (e.g. Fodor, Fodor, and Garrett, 1975). It may be a mistake to consider that in the normal course of events the psychological representation of the meaning of a sentence exists as a single integral entity. Its integrity is threatened from a number of directions. First, many constituents of the sentence are likely to be directly mapped into pre-existing knowledge. Second, as Isard (1975) has emphasized, it may be necessary to run a program corresponding to one part of a sentence in order to compile a program for the rest of the sentence. This process could well occur in understanding a question such as: Is that man over there the Archbishop of Canterbury? A listener may decide not to look over there, that is, he may decide not to run the program required to find that man over there. But if he does go along with the speaker, then once he has identified the relevant individual he is free to compile the rest of the question and perhaps to attempt to answer it. Finally, it should be emphasized that procedural semantics is more a methodology than a specific theory. There is considerable disagreement among its practitioners even about such fundamental issues as whether or not there are semantic primitives into which meanings of words are decomposed. Nevertheless, the procedural method seems to be particularly suitable for developing psychological theories about the meanings of words and sentences. It has two principal advantages. 
First, theories lying within its conceptual framework can be readily modeled in the form of computer programs: nothing quite so concentrates the mind as having to build such a model, and the process often leads to new ideas about the theory itself or how it should be tested. Second, it forces the theorist to consider processes. This is a signal virtue in comparison to model-theoretic and linguistic approaches to meaning that tend naturally to emphasize structure at the expense of process. Psychological processes take place in time, and so, too, do the operations of computers. Perhaps the metaphor can be pushed no further than that, but there does not seem to be any other equally viable alternative.

References

Anderson, D. B. (1972) Documentation for LIB PICO-PLANNER. School of Artificial Intelligence, Edinburgh University.
Bresnan, J. (1976) Towards a realistic model of transformational grammar. Paper presented at the Convocation on Communications, M.I.T.
Carelman, J. (1969) Catalogue d’objets Introuvables. Paris, Andre Balland.
Davies, D. J. M. and Isard, S. D. (1972) Utterances as programs. In D. Michie (ed.), Machine Intelligence 7. Edinburgh, Edinburgh University Press.
Emonds, J. E. (1976) A Transformational Approach to English Syntax: Root, Structure-preserving, and Local Transformations. New York, Academic Press.
Fodor, J. A., Bever, T. G. and Garrett, M. F. (1974) The Psychology of Language. New York, McGraw-Hill.
Fodor, J. D., Fodor, J. A., and Garrett, M. F. (1975) The psychological unreality of semantic representations. Ling. Inq., 6, 515-531.
Forster, K. I. (1976) The autonomy of syntactic processing. Paper presented at the Convocation on Communications, M.I.T.
Forster, K. I. and Olbrei, I. (1973) Semantic heuristics and syntactic analysis. Cog., 2, 319-347.
Frege, G. (1892) Über Sinn und Bedeutung. Zeitschrift für Philosophie und philosophische Kritik, 100, 25-50. Translated by M. Black as “On sense and reference” in P. Geach and M. Black, Translations from the Philosophical Writings of Gottlob Frege. Oxford, Blackwell, 1952.
Garrett, M. F. (1976) Word perception in sentences. Paper presented at the Convocation on Communications, M.I.T.
Hewitt, C. (1971) Description and theoretical analysis (using schemas) of PLANNER: a language for proving theorems and manipulating models in a robot. Ph.D. dissertation, M.I.T.
Isard, S. (1975) What would you have done if...? J. Theoret. Ling., 1, 233-255.
Johnson-Laird, P. N. (1970) The perception and memory of sentences. In J. Lyons (ed.), New Horizons in Linguistics. Harmondsworth, Middx., Penguin.
Johnson-Laird, P. N. (In press) Psycholinguistics without linguistics. In N. S. Sutherland (ed.), Tutorial Essays in Psychology, Vol. I. Hillsdale, N.J., Erlbaum.
Katz, J. J. and Fodor, J. A. (1963) The structure of a semantic theory. Lang., 39, 170-210.
Kay, M. (1975) Syntactic processing and functional sentence perspective. In R. C. Schank and B. L. Nash-Webber (eds.), Theoretical issues in natural language processing: an interdisciplinary workshop in computational linguistics, psychology, linguistics, and artificial intelligence. Supplement, pp. 12-15, M.I.T.
Lewis, D. (1973) Counterfactuals. Cambridge, Mass., Harvard University Press.
Longuet-Higgins, H. C. (1972) The algorithmic description of natural language. Proc. Roy. Soc. Lond. B., 182, 255-276.
Maier, N. R. F. (1931) Reasoning in humans. II. The solution of a problem and its appearance in consciousness. J. Comp. Psychol., 12, 181-194.
Marr, D. and Nishihara, H. K. (1976) Representation and recognition of the spatial organization of three-dimensional shapes. A.I. Laboratory, Memo 377. Cambridge, Mass., M.I.T.
McCarthy, J. and Hayes, P. (1969) Some philosophical problems from the standpoint of artificial intelligence. In D. Michie (ed.), Machine Intelligence 4. Edinburgh, Edinburgh University Press.
Miller, G. A. (In press) Practical and lexical knowledge. In E. Rosch and B. Lloyd (eds.), Human categorization. Hillsdale, N.J., Erlbaum.
Miller, G. A., Galanter, E. and Pribram, K. (1960) Plans and the structure of behavior. New York, Holt, Rinehart, and Winston.
Miller, G. A. and Johnson-Laird, P. N. (1976) Language and perception. Cambridge, Mass., Harvard University Press.
Reddy, D. R., Erman, L. D., Fennell, R. D. and Neely, R. B. (1973) The HEARSAY speech understanding system: an example of the recognition process. Third International Joint Conference on Artificial Intelligence. Stanford, Stanford Research Institute, 185-193.
Savin, H. B. (1973) Meaning and concepts: a review of Jerrold J. Katz’s Semantic Theory. Cog., 2, 213-238.
Schank,

R. C. (1972) Conceptual dependency: a theory of natural language understanding. Cog. Psychol., 3, 552-631. Stcedman, M. J. and Johnson-Laird, P. N. (1977) A programmatic theory of linguistic performance. In P. Smith and R. Campbell (eds.), Proceedings ofthe Stirling Conference on Psycholinguistics. London, Plenum. In press. Tarski, A. (1936) Der Wahrheitsbegriff in den formalisiertcn Sprachen Stud. Phil., I, 261-405. Thomason, R. H. (1974) Introduction to R. H. Thomason (cd.), Formal philosophy: selected papers of Richard Montague. New Haven, Yale University Press. Thorne, J., Bratley, P. and Dewar, H. (1968) The syntactic analysis of Isnglish by machine. In D. Michie (ed.), Machine Intelligence 3. Edinburgh, Edinburgh University Press. Wanner, E. and Maratsos, M. (1975) An augmented transition network model of relative clause comprehension. Mimeo, Harvard University. Winograd, T. (1972) Understanding natural language. New York, Academic Press. Winograd, T. (1975) F’ramc rcprescntations and the declarative/procedural controversy. In D. G. Bobrow and A. Collins (cds.), Representation and understanding: studies in cognitive science. New York, Academic Press. Woods, W. A. (1970) Transition network grammars for natural language analysis. Communications of the Association for Computing Machinery, 13, 591-606. Woods, W. A. and Makhoul, J. (1973) Mechanical inference problems in continuous speech understanding. Third International Joint Conference on Artificial Intelligence. Stanford, Stanford Research Institute, 200-207.

Résumé

Le but de cet article est de présenter le schéma d’une théorie sémantique fondée sur l’analogie entre le langage naturel et le langage de programmation de l’ordinateur. On décrit un modèle unique de compréhension et de perception de phrases pour illustrer la métaphore centrale “compiler et exécuter” qui sous-tend les sémantiques des méthodes. On ré-analyse à la lumière de la théorie des méthodes (procedural theory) le rôle de la connaissance générale interne au lexique et du mécanisme arbitrant les restrictions sélectives.

Cognition, 5 (1977) 215 - 233 2
© Elsevier Sequoia S.A., Lausanne - Printed in the Netherlands

Referential relativity: Culture-boundedness of analytic and metaphoric communication*†

W. PATRICK DICKSON
University of Wisconsin

NAOMI MIYAKE and TAKASHI MUTO
University of Tokyo

Abstract

Japanese college students were presented with an array of 16 abstract referents and asked to describe each figure so that another student could correctly identify it in the array. Each referent was described twice, once in a metaphoric and once in an analytic mode. A test set of 64 of the descriptions was translated into English. Two groups of Japanese college students responded to the Japanese version and two groups of American college students responded to the English version. The students were asked to choose the referent in the set of 16 which each description best fit. The two Japanese groups showed highly similar patterns of response to the descriptions, as did the two American groups, but comparisons between the two cultures revealed very different patterns of response. Both metaphoric descriptions and analytic descriptions produced different patterns of response in the two cultures. These results are interpreted as evidence of ‘referential relativism’, which is the effect of language and culture upon referential meaning. The use of referential communication tasks to measure the culture-boundedness of communication is discussed.

*The first author wishes to thank the Japan Society for the Promotion of Science which provided the grant under which he was a Visiting Scholar at the University of Tokyo during 1974 - 75, and Dr. Hiroshi Azuma who sponsored his visit. This study is one of several carried out by members of a collaborative workshop for research on communication skills which met during this time. Members of the workshop included, in addition to the authors, Giyoo Hatano, Yoshio Miyake, Yuriko Oshima, Naomi Sakata, and Naoki Ueno. All members of the workshop were active in designing and carrying out this experiment. The workshop was an intellectually stimulating experience for all participants. Thanks also go to M. J. Subkoviak for suggestions concerning data analysis. Requests for reprints should be sent to W. Patrick Dickson, Child and Family Studies, University of Wisconsin, Madison, Wisconsin 53706.

†We wish to thank R. M. Krauss for furnishing the complete set of abstract figures from which the 16 figures were selected.

This is a study of the effects of language and culture on referential communication. Three points are made in this introduction. First, recent work demonstrating considerable universality in the perception, memory, and naming of color does not preclude the possibility of linguistic and cultural effects in other domains. Second, effects of language and culture on thought within a culture bear no necessary relationship to effects of language and culture on communication between cultures. Third, referential communication tasks with translated messages may be used to study the effects of language and culture on communication.

Brown (1976) has given a lively account of one line of research on the Whorf-Sapir hypothesis in the color domain. He concludes his review by saying, “The fascinating irony of this research is that it began in a spirit of strong relativism and linguistic determinism and has now come to a position of cultural universalism and linguistic insignificance” (p. 152). But this conclusion is based primarily upon research on color, and “far from being a domain well suited to the study of the effects of language on thought, the color space would seem to be a prime example of the influence of underlying perceptual-cognitive factors on the formation and reference of linguistic categories” (Heider, 1972, p. 20).

Given the diverse interpretations placed upon the Whorf-Sapir hypothesis, it is important that the reader understand that this paper is primarily concerned with the effects of language upon communication. Whether the Whorf-Sapir hypothesis was meant to apply directly to communication is subject to conflicting opinions, even between the foreword and introduction to the collection of Whorf’s (1956) writings. In the foreword Stuart Chase states that Whorf “flatly challenges” the view that “a line of thought expressed in any language could be translated without loss of meaning into any other language” (p. vii). A few pages later in the introduction, John Carroll gives a different interpretation: “Surely, at any rate, it would have been farthest from Whorf’s wishes to condone any easy appeal to linguistic relativity as a rationalization for a failure of communication between cultures or between nations” (p. 27). Whatever the “true” interpretation of the intention of Benjamin Whorf, suffice it to say here that his writings have had considerable influence upon research concerned with the question of possible effects of linguistic relativity upon communication.

In absolute form this question may be phrased, “Can we never really understand or communicate with speakers of a language quite different from our own because each language has molded the thought of its people into mutually incomprehensible world views?” (Rosch, 1974). Of course, almost no one (including Rosch) really expected such an absolute effect. Indeed, such all-or-none phrasing may divert us from the
more researchable question: To what extent and in which domains is communication between cultures likely to be inaccurate or inefficient? Anything which can be expressed in one language probably can be expressed in another language, at least for referential expressions, but “what is expressed easily, rapidly, briefly, uniformly, perhaps obligatorily in one language may be expressed in another only by lengthy constructions that vary from one person to another, take time to put together, and are certainly not obligatory” (Brown, 1976, p. 129). The assumption that different world views would preclude accurate communication between those world views deserves challenge. Although differences in language or thinking increase the probability of miscommunication, effects of language on thought bear no necessary relationship to effects of language on communication. Two cultures (or individuals) may have different world views (or more narrowly, different referential meanings for given expressions), yet be able to communicate accurately. Conversely, identical world views will not be mutually comprehensible in the absence of communication systems to map them onto each other. An awareness of the lack of congruence in meanings is an essential prerequisite for the development of exact systems of translation, and one motivation for research on linguistic relativity is the desire to expose such incongruence. But if the color domain has not revealed substantial effects of language on the “formation and reference of linguistic categories,” what domains might be more sensitive? In the referential domain (that is, things which can be pointed at), two hypotheses may be put forth. First, cultural and linguistic effects are more likely with referents which are cultural artifacts such as symbols, tools, and pictorial representations (Arnheim, 1969). 
Second, such effects are more likely with referents which are informationally complex (Garner, 1962) because, if for no other reason, little variance in referential performance may be expected with simple referents.

The trick, from an experimental standpoint, is to find referents which are not culture bound and look for differences in referential meaning when linguistic tools which do not appear on the surface to be culture bound are deployed upon them in a referential communication situation. Abstract line drawings have this property of being culture-free referents because they are novel in all cultures, and a cross-cultural adaptation of referential communication tasks provides a way of obtaining language samples which appear culture free.

Referential communication tasks require one person to describe a referent (object, symbol, or behavior) such that another person will be able to select that referent from the set of possible referents. Referents have ranged from pictures with readily named characteristics (Shantz & Wilson, 1972) to abstract referents (Krauss & Glucksberg, 1969; Dickson, 1974; Dickson, Nagano, Miyake, & Oshima, 1976). These tasks have the advantage of making meaning overt in terms of the response of the listener. Research on referential communication has been recently reviewed by Glucksberg, Krauss, and Higgins (1975).

This method may be adapted for research on communication between cultures. If a description of a particular referent in an array of referents is produced in one language and translated literally into another language, then the responses of a sample of listeners in the two cultures can be compared. (In simplest form, the proportion of listeners choosing the intended referent would be compared.) Where the pattern of responses produced was significantly different, this might be taken as evidence of “referential relativity”, which is the effect of language and culture upon referential meaning.

Fundamental to this approach is the definition of “literal” translation. Obviously, translation of some kind is essential for communication between different languages. A literal translation is one which seeks to maintain one-to-one correspondence in the translation of information bearing words, while permitting those changes in word order and syntax necessary to make the translation conform to the grammatical requirements of the language. It is important to emphasize that this definition of literal translation has a quite different purpose than the more usual attempt at finding “translation equivalent” expressions (Osgood, May, and Miron, 1975, p. 15). Rather than seeking translation which is referentially equivalent, the use of literal translation is directed toward a prior issue: the identification of messages requiring non-literal translation in order to produce equivalent responses in the original and translated forms.
More specifically, this approach is an attempt to measure how much messages which appear to use a translationally identical lexicon differ in referential meaning. The degree to which a given referential message is “culture bound” may then be quantified in terms of the extent to which a literal translation of the message produces a significantly different proportion of intended responses in one culture than the original message produced in the culture of origin, controlling statistically for overall performance. The need to control for overall performance results from recognition that different cultures cannot be easily compared on overall measures. This problem can be circumvented by abandoning “research designs whose emphasis is ‘main effects’ of culture per se” (Rosch, 1974, p. 107). Therefore, this approach uses a large set of translated messages within which to look for specific messages which are culture bound, regardless of overall level of performance in the two cultures.

The degree to which messages are culture bound may depend upon the style of encoding. Two commonly used styles of encoding are metaphoric and analytic (Heider, 1971; Werner & Kaplan, 1964). Metaphoric descriptions tend to be holistic and inferential and refer by saying what the referent “looks like”. Analytic descriptions use technical descriptions of the parts of a figure such as “it has points at the top and a straight line at the bottom.” Metaphoric messages, drawing on the concrete experience of individuals, might be expected to be more culture bound, while analytic messages might be expected to be more universal.

In the present study a set of analytic and metaphoric messages is translated from Japanese into English and presented to two groups of listeners in each culture. The patterns of response within and between cultures are examined with a view to identifying specific messages which appear culture bound.

Method

Materials

As a part of another experiment, 16 abstract figures (Figure 1) originally used by Krauss and Weinheimer (1966) were presented to 32 college students in Japan. The students were asked to write a description of each of the 16 figures “so that another student would be able to choose the one you describe”. The students described each figure in two modes: ‘analytic’ and ‘metaphoric’. In the analytic mode the figures were described in such terms as ‘lines’, ‘points’, ‘curves’, and so on. The metaphoric descriptions characterized the figures in terms of what the figures ‘looked like’.

The 1024 messages generated by this procedure were then decoded by 20 different Japanese college students, each description being decoded by 5 students. Based upon these decodings, each description could be given an effectiveness score of from 0 to 5 according to how many of the decoders successfully identified the intended referent. From the pool of 1024 messages a set of 64 descriptions (2 analytic and 2 metaphoric for each referent) was selected so as to form a test composed of messages which had been decoded correctly, on average, by approximately 50 percent of the decoders, an optimal level of difficulty for a sensitive test of decoding performance (Dickson, 1976).

These 64 descriptions were translated into English on the basis of careful discussions among several Japanese fluent in English and the American author. The goal was a literal translation, especially avoiding the addition of information. In some cases this was straightforward as in the message translated as “A shutter of a camera”. Other descriptions were more difficult to translate for several reasons. First, the descriptions were deliberately selected on the basis of having been difficult to decode accurately in Japanese. Consequently, many of the messages were ambiguous and incomplete in Japanese. Second, the Japanese language does not use plurals and articles, although a Japanese listener can frequently infer implicit plurals and articles from the context. In order to avoid unnatural English, articles and plurals were added where implied in the Japanese. For example, the literal “two circle on top” was translated as “two circles on the top”. The 64 descriptions were then used to prepare a Japanese and English version of the test booklet (Dickson, Miyake, & Muto, 1975).

Figure 1. Abstract figures referent set.
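The counts given in this section follow from simple arithmetic, and the effectiveness score is a tally over five decodings. The sketch below is only an illustration of that bookkeeping, not the authors' procedure; the constant and function names are hypothetical, and the sample choices in the usage note are invented.

```python
from typing import Sequence

# Figures taken from the text above.
N_ENCODERS = 32            # students who wrote descriptions
N_FIGURES = 16             # abstract referents
N_MODES = 2                # analytic and metaphoric
DECODERS_PER_MESSAGE = 5   # each description was decoded by 5 students


def pool_size() -> int:
    """Size of the message pool: one description per encoder, figure and mode."""
    return N_ENCODERS * N_FIGURES * N_MODES


def test_set_size(per_figure_and_mode: int = 2) -> int:
    """Size of the test set: 2 analytic and 2 metaphoric messages per referent."""
    return N_FIGURES * N_MODES * per_figure_and_mode


def effectiveness(intended: int, choices: Sequence[int]) -> int:
    """Effectiveness score (0 to 5): decoders who chose the intended referent."""
    if len(choices) != DECODERS_PER_MESSAGE:
        raise ValueError("each description is decoded by exactly 5 students")
    return sum(1 for choice in choices if choice == intended)
```

With these hypothetical helpers, `pool_size()` gives the 1024 messages and `test_set_size()` the 64 selected descriptions, and a description intended for referent 3 that drew the invented choices 3, 3, 5, 7, 3 would receive an effectiveness score of 3.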

Subjects

A total of 150 students from four colleges, two in Japan and two in the U.S., served as subjects. The two Japanese colleges are located in the Tokyo metropolitan area (43 and 67 students), and the two American universities are in California and Wisconsin (20 students each). These students were given the test booklet in their native language and asked to record on an answer sheet which of the 16 figures they thought each of the 64 messages described best. The task required about 40 minutes to complete.

Results

The results will be presented in three sections: overall between-country comparisons, item analysis, and examination of selected items showing culture-boundedness.

Overall between-country comparisons

The overall performance of the groups in the two countries is not a central issue in this paper. We need to examine the overall level of performance only to consider whether it is similar enough to justify a more detailed examination of individual items. If an overall difference in level of performance is found between the two countries, then we must control for this difference before looking at differences on individual items. We are, however, interested in the predicted interaction of type of description (analytic versus metaphoric) with culture.

Performance was scored according to whether the decoder chose the referent intended by the encoder: an intended response was scored as 1 and other responses as 0. Summing across items, the overall performance could range from 0 to 64, while performance on the two types of descriptions could range from 0 to 32. The mean scores for Japan and the U.S. respectively were: overall (34.47 and 32.40), analytic messages (17.79 and 17.42), and metaphoric messages (16.68 and 14.98). Analysis of variance for country with type of message as a repeated measure showed an overall tendency for messages to be decoded as intended more often in their original Japanese, F(1,148) = 6.55, p < 0.001.

Although overall performance is not a central concern of the study, the similarity of its level in the two countries suggests that the overall informational content is roughly equivalent. One might have expected rather substantial differences between the two language versions, insofar as difficult messages regarding abstract figures were translated with the conscious attempt to avoid adapting the messages. The similarity of the effectiveness of the messages suggests an underlying similarity of the information processing requirements in the two cultures.

Analytic messages were decoded more accurately than metaphoric, but no significance should be attached to this result because the messages were not sampled randomly from the larger pool of 1024. Some evidence of the predicted interaction of message type with culture was found: Compared to analytic messages, metaphoric messages were somewhat more likely to be decoded in the culture of origin than in translation, F(1,148) = 3.39, p < 0.07.

Item analysis

Independent of the question of overall performance is the question of culture-boundedness of individual messages. Are some messages more accurately decoded in one culture than another? One way to examine this question has been described by Angoff and Ford (1973). Basically, one correlates the proportion of subjects choosing the intended response for each item in two groups. (Angoff and Ford used an adjusted score which was not used here.) These proportions can range from 0.0 to 1.0. If the two groups are responding similarly to each item, then the correlation of these item scores should be high. If not, the correlation should be low. The correlation is 0.93 between the two Japanese samples and 0.91 between the two American samples, indicating that the two groups within each country are responding quite similarly to the 64 referential messages. The scatterplot for the two Japanese samples is shown in Figure 2. The scatterplot for the two American samples (not shown) is virtually indistinguishable from Figure 2. A very different result is produced when we plot the proportions of intended responses for the combined groups in each country (Figure 3). The contrast between Figure 2 and Figure 3 is striking evidence that these referential messages are eliciting different responses in the two cultures. The dispersion of points in Figure 3 is the visual representation of the loss of meaning in translation. The correlation for Figure 3 is only .75. Items lying furthest from the best fitting line for Figure 3 are items with the greatest degree of culture-boundedness. Before proceeding to examine individual messages, we should discuss one possible explanation for the apparent referential relativism displayed in Figure 3 which might be thought to reflect simply a lack of fidelity in translation. Although translator should be included as a factor in an ideal experimental design, translation does not appear to account for the present results. 
During the initial translation of the messages, two independent translations were prepared. The two translations were so similar that the expense of including both forms seemed unwarranted. Further, those items which were found to show strong evidence of referential relativism were translated by a third translator. Again, the translations were almost identical: “a shutter of a camera” is “a shutter of a camera.”
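The item analysis described above reduces to two steps: compute, for each item, the proportion of subjects in a group who chose the intended referent, then correlate those proportions across two groups. The following is a minimal sketch of that idea, assuming a plain Pearson correlation on unadjusted proportions (the text says the Angoff and Ford adjustment was not used). The function names and the toy data are hypothetical and are not taken from the study.

```python
from math import sqrt
from typing import List, Sequence


def item_proportions(responses: Sequence[Sequence[int]],
                     intended: Sequence[int]) -> List[float]:
    """Proportion of subjects choosing the intended referent, per item.

    responses[s][i] is the referent chosen by subject s for item i.
    """
    n_subjects = len(responses)
    return [
        sum(1 for row in responses if row[i] == target) / n_subjects
        for i, target in enumerate(intended)
    ]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list has no variance")
    return sxy / sqrt(sxx * syy)


# Toy data (invented): 3 items, two groups of 4 subjects each.
intended = [5, 3, 14]
group_1 = [[5, 3, 14], [5, 3, 2], [5, 7, 14], [5, 3, 9]]
group_2 = [[5, 3, 14], [5, 3, 14], [1, 7, 2], [5, 3, 6]]

r = pearson_r(item_proportions(group_1, intended),
              item_proportions(group_2, intended))
print(f"item-level correlation between groups: {r:.2f}")
```

Under this reading, a high correlation means the two groups find the same items easy and the same items hard, whatever their overall level of performance; items far from the best fitting line are the ones to examine further.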

Figure 2. Plot of 64 items on proportion of intended responses in two Japanese samples. (Horizontal axis: Proportions for Japanese Sample 2; scale 0.0 to 1.0.)

Examination of selected items showing culture-boundedness

In order to identify those messages which differed significantly between cultures, controlling for overall performance, the following procedure was used. First, each individual's mean proportion of intended responses across the items was subtracted from that individual's score on each item, yielding a within-person deviation score. Between-country t tests were then performed on these deviation scores for each item. Those items found to differ between the two countries at the .01 level were considered to show evidence of culture-boundedness. In effect this approach identifies those items in Figure 3 which lie furthest off the best fitting line.
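The procedure above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' analysis: the text does not say which form of t test was used, so the sketch assumes an ordinary pooled-variance two-sample t statistic, and it stops at the statistic without computing a significance level. All function names and the toy data are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def deviation_scores(scores: Sequence[int]) -> List[float]:
    """Subtract one person's mean proportion correct from each 0/1 item score."""
    mean = sum(scores) / len(scores)
    return [s - mean for s in scores]


def pooled_t(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample t statistic with pooled variance (an assumed choice of test)."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    if pooled_var == 0:
        raise ValueError("t is undefined when neither group shows any variance")
    return (mean_a - mean_b) / sqrt(pooled_var * (1 / na + 1 / nb))


def item_t_statistics(country_a: Sequence[Sequence[int]],
                      country_b: Sequence[Sequence[int]]) -> List[float]:
    """One between-country t statistic per item, computed on deviation scores.

    country_x[s][i] is 1 if subject s gave the intended response to item i,
    otherwise 0.
    """
    dev_a = [deviation_scores(person) for person in country_a]
    dev_b = [deviation_scores(person) for person in country_b]
    n_items = len(country_a[0])
    return [
        pooled_t([d[i] for d in dev_a], [d[i] for d in dev_b])
        for i in range(n_items)
    ]


# Toy data (invented): 2 items, 3 subjects per country.
country_a = [[1, 0], [1, 1], [0, 0]]
country_b = [[0, 1], [1, 0], [0, 1]]
print(item_t_statistics(country_a, country_b))
```

Subtracting each person's own mean before comparing countries is what removes the overall difference in level of performance, so that an item stands out only if it is relatively easier in one country than in the other.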

Figure 3. Plot of 64 items on proportion of intended responses in Japanese versus American combined samples. (Horizontal axis: Proportions for Combined American Sample; scale 0.0 to 1.0.)

A total of 15 of the 64 descriptions, 7 analytic and 8 metaphoric, were found to be culture bound according to this criterion. In contrast, the same test, applied to the within-country comparisons showed only 1 description out of 64 to differ significantly between within-country groups - exactly what would be expected by chance. At this point we would ask the reader to look again at Figure 1 and choose which referent looks like “the upper part of a rose looked down on.” Also find “the shutter of a camera”. You may compare your responses with those presented in Table 1 for the culture bound messages. Beside each message is the intended referent and the percent choosing this response.

Other responses chosen by more than 5 percent of the students are also shown. “The upper part of a rose looked down on” was seen most often as referent 3 in the U.S. and referent 5 in Japan. Choices other than 3 or 5 were infrequent in either culture.

For some of the messages a plausible explanation comes readily to mind. Many Japanese children wear uniforms with “school badges” on them and the “abacus” is more common in Japan. Similarly, a “small wine cup” (or sake cup) has a shape peculiar to Japan. Not surprisingly, these examples of what might be called “obviously” culture-bound descriptions elicit more accurate responses in Japan than in the U.S.

Less readily explained is the fact that “a shutter of a camera” is responded to more accurately in the U.S. than in Japan. Cameras in the U.S. look remarkably similar to Japanese cameras, frequently down to the brand name. Nor is it easy to understand why an environmental universal such as a “broken egg shell” would evoke different metaphoric associations in the two cultures. Post hoc discussions have suggested plausible explanations for some of these “subtly” culture-bound descriptions. In the U.S. soft boiled eggs are sometimes eaten in egg cups so that broken egg shells may be seen in the upright position more often there. In contrast, boiled eggs are usually peeled and eaten by hand in Japan in such a way that an upright, intact half of an egg shell may be less often seen. Similarly, the archetypical “stomach” in the U.S. may be strongly influenced by an often repeated television advertisement for a headache medicine, though it is arguable which came first, the advertisement or the “stomach”.

The occurrence of large differences in response patterns to the analytic messages shown in Table 2 is more puzzling. Some of these differences may result from changes in translation due to the addition of articles and plurals in the English version.
For example, “a crooked triangle whose bottom part is missing” unambiguously refers to one triangle, whereas the original Japanese could refer to a number of triangles because articles and plurals are not normally used in Japanese. The fact that many Japanese chose referent five with its three incomplete triangles supports this interpretation. It is doubtful however that all of the differences may be explained this way. Perhaps the archetypical “triangle” or “curved line” is different in the two cultures, but further research is clearly needed.

The most important point to make about these observed differences, however, is not that they cannot be explained post hoc, but rather that many of them could not be readily predicted a priori, even by individuals with considerable experience in both cultures. Many models of referential communication performance include an editing phase during which a speaker considers the probable responses of the listener to a potential message before deciding to speak it (Flavell, Botkin, Fry, Wright, & Jarvis, 1968; Glucksberg et al., 1975). The different response patterns produced by these messages (both within and between countries) make it seem unlikely that speakers could accurately assess the probable responses of a group of listeners to messages such as these.

Table 1. Response patterns to metaphoric messages which differ significantly between Japan and the U.S.

(The body of Table 1 was printed rotated and is not legible in this copy. Figure in parentheses is percent choosing this referent out of 110 in Japan and 40 in the U.S.)

Table 2. Response patterns to analytic messages which differ significantly between Japan and the U.S.

English translation: Three things whose heads are round each stick out.
  Intended: U.S. 1 (88); Japan 1 (63)
  Other common responses (>5%): 9 (20); 16 (10)

English translation: It is symmetrical and curved lines are drawn upward from both bottom ends.
  Intended: U.S. 2 (40); Japan 2 (21)
  Other common responses (>5%): 8 (30); 8 (32); 14 (13); 7 (8); 7 (15); 14 (13)

English translation: Nearly a regular triangle and three shapes are identical and the bottom of each triangle has no line.
  Intended: U.S. 5 (88); Japan 5 (36)
  Other common responses (>5%): 7 (8); 1 (58)

English translation: It is symmetrical. The top part is somewhat round. The bottom part has convexes on its left and right.
  Intended: U.S. 8 (40); Japan 8 (79)
  Other common responses (>5%): 6 (38); 2 (8); 1 (10); 6 (6); 2 (8)

English translation: A crooked triangle whose bottom part is missing.
  Intended: U.S. 10 (73); Japan 10 (19)
  Other common responses (>5%): 7 (13); 5 (40); 16 (8); 7 (34)

English translation: From top to bottom there are two half-circles. The lower part is a straight line and cut.
  Intended: U.S. 14 (43); Japan 14 (85)
  Other common responses (>5%): 4 (18); 16 (18); 8 (10); 9 (8)

English translation: There are two things sticking out loosely on the left and right sides on the top and there is a thing sticking out loosely on the bottom.
  Intended: U.S. 15 (55); Japan 15 (29)
  Other common responses (>5%): 4 (15); 8 (28); 13 (13); 13 (13); 9 (10); 4 (8)

Figure in parentheses is percent choosing this referent out of 110 in Japan and 40 in the U.S. The other common responses are listed in the order in which they appear in this copy, whose layout does not show which country each belongs to. One further entry, 6 (8), cannot be matched to a message.

Discussion

Some literally translated referential messages mean different things in different cultures. “A rose” is not “a rose”, at least when “looked down on”. The lexical equivalents appearing in two-language dictionaries do not
map identically onto each other. “A shutter of a camera” does not call forth the same associations in two cultures in which cameras are common. For some messages the explanation is obvious, for others subtle or unknown.

How may these results be related to the findings of linguistic insignificance in the domain of colors? First, referential relativism deals with communication rather than thought. The present technique provides a rather precise way of studying this influence. Communicability scores, in addition to the theoretical advantages discussed by Brown (1976, p. 145), are more reliable measures on purely psychometric grounds than measures based upon responses of single individuals.

Second, this study uses complex abstract line drawings for referents in contrast to color and simple geometric forms often used. One would predict greater influence of culture upon those domains which are cultural products, where lexical differentiation would appear to be strongly influenced by culture. Not only do cultures exist in natural worlds which differ, but also they increasingly modify those natural worlds in culturally determined ways, especially by means of the visual media (Arnheim, 1969).

Third, the language being mapped onto these abstract line drawings is more complex in some sense than is the language of color names. Metaphoric descriptions do not relate to the abstract referents along fixed dimensions the way color names relate to the color space. Analytic descriptions call for the application of a sort of mental tracing which appears also to differ from color naming. These sets seem even fuzzier than has been dreamt of in fuzzy-set theory (Anglin, 1977, p. 7).

An analogy may be drawn to the distinction between computer hardware and software. Research on color perception (to mention only one area) has well established that human beings around the world have much the same “hardware”: Our eyes and brains seem to be wired up identically.
Similarly, this research tends to support the view that the “software” of color naming does not substantially influence such hardware functions as perception. But it may be going too far to conclude that such research demonstrates linguistic insignificance. The software of a culture (lexical and algorithmic), at the very least, influences the efficiency with which members of a culture can communicate in different domains. And cognitive processes such as memory may, to some extent, be seen as intra-individual communication.

The relationship of this study of cross-cultural communication using abstract referents to research on linguistic relativity using color chips may be placed in a broader framework. Both language and referents may be categorized according to the extent to which they are culture bound. Certain words and things which are obviously culture bound (common in one culture, rare in the other) may be distinguished from those which appear to be
culture free (similar frequency of occurrence in the two cultures). But the apparently culture-free category may include instances of subtle culture-boundedness, given the impossibility of proving anything culture free in all domains. Thus, the culture-free category must always be tentative. Within the culture-free category one may distinguish between those instances which occur with equal frequency because they are novel in both cultures and those which occur with equal frequency because they are common in both cultures. A conceptual scheme relating referents and language according to these categories is presented in Table 3, which includes examples of types of referents and language.

Several observations may be made by inspection of Table 3. First, cells 1 and 2 are of little use in research on linguistic or referential relativity. No one questions that reference is easier with a culturally shared nomenclature (cell 1) or that reference by a culture-bound metaphor will translate only with difficulty (cell 2). Research on linguistic relativity has dealt with the impact of culture-bound language on cognitive processes with culture-free referents in such areas as memory and classification (cell 3).

The present study is located in cells 5 and 8 of Table 3, which contain the intersection of culture-free referents and culture-free language. The logic of research in these cells is as follows: Take referents which appear on the surface to be culture-free, select encodings which appear on the surface to be culture-free, and translate these encodings into a second language. Different patterns of response in this supposedly culture-free communication process are then taken as evidence that the surface appearance of being culture-free was misleading for either the referents, the language, or both. Space does not permit complete discussion of the suggestions for further research implicit in other cells in Table 3.
For example, cells 4 and 7 point to a design in which Japanese and American speakers would be asked to communicate about a set of Japanese ideographs “without naming them but so that another speaker will know which one you mean.” Different theoretical models of communication would yield different predictions as to whether greater familiarity would facilitate or interfere with performance. Several other extensions of this work would seem likely to bear fruit. Individual differences in response patterns deserve attention. All too often cross-cultural research is concerned with group differences, and this focus causes individual differences within culture to be neglected. Yet cultures are ensembles of individuals. The dispersion of choices for such expressions as “a shutter of a camera” in both cultures points to the greater within-culture than between-culture variance for some messages. Research using color names might benefit from greater attention to individual differences. Rosch (1974, p. 107) suggests looking for two cultures

Table 3. Conceptual scheme for the interaction of language and referents

REFERENTS: Obviously Culture-Bound (common in one culture, rare in the other). Referent: (ideographs)
  Cell 1. Language: “kappa” (mythical Japanese elf unique to culture)
  Cell 4. Language: “eyelashes on the left”
  Cell 7. Language: “two small lines in the corner pointing up”

REFERENTS: Culture-Free or Subtly Culture Bound? Equally novel in both cultures. Referent: (abstract)
  Cell 2. Language: “like a kappa’s hand”
  Cell 5. Language: “like a broken egg shell”
  Cell 8. Language: “has a straight line on the bottom”

REFERENTS: Culture-Free or Subtly Culture Bound? Equally common in both cultures. Referent: red color chip
  Cell 3. Language: “red” (where one language lacks words for “red”)
  Cell 6. Language: “like blood”
  Cell 9. Language: “a darker color and more intense”

which have different numbers of color terms for two different regions in the color space and examining effects on memory. Given the large individual differences within one culture, it might be simpler to test a large sample of people and select individuals with different patterns of association to color names.

Shifting our focus from single messages to individual performance over the set of 64 messages, we find large individual differences in total number of messages responded to as intended (“correctly”, in one sense). Why is it that some individuals in both cultures are better able to go “beyond the information given” (Bruner, 1957/73)? These individual differences may result from an unconscious stochastic sampling of experience (Rosenberg & Cohen, 1968) or more effective information processing strategies. In discussions of the instrument with the individuals who served as subjects in this study, the authors were struck by the surprise expressed by some individuals at the choices of other individuals. Spirited discussions ensued as individuals sought to explain why they thought their choice looked like the description - explanations whose receptions ranged from “Ah, ha. I see.” to incredulity. (One curious implication of the fact that many of the associations could not be anticipated by most individuals is that the best communicators might simply be those individuals with the most modal associations. Pity the creative or eccentric person!)

Better instruments are needed for research on communication skill. One might attempt to measure the skill with which individuals were able to predict the decodability of a set of descriptions, a process related to social cognition (Shantz, 1975). A person whose choices of effective messages correspond most accurately with the actual responses of a sample of the population might be a person who would also be skillful in referential communication.
Extending this to cross-cultural communication, a translator who demonstrated not only an ability to identify effective messages but also to make distinct choices for two cultures might be thought of as one who was “culturally decentered” (Brislin, 1976, p. 23; Werner & Campbell, 1970).

Systematic application of this technique to a large sample of messages about a broad sample of domains might lead to a theory of domains in which two cultures were likely to produce mutually culture-bound encodings. Extension of this work to several cultures might permit some quantitative estimate of the relative culture-boundedness among them, leading to some assessment as to the relative influence of language and referents upon referential relativity. Study of developmental trends in culture-boundedness of communication might yield interesting insights into whether meaning systems converge or
diverge between cultures from childhood to maturity (See Anglin, 1977). Research in progress is directed at this question of whether children’s or adults’ encodings show a greater degree of culture-boundedness. Whatever the relationship of language to thought at the deepest level, it is clear that culture and language taken together do influence what is thought about, the connotative and denotative associations of the words with which such thought is communicated, and the efficiency with which reference can be made. Language is essential to communication. Neither individuals nor cultures communicate perfectly about ideas which are complex enough to be interesting. But they do communicate, and better understanding of how individuals and cultures come to have differentiated lexical associations may lead to better understanding. Finally, it should be stressed that the data in this paper should not be interpreted as evidence of uniquely large differences between the Japanese and American cultures. Comparable differences would probably be found between other cultures. Also, the emphasis in this paper on culture bound messages should not cause us to overlook the fact that most of the 64 messages did not turn out to be culture bound. Given that these were ambiguous messages about abstract referents, the similarity of response patterns to most of them is extraordinary. Insofar as students in different cultures responded almost identically to messages such as “a little bird flying” and “a frog without eyeballs” we need not fear that cultures might be mutually unintelligible.

References

Anglin, J. M. (1977) Word, object, and conceptual development. New York, Norton.
Angoff, W. H., and Ford, A. F. (1973) Item-race interaction on a test of scholastic aptitude. J. educ. Measure., 10, 95-106.
Arnheim, R. (1969) Visual thinking. Berkeley, University of California Press.
Brislin, R. W. (1976) Introduction. In R. W. Brislin (Ed.), Translations: Applications and research. New York, Wiley.
Brown, R. (1976) Reference: In memorial tribute to Eric Lenneberg. Cog., 4, 125-153.
Bruner, J. S. (1973) Going beyond the information given. In J. S. Bruner, Beyond the information given. (J. M. Anglin, Ed.) New York, Norton. (Reprinted from J. S. Bruner, et al., Contemporary approaches to cognition, Cambridge, Mass.: Harvard, 1957).
Dickson, W. P. (1974) The development of interpersonal referential communication skills in young children using an interactional game device. (Doctoral dissertation, Stanford University, 1974). Dissertation Abstracts International, 35, 3511-A. (University Microfilms No. 74-27,008).
Dickson, W. P. (1976) A definition of “effectiveness” of messages in referential communication. Unpublished manuscript, University of Wisconsin.
Dickson, W. P., Miyake, N., & Muto, T. (1975) Decoding descriptions of abstract referents test. Unpublished test booklet. Available from first author.
Dickson, W. P., Nagano, S., Miyake, N., and Oshima, Y. (1976) Cultural and institutional differences in communication styles. Paper presented at the meeting of the American Educational Research Association, San Francisco, April 1976.
Flavell, J. H., Botkin, P. T., Fry, C. L., Jr., Wright, J. W., & Jarvis, P. E. (1968) The development of role-taking and communication skills in children. New York, Wiley.
Garner, W. R. (1962) Uncertainty and structure as psychological concepts. New York, Wiley.
Glucksberg, S., Krauss, R. M., and Higgins, E. T. (1975) The development of referential communication skills. In F. D. Horowitz (Ed.), Review of child development research (Vol. 4). Chicago, University of Chicago Press.
Heider, E. R. (1971) Style and accuracy of verbal communications within and between social classes. J. Person. social Psychol., 18, 37-47.
Heider, E. R. (1972) Universals in color naming and memory. J. exp. Psychol., 93, 10-20.
Krauss, R. M., and Glucksberg, S. (1969) The development of communication: Competence as a function of age. Child Devel., 40, 255-266.
Krauss, R. M., and Weinheimer, S. (1966) Concurrent feedback, confirmation and the encoding of referents in verbal communication. J. Person. social Psychol., 4, 343-346.
Osgood, C. E., May, W. H., and Miron, M. S. (1975) Cross-cultural universals of affective meaning. Urbana, Illinois: University of Illinois Press.
Rosch, E. (1974) Linguistic relativity. In A. Silverstein (Ed.), Human communication: Theoretical explorations. New York, Wiley.
Rosenberg, S., & Cohen, B. D. (1968) Referential processes of speakers and listeners. Psychol. Rev., 73, 208-231.
Shantz, C. U. (1975) The development of social cognition. In E. M. Hetherington (Ed.), Review of child development research (Vol. 5). Chicago: University of Chicago Press.
Shantz, C. U., and Wilson, K. E. (1972) Training communication skills in young children. Child Devel., 43, 693-698.
Werner, H., and Kaplan, B. (1964) Symbol formation. New York, Wiley.
Werner, O., and Campbell, D. (1970) Translating, working through interpreters, and the problem of decentering. In R. Naroll and R. Cohen (Eds.), A handbook of method in cultural anthropology. New York, American Museum of Natural History.
Whorf, B. L. (1956) Language, thought and reality. (J. Carroll, Ed.), Cambridge, Mass.: M.I.T. Press.

Résumé

Japanese students were shown a series of 15 abstract referents and asked to describe each figure in such a way that another student could correctly identify it in the series. Each referent was described twice: once metaphorically and once analytically. A test series comprising 64 of these descriptions was translated into English. Two groups of Japanese students saw the Japanese version, and two groups of American students saw the English version. The students had to choose, from among the 16 referents, the one that best matched the description presented. The two Japanese groups gave very similar response patterns, as did the two American groups. However, comparison between the two cultures reveals very different response patterns. Metaphoric and analytic descriptions alike induce different response patterns in the two cultures. These results are interpreted as evidence of “referential relativism”, that is, an effect of language and culture on referential meaning. The authors discuss the use of referential communication tasks as a measure of cultural boundaries in communication.

Cognition, 5 (1977) 235-250
3
© Elsevier Sequoia S.A., Lausanne - Printed in the Netherlands

On the young child’s use of lexis and syntax in understanding locative instructions

ROBERT GRIEVE, ROBERT HOOGENRAAD, DIARMID MURRAY

University of St Andrews

Abstract

The ability of two and three year old children to comprehend in, on and under was tested in five contexts. In the first context, where responses did not depend on the child manipulating the experimental objects, responses were invariably correct except for some difficulty with under in the youngest subjects. In the other four contexts, which did involve manipulation of the objects by the child, responses varied as a function of the noun phrases used to refer to the experimental objects which themselves remained the same across different contexts. The results suggest that the young child’s comprehension of instructions involves an interaction between aspects of the instruction’s lexis and syntax and the child’s construal of context.

1. Introduction

When the child is learning his first language, the meaning of what he hears, or some of what he hears, will not be immediately transparent. Yet when we talk to the child, whether informally in the every day commerce of play - asking him things, telling him things, talking about matters of common interest and so on - or in the circumstances of a study in developmental psycholinguistics, it is customary for him to interact with the adult, despite his incomplete understanding of the vehicle frequently used to initiate and sustain that interaction, language. Here we wish to consider how that interaction proceeds with reference to the child’s comprehension of instructions involving the locative prepositions in, on and under.

Three recent studies, Clark (1973), Wilcox and Palermo (1974) and Baldwin (1975), investigated the strategies children use when responding to such instructions. In these studies young children, ranging in age from about 1½
years to 5 years, were presented with two objects, one a ‘movable’ object (e.g., a small toy animal, a carpet, a boat), the other a ‘stationary’ object (e.g., a table, a bed, a bridge), and they were asked to put the former in, on or under the latter. For example, the child was instructed to Put the dog on the table, and, with the same pair of objects, to Put the dog under the table. In the studies by Wilcox and Palermo, and by Baldwin, an attempt was made to balance the experimental design for the bias inherent in such instructions in favour of the usual way in which the objects are placed relative to one another. For instance, the child was told to Put the match in/on the match box, which could be expected to produce a bias in favour of in, but he was also told to Put the stamp in/on the letter, expected to favour on.

The results can be described qualitatively as follows: the children, especially the youngest ones, showed what we will tentatively describe as a response bias. If the child did not get both of a pair of instructions right, he would show a tendency to give the same response to both instructions, as if they meant the same thing, so that he got one right, the other wrong. For example, asked to Put the dog on/under the table, the child would put the dog on the table both times. In both Clark’s and Baldwin’s studies the nature of the bias was predictable and uniform, and Baldwin was able to replicate part of Clark’s study with considerable success.

Though they differ in their emphasis, the results of these three studies were interpreted by their authors in terms of a non-linguistic strategy. Clark (1973) suggests that the strategy is based on the perceptual properties of the (stationary) object; Wilcox and Palermo (1974) suggest it is based on the relation in the every day world between the objects represented in the experiment; and Baldwin (1975) favours the child’s belief about the proper, or canonical, relation between the objects.
A somewhat different explanation has been explored in Hoogenraad, Grieve, Baldwin and Campbell (1977). From an examination of the data and procedures of the above studies, it was argued that the results were consistent with the process that must be presumed to operate in the normal comprehension of utterances in context, so that it is not properly regarded as a strategy. The function of this process is to relate utterance meaning and apperception of context, which in mature understanding tends to lead to a mutual assimilation of context and utterance (Campbell and Bowe, 1977). But the young child, of 1½ or 2 years say, has reason to be generally less confident of his understanding of language than he is of his apperception and construal of the extralinguistic world. This may lead him to neglect his understanding of an utterance (to the extent that he does understand it of course) when it seems to him to conflict with what he believes, on the basis of his construal of the context, to be an appropriate interpretation of what is
required. One consequence of this is that a correct response to an instruction involving in, on or under cannot be taken as proof of understanding of the preposition, since the response may be based on non-linguistic factors, as Clark (1973) has pointed out. But also note that an incorrect response cannot be taken as proof of a lack of understanding, since it may result from a conflict between the child’s understanding of the preposition and what he believes to be an appropriate response in that situation. This conflict he resolves by ignoring the preposition’s meaning, of which he is less confident, in favour of his understanding of the context, of which he has reason to be more confident. We will return to these questions when we discuss the results of the study to be reported here.

This study addresses itself to two problems arising from the previous work. First, in those studies the child’s comprehension of the locative preposition was tested by requiring the child to manipulate the experimental objects. The problems which such a procedure may involve are well attested to by the work of Huttenlocher (Huttenlocher and Strauss, 1968; Huttenlocher, Eisenberg and Strauss, 1968; Huttenlocher and Weiner, 1971), and it seemed desirable to find an alternative method. In the five sections of the present experiment this is done in Task A, where the child is presented with three alternative arrangements of a small box and a larger box, and asked to indicate the arrangement specified in the instructions. For comparison, in Task B the child is presented with a small box and a larger box, and asked to put one in/on/under the other. Thus in Task B, as in earlier studies, the child’s comprehension is assessed by his ability to place one object relative to another, while in Task A he is free from that constraint.

Tasks B, C, D and E address the second problem.
In the previous studies various pairs of objects were used: toy teapots and tables, cups and saucers, cars and tunnels, boats and bridges, roads and trucks, and so on. Now it is implicitly assumed in the interpretation of the results of these studies that it is the objects per se which determine the child’s response in some way. However, two observations made in the course of a longitudinal study* gave us reason to believe that the noun phrases used to refer to the objects were an important factor in determining how the child would interpret the presented array of objects, and that it was this that formed the basis for the child’s response. The episodes are given in full in Grieve and Hoogenraad (1976). Graham, aged 2;4, was shown a box placed on its side, opening

*‘Studies in semantic development in young children’: SSRC (London) grant HR 2516 to Robert Grieve and Robin Campbell, assisted by Robert Hoogenraad and Theresa Bowe.

facing forward, and asked to put a toy animal in the stable. He placed the animal on the box, announcing that he had placed “a horsie a table”. A similar observation was later obtained with Jane, aged 2;2, in an informal experiment. She also placed the toy animal on the table, but in addition she showed considerable uncertainty about the preposition used, questioning it several times, trying in and on, and finally announcing that she had put the animal in the table, though in fact she had placed it on the box. We had reason to believe that neither child then knew the word stable, and it seemed reasonable that they had assimilated it to the word table which they knew well, and then had placed the animal where it seemed appropriate - on the table. Graham may have also assimilated in to on - or perhaps these were not different words for him, given their phonetic similarity. Jane showed the conflict mentioned earlier, and seemed to finally decide that she must have had the meaning of in and on confused. The basis of the responses of both children seemed to be that the reference point object was regarded by them as a table, not as a stable or a box.

Baldwin (1975) tested one aspect of these observations by systematically varying the objects used. The second part of the present study addressed another salient aspect of the observations, that the noun phrase used to refer to the box formed the basis for the child’s construal of the task. In Task B the small box and the larger box were referred to as the little box and the big box. In Tasks C, D and E the noun phrases were systematically varied, the boxes being referred to respectively as the chair and the table, the baby and the bath, the cup and the table. Thus across Tasks B through E, the pair of objects remains the same, controlling for the effects of their physical properties and the child’s knowledge of the objects themselves. All that is changed is the way in which the objects are referred to.

Method and Procedure

Two groups of children took part in the experiment. The younger group ranged in age from 2;0 to 2;6 years (mean 2;3), and the older group from 3;0 to 3;9 years (mean 3;4). Details of age and sex appear in Table 1. All the children were tested individually by the same experimenter (DM). Half were tested in a local pre-school playgroup, the rest in their own homes. Child and experimenter sat side by side on the floor, or at a table. The experimental objects consisted of pairs of pliable boxes, cubic with one open face, constructed from light-gauge brown cardboard. The set of small boxes measured 6 cm along each edge, the set of larger boxes 10 cm.
Six different arrangements of the boxes were recorded as responses; these are shown schematically in the Key to Table 1*. The experiment consisted of five tasks, designated Tasks A, B, C, D and E, and carried out in that order. In each task there were five instructions, presented in random order.

In Task A the child was presented with three pairs of boxes, arranged in a row in no special order, so that there was a small box in a larger box (arrangement c), a small box on a larger box (arrangement ö) and a small box under a larger box (arrangement o). Clearly such an array also includes a larger box on a small box (arrangement o) and a larger box under a small box (arrangement ö). Arrangement c was used for the in relationship of the boxes rather than u, so that there would be no doubt that the child could see the small box. The child was then given an instruction such as Can you show me the little box under the big box?, to which he was to respond by pointing to, or touching one of the arrangements c, ö or o. The five instructions given in Task A were as follows:

Can you show me the little box in/on/under the big box
Can you show me the big box on/under the little box

In the subsequent tasks, Tasks B, C, D and E, the child was presented with a pair of boxes, one small, the other larger, placed beside each other in no special relation, which he was to arrange in accordance with the experimenter’s instructions. In Task B, the boxes were referred to as little box and big box, and the child was asked:

Can you put the little box in/on/under the big box
Can you put the big box on/under the little box

In Tasks C, D and E, the terms used to refer to the boxes were systematically varied. Thus in Task C the child was asked to put the chair (for the small box) in/on/under the table (for the larger box), etc. In Task D the terms for the small and larger boxes were baby and bath respectively, while in Task E the terms were cup and table respectively.
Task C was prefaced by showing the child the pair of boxes and asking: If these are a table and a chair, which is the table and which is the chair?.

*We have adopted the fairly transparent notation for the arrangements shown in the Key in order to ease difficulty in interpreting which arrangement is being referred to in the text. The letter indicates the orientation of the big box (for the two arrangements o and ö the orientation of the open face is not indicated). The little box goes inside for arrangements u and c, and underneath for o and n. The umlaut in arrangements ö and ü indicates that the little box goes on top of the big box, balanced on the upturned edge in ü.

If the child answered appropriately, referring to the small box as chair and the larger box as table, the experimenter then proceeded with the instructions described above*. Tasks D and E were prefaced by equivalent questions. It is important to note that the tasks were presented in the order A, B, C, D and E, but that within each task the five instructions were presented in random order. Responses were recorded in terms of the final arrangement of the pair of boxes, except in Task A where the pair of boxes which the child indicated from the three pairs presented was recorded.
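The design described above (five tasks in fixed order, five instructions per task in random order) can be sketched as a short script. This is a minimal illustration, not the authors' materials: the names `TASKS`, `instructions_for` and `session` are hypothetical, and it assumes that the reversed instructions in Tasks C, D and E follow the same on/under pattern spelled out for Tasks A and B.

```python
import random

# Prepositions for the small object relative to the large one, and the reverse.
FORWARD = ("in", "on", "under")
REVERSED = ("on", "under")

# Task label -> (verb phrase, term for the small box, term for the larger box).
TASKS = {
    "A": ("show me", "little box", "big box"),
    "B": ("put", "little box", "big box"),
    "C": ("put", "chair", "table"),
    "D": ("put", "baby", "bath"),
    "E": ("put", "cup", "table"),
}


def instructions_for(task, rng):
    """Return the five instructions for one task, shuffled into random order."""
    verb, small, large = TASKS[task]
    items = [f"Can you {verb} the {small} {p} the {large}?" for p in FORWARD]
    items += [f"Can you {verb} the {large} {p} the {small}?" for p in REVERSED]
    rng.shuffle(items)
    return items


def session(seed=None):
    """Tasks always run in the order A to E; only the order within a task varies."""
    rng = random.Random(seed)
    return [(task, instructions_for(task, rng)) for task in "ABCDE"]


if __name__ == "__main__":
    for task, items in session(seed=1):
        print("Task", task)
        for item in items:
            print("  ", item)
```

Because the same pair of boxes is used in Tasks B to E, only the noun phrases in `TASKS` change between those tasks, which is the manipulation the study relies on.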

Results

Since we wish to allude to the pattern of results for individual children as well as consider the overall pattern of results, data are shown in detail in Table 1, which shows the children in order of age, and their response (if any) to each question. Entries in the table show two things. First, the arrangement of boxes in the response is indicated, where the symbols are explained in the Key to Table 1 (see also footnote on page 237). Arrangements u, c, ö and o accommodate the bulk of the responses. Two other arrangements were also produced: n, where the larger box was inverted over the small box; and ü, where the child attempted to balance the small box on the upturned edge of the larger box, and finding this difficult, held it in place. Entries marked _ indicate that the child failed to respond. These all occur in Task A, for the three youngest children; while this may arise because Task A was the first task in the experiment, it is perhaps of significance that these failures to respond all occurred for instructions involving under.

Second, we have indicated responses which appear to be definitely incorrect with x. Responses which appear to be less than optimal are indicated with ?; for example, arrangement c for little box under big box would be an adequate description under certain circumstances, say if the big box was a very large and immovable object like a bus-shelter; arrangement n for big box on little box would be better described by over, but it could result if the child had the larger box in the wrong orientation, with the opening

*Two of the children initially chose the larger box as the chair and the small box as the table - perhaps by analogy with a large armchair and a low coffee table? But they then changed their minds, as in these instances the first instruction was: Can you put the chair in the table.

KEY:

11 12 13 14

8 9 10

u

bd

F F M F M M M

b

3;5 3;5 3;6 3;9

3;0 3;o 3;l

2;o 2;l 2;2 24 2;4 2;5 2;6

M F M M F F F

2 3 4 5 6 I

1

Age

Sex

0

n

s

nrl

6 0 6 0 6: 0 6 0 B 0 0; 0 L-i 0

c c c c c c c

H

c? 0 0 0 0

6 a 6 6 ij a ii

c c c c c c c

big box

0

0 0 0 0 0 0 0

0 0 0 0 0 0 0

n

bl

6 6 6; 6 6 6; B

ox ii 6 6

little box

-in on under on under --

big box

: show me

little box

Task A

u

ii

u u u u u u u

u u u u u u u

n

o a ii ii 6 6 6;

6 ii B B 6 6 6

big box

0 0 0 o 0 0 0

0 0 0 0 0 0 n?

_

no response

0 0 0 o 0 0 w

Llx WI lLx w ux LX 6x

!

chair

table

Task C: put

c

c

ii

6 6

c c c

0 0 0

! 0 0

0

! w ux

0

ux 0

0 0 6x

?

u u

ux 0

I nx

ox ox rLX u?

6

u?

baby

n ii 0 LLX 0 ii

6

non-optimal response

6

c-x cx ox

u u u

U

Lx

u

cx

ux 6

X

0

u u

bath

incorrect response

c

0

6x

0

0

c

c

c

c

0

c

baby

bath

Task D: put

-A in on under on under --

ox

0

o?

6

6x

B 6 6

cx ox

0

c

chair

c

c

6 ox

c

C

c

table

-in on under on under --

correction (see text)

ox ij 6 s 6 6 ii

u? u? ox u? ox ox 6

little box

under on under

big box

0in on --

little box

Task B: put

Individual responses for each child

No

Children

Table 1.

table

6

6

0

u? 6x

6

6 6 6

6

ii

c

c

c

c

c

c

c

table

c

c

c

c

c c

o?

c

6x

6x

ii

cx

ax

cx

CT

c-x

ox cx

B 6

0

6x 6x

0

0

0

0

cup

-in on under on under --

cup,

Task E: put

242

Robert Grieve, Robert Hoogenraad, Diurmid Murray

facing downwards, when he placed it on the small box; arrangement u for big box under little box would be correct if the larger box was a shallow tray say, but it could also result from the child placing the smaller box ‘where it ought to go’, inside the large box; and similar considerations apply to u for bath under baby. Arrangement o for chair under table and cup under table, and arrangement u for cup in table, would be correct responses if the objects were being treated as boxes, rather than as they were referred to. The table entries marked ! indicate a response that is intriguing and which has alarming implications for experimental work with young children. They involve child 12 in Task D, who would produce an arrangement incorrect in terms of the instruction given by the experimenter, but correct in his own terms. For example, asked to Put the baby on the bath he would put the ‘baby’ in the ‘bath’, and then announce that this was what he had done, thus effectively correcting the experimenter’s instruction. This raises the interesting possibility that instances of apparent error elsewhere in the experiment with this child, and more generally throughout the experiment with other children, are responses to their corrections of the presented instructions, corrections which they fail to announce. In this regard, note child 12’s errors in Tasks C and E, where, possibly, he is correcting the instructions without announcement. That children in experiments may answer, correctly, questions related to but different from those presented by the experimenter has been observed in other experimental contexts (Grieve, 197 1; McGarrigle, Grieve and Hughes, forthcoming). 
Awareness that this is what is happening (by inference from additional information spontaneously offered by the child or elicited from him) leads us to modify our assessment of what superficially appears to be the child’s straightforward error resulting from his lack of knowledge, or from a slip in performance. But lack of awareness that this is what is happening has serious consequences for our view of the child. In light of the implications of this point for both method and theory in study of child development, it requires urgent consideration. Blank entries occur for children 1 to 5 for Tasks C, D and E. Recall that Task C was prefaced with the question If these are a table and a chair, which is the table and which is the chair ?. The younger children did not answer this hypothetical-conditional question, and the experiment was not further pursued with them. With the remaining children, it is in itself interesting that children as young as 2;5 years are able to deal successfully with such a question. It was probably unnecessary to preface Tasks C, D and E with such hypothetical-conditional questions. Recalling our observations of Graham and Jane, we might expect that the child could simply be told to Put the chair on the table, or whatever. A blank entry also occurs for child 14 in Task D who failed to complete this instruction.


Discussion of results

Tasks A and B

We begin with a comparison of the results of Tasks A and B. Recall that these were designed to test the effect of the child having to manipulate the experimental objects: clearly, in Task A, where the child was presented with an array of three arrangements, c, ii and o, one of which he was to point to, this constraint did not operate, while in Task B, where he had to produce the arrangement asked for, it did. Since there are no errors for instructions involving in and on, these are unrevealing. There is a (minor) problem involved in dealing with the errors produced for under: in Task A, c, and not u, was one of the alternative arrangements in the array presented to the child. Now in Task B, arrangement u is a frequent erroneous response, and we should allow the possibility that those children who responded correctly in Task A, but produced an erroneous u response in Task B, responded correctly by default in Task A because the response they would have made was not an alternative. But even if we allow for this, there are clear cases where the child responds erroneously in Task B but correctly in Task A (children 5, 6, 7 and 8); or else a child (child 3) refuses to respond in Task A, presumably indicating that he truly did not understand the instruction, but responds erroneously in Task B. Only child 4 shows, if anything, the reverse tendency.

The erroneous responses (including the sole error in Task A) fall into two clear categories. First, there is the u response, the most frequent erroneous response, especially for the youngest children. Now this is the arrangement one might expect the child to produce if he were asked Show me where the little box ought to go, or some such question: u is the most proper arrangement for boxes. In Hoogenraad et al. (1977) such responses were referred to as canonical, to stress that the child was thought to be responding in a way that he believed to be standard or orthodox.
It was shown there that a canonical arrangement produced as an erroneous response cannot be taken as clear evidence that the child does not understand the instruction: it may merely indicate his lack of confidence in his understanding of aspects of the instruction, the sort of lack of confidence in her understanding of the preposition shown by Jane in the observation referred to earlier. We can imagine that the child in Task B considers: “Perhaps under doesn’t mean that after all, I must have got it wrong; he can’t have meant me to put it there, I’ll just put it where it ought to go, inside the big box”. But such considerations are not very plausible in the present instance, for the children did after all produce the 6 and o responses, and not u, to instructions involving on. This provides strong, though not conclusive, evidence that children 1, 2 and 3 do not understand under at all, or understand it only in the vaguest way. It was shown in Hoogenraad et al. (1977), using evidence from Clark (1973) and Baldwin (1975), that though a proportion of their youngest children appeared to show no understanding of under, nonetheless understanding of the three prepositions subsequently developed together.*

The second category of erroneous responses consists of the 6 and o arrangements: it is as if the child had confused the subject and object of the preposition. Why does this occur in Task B, but only once in Task A? Considerations about the relation and interaction between the syntax of the instructions and the process involved in the response lead to suggestions about the essential differences between these tasks. In Task A, the child is asked Show me X under Y, where the most natural interpretation is that under Y is a restrictive prepositional phrase on X, the direct object of the verb show. Thus the child’s task is to find X, the X such that X is under Y, and the presented array can act as a form of concrete aide-memoire in his search. (Under a less likely interpretation the full prepositional phrase X under Y is the object of the verb show: the child’s task is now less clearly defined in small component sequences: he has to form a prior conceptualization of the relation X under Y, and the array will be less useful as an aide-memoire.) In Task B on the other hand, the child is asked to Put X under Y, where X is the object of the verb put, while under Y is the complement. Thus the child’s task is to put X somewhere, namely under Y. He is instructed to handle X, and he also has to conceptualize the required relationship, X under Y, since there is no aide-memoire. Now it is possible, and plausible, that the child understands the lexical meaning of under, that something goes under something else, but is less certain of the syntactically defined direction of the relationship: is it X under Y, or Y under X?
He has in his hand X: if he places it on Y he has one of the alternatives he has in mind; to produce the other he would also have to pick up Y. Thus under this interpretation, based on how the syntax of the instruction is expected to influence the way the child will organize the component parts of the synergy of actions that comprise the response, the difficulty of Task B inheres not so much in the fact that the child has to handle the boxes, but arises from conflict inherent in the way the syntax of the instruction influences the process of the response, and the way it specifies the goal of that response, in the case of instructions involving under. (On the basis of this explanation we might expect that if the child was first told to pick up X before being instructed to put Y under X, performance might well change as a function of the child’s understanding of the instruction.)

*Kučera and Francis (1967) give the frequencies of the prepositions (per thousand words of written text) as approximately 21.3 for in, 6.7 for on and 0.71 for under. More revealingly, perhaps, in and on occurred in all 500 of their 2,000-word samples, while under occurred in only two thirds of them. And casual observation suggests that the young child is more likely to meet in and on in contexts where he could deduce or infer their meaning than is the case with under.

Tasks B, C, D and E

We come now to discussion of the results of Tasks B, C, D and E, which were designed to test the effect on the child’s responses of systematically varying the way the boxes were referred to. Of the two 2 year old children who completed Tasks C, D and E, child 6 shows a clear tendency to perseverate in Tasks D and E, and these results are not considered further. (Interestingly, his responses in Task D, where the boxes are referred to as baby and bath, are all arrangement u: the baby in the bath; and in Task E, where the boxes are referred to as cup and table, all but one response is 6: the cup on the table. In both cases these are canonical responses, given how the boxes were referred to.) Two of the arrangements, 6 and o, were unfortunately not further differentiated for the orientation of the opening of the larger box when the responses were recorded. But for those responses where the orientation of the larger box is known, the influence of the way the boxes were referred to in the instructions is overwhelmingly clear. In all but one instance (child 13, Task E), when the larger box is referred to as box or bath it is oriented u or n; when referred to as table it is oriented c; i.e., the orientations were appropriate given how the boxes were referred to. Confining ourselves now to children 7 to 14, virtually all the errors were in response to just three instructions involving under: table under chair/cup and bath under baby. Now this cannot arise from a failure to understand under, for these same children get the reverse instructions (chair/cup under table, baby under bath) correct. Nor can we appeal to the explanation used above for the errors of the younger children with under, for not only do these older children get the reverse of the under instructions correct as we have just pointed out, but they also make but a single error between them for big box under little box. 
Inspection of the three instructions - table under chair, bath under baby and table under cup - suggests that the reason for the errors is that these instructions make an unusual choice of subject for the preposition. This is not because the arrangements asked for are in themselves strange, since the same children make only one error for chair/cup on table. Nor is it because the instructions are ungrammatical or odd per se: for example in the context of loading a van in a furniture removal, the instruction Put this table under that chair could be perfectly reasonable. The explanation can be found in Halliday’s theory of the function of the grammatical subject as the subjective “starting point” of the communication (see for example Halliday, 1970), and in the oddity of regarding a supporting object (e.g., a table) as the “figure” in a figure - ground relation. In short, we can explain these results if we suppose that the child understands the lexical content (the meaning) of under, but is less certain about the role of word order with respect to the preposition, and is inclined to reverse the subject and object of the preposition if that will give a less odd instruction. Not, it should be noted, a less odd response in these cases, for all the erroneous responses are ones which were correct responses to the reverse instructions.

Similar considerations apply to errors with instructions involving on, but here the pressure on the child to reverse the subject and object arises from the strangeness of the final arrangement asked for: for instance, for table on cup, the child produces arrangement 6, the cup on the table. (Such a semantic role reversal cannot occur with instructions involving in for the objects used in this experiment, but if we had used say a small bag and a larger bag, we might have found such reversals, for instance for Put the cat in the mouse’s stomach.) The remainder of the errors with on and in can be explained on similar grounds, if we note that on and in are phonetically similar in unstressed position in ordinary speech, and therefore easily confused when context fails to disambiguate (or gives conflicting expectations, as in these instances). The sole error not accounted for is the error of child 7 for chair on table: he is one of the two children who initially chose the larger box as the chair (see footnote on page 239) - perhaps he really did put the chair on the table.

Conclusions

The results of the experiment reported here can be summarized as follows. First, the results suggest that normally there will be a period when the child understands the prepositions in and on but not yet under: other work (see Hoogenraad et al., 1977) has shown that subsequently the understanding of these three prepositions tends to develop together. Second, the effect of using a different test for comprehension of the prepositions in, on and under has indicated that when the child is asked to point to the correct arrangement in an array of alternatives, rather than being asked to produce the correct arrangement, he shows his comprehension of the prepositions to better advantage. An explanation for this has been proposed in terms of the role of the syntax of the instruction in organising the synergy of acts which comprise the child’s response. Third, the effect of the way the experimental objects are referred to has been investigated by holding the objects used in the experiment constant, but referring to them in different ways. Besides affecting the form of the final arrangement of the objects, it has been shown that this affects the likelihood of the child arriving at a correct response, since some ways of referring to the experimental objects lead to a more ready conceptualization of the final arrangement required than do others. If the child is asked to produce an arrangement which is in some way odd, or if the instruction itself is odd in that context, he tends to amend the question in a motivated way, either by means of a phonetic transformation or by means of a semantic role reversal of subject and object of the preposition. Walkerdine and Sinha (1977) have shown that even adults may on occasion revert to such a strategy in unfamiliar contexts. Fourth, it has been shown that in prepositional instructions there are aspects of linguistic structure other than the preposition itself which influence comprehension. More generally, the implication of this is that if we want to test some aspect of language, we must consider all the structure that is in the instruction, for this may well influence the child’s understanding of the particular element which we happen to be studying. Fifth, the results obtained indicate the effectiveness of a technique which employs constant ‘neutral’ objects, which are named in different ways, in order to estimate the effect of certain aspects of language.
This technique proved effective with children from age 2½ years, and, as indicated above, though it failed to be effective with children below that age, this might well be readily overcome with a minor change in the procedure used to introduce the technique. It can also be noted that the effectiveness of the technique depends on the young child operating, in a sophisticated manner, with representation.

At the beginning of this paper it was suggested that interpretation of locative prepositions may be crucially dependent on the way the objects concerned are referred to in the instructions. The present study provides clear evidence that the way 2 and 3 year old children interpret the prepositions in, on and under does not arise simply from the physical characteristics of objects, or from the child’s knowledge of the customary relationships of objects. Rather we must look to an interaction between such factors and the way the objects concerned are referred to in the instructions. The results of the present experiment, and the studies of Clark (1973), Wilcox and Palermo (1974) and Baldwin (1975) now allow us to attempt to specify the general nature of the interaction between the language of the instructions and the context in which they occur, and the development of this interaction as the child grows in experience.

In the absence of full understanding of the prepositions, such as is to be expected in children involved in the process of language acquisition, we can expect the child to make errors. In the total absence of any instruction, we might expect the child to play with the proffered objects, and to do so in characteristic ways. In the course of such play he might place one object in, on or under another. What he can do is of course physically constrained by the objects, and also by his skill, but more importantly it is constrained by how he has decided to view the objects. While this is constrained by the physical characteristics of the objects it is not determined by them: the important constraint is almost certainly his experience of the material, partially determined by the cultural environment. Thus he will place babies in baths, or occasionally on the edge of the bath, cars on bridges but boats under them, and so on, subject to the caveat that he will tend to do these things to the extent that he has some prior experience of them, and depending on what he has decided the objects are to represent. (In the Museum of Childhood in Edinburgh there is on display a worn old shoe once used as a doll: the sole is the face, and an old rag draped around the instep is a dress.) If the child does not yet understand language at all, he is effectively in this position even when he is instructed, and we might expect customary relationships between objects to predominate in his responses. But as he begins to understand the language of the instructions the situation changes. Now in the questions typically asked in experiments like the present one there are various features that might affect the child’s response.
First there is the verb (note in Table 1 the low incidence of arrangement n: but what if the verb had been hide?): although we used two verbs, show and put, we cannot claim to have investigated their effect, as the context largely constrained the response in that we set up the situation so that the child would be expected to either indicate or create the relevant array. Next there is the preposition, and there are two aspects which are important: the lexical meaning of the preposition, and the grammatical meaning of word order. That is, it seems likely, and is not contradicted by our results, that the child may understand the lexical content of a given preposition without yet understanding the grammatical meaning of word order with respect to the preposition (e.g., given on, the child may appreciate that one object should be placed on the other, but fail to appreciate which should go on which). There is also the possibility that at first the child will tend to confuse in and on through their phonetic similarity. Now there is immediate conflict between two sources directing the child’s response: there is the ontogenetically prior (and we might assume, more secure) response based on what the child thinks is a proper response given the proffered objects; and there is the preposition in the instruction. In Task A of the present experiment the former effect was minimised, and the child’s understanding of the preposition dominates his response, while in the subsequent tasks it is apparent that the former effect is dominant, but grows less dominant as the child’s understanding of the prepositions grows more secure.

In addition, and this is perhaps the main drive of our results, the noun phrases used in the instruction to refer to the experimental objects affect the response, for this is an important determinant of how the child will view the objects, and its importance grows in measure as the child gains experience. And for the older child, as for the prelinguistic child, it remains true that it is not in the first instance the physical characteristics of objects that determine the response, but rather how the child views them, what he understands by them, how he interprets them; and the effect of his growing understanding of language will be that the experimenter’s instruction (rather than the child’s experience alone) can begin to influence how he views them. Of course, if the terms used to refer to the experimental objects accurately name them, then this effect of language will not be immediately apparent, and it may well seem, if the experimental objects are suitably chosen, as if it were their characteristics that determined the response. The effect of the physical characteristics of the objects used merely tends to constrain how the child can, or is likely to, view them. With reference to this, note how difficult it is to find objects whose customary relationship is under, and which are also likely to be within the child’s experiential ken: cups go on saucers, not saucers under cups; and while carpets can be described as being under beds, they are customarily described as being on floors; etc.
While the juxtaposition of two objects may be coded in the language in different ways, these different codings are rarely genuine alternatives in a functional sense. This sort of knowledge is part of what the child has to acquire while he is learning the language, and his relatively greater difficulty with under may reflect his greater uncertainty of how this locative works, resulting from his less frequent experience of it.

Finally, some evidence has again been presented that indicates the hazards of assuming experimental control of any one of the child’s responses taken in isolation (see Grieve, 1976; Grieve and Hoogenraad, 1976). In order to assess whether the child’s response is appropriate to what we are trying to elicit for the purposes of the experiment, it may be necessary to interrogate a variety of sources of data about the child. This is because the child enjoys autonomy, and if he decides to amend our questions to ones more to his own liking, without telling us what he is doing, there is nothing we can do about it.


References

Baldwin, P. (1975) On the acquisition of meaning of in, on and under. Unpublished M.A. Thesis, University of Stirling.
Campbell, R. N. and Bowe, T. (1977) Functional asymmetry in early language understanding. In G. Drachman (Ed.) Salzburger Beiträge zur Linguistik, III (Salzburg Papers in Linguistics, Vol. III). Gunter Narr, Tübingen (in press).
Clark, E. V. (1973) Non-linguistic strategies and the acquisition of word meanings. Cognition, 2, 161-182.
Grieve, R. (1971) Some studies of language use and class inclusion. Unpublished Ph.D. Thesis, University of Edinburgh.
Grieve, R. (1976) Problems in the study of early semantic development. In G. Drachman (Ed.) Salzburger Beiträge zur Linguistik, II (Salzburg Papers in Linguistics, Vol. II). Gunter Narr, Tübingen, 139-152.
Grieve, R. and Hoogenraad, R. (1976) Using language if you don’t have much. In R. J. Wales & E. C. T. Walker (Eds.) New Approaches to Language Mechanisms. North Holland, Amsterdam, 1-28.
Halliday, M. A. K. (1970) Language structure and language function. In J. Lyons (Ed.) New Horizons in Linguistics. Penguin, Harmondsworth, 140-165.
Hoogenraad, R., Grieve, R., Baldwin, P. and Campbell, R. N. (1977) Comprehension as an interactive process. In R. N. Campbell and P. T. Smith (Eds.) The Stirling Psychology of Language Conference, Vol. I: Language Development and Mother-Child Interaction. Plenum Press, London (in press).
Huttenlocher, J., Eisenberg, K. and Strauss, S. (1968) Comprehension: relation between perceived actor and logical subject. Journal of Verbal Learning and Verbal Behavior, 7, 527-530.
Huttenlocher, J. and Strauss, S. (1968) Comprehension and a statement’s relation to the situation it describes. Journal of Verbal Learning and Verbal Behavior, 7, 300-304.
Huttenlocher, J. and Weiner, S. L. (1971) Comprehension of instructions in varying contexts. Cognitive Psychology, 2, 369-385.
Kučera, H. and Francis, W. N. (1967) Computational Analysis of Present-day American English. Brown University Press.
McGarrigle, J., Grieve, R. and Hughes, M. (forthcoming) Interpreting inclusion. A contribution to study of the child’s cognitive and linguistic development.
Walkerdine, V. and Sinha, C. (1977) The internal triangle: language, reasoning and the social context. In I. Markova (Ed.) Language and the Social Context. Wiley, New York (in press).
Wilcox, S. and Palermo, D. S. (1974) In, on and under revisited. Cognition, 3, 245-254.

Summary

The ability of 2 and 3 year old children to understand the terms in, on and under was tested in five different situational contexts. In the first context, the responses did not depend on the child's manipulation of the experimental objects. The data show that these responses were always correct, apart from some difficulties with under among the very young children. The other four contexts involved manipulation of the objects by the child, and the data show that the responses varied as a function of the noun phrases used to refer to the experimental objects, which themselves remained the same across the different contexts. These data suggest that young children's comprehension of instructions involves an interaction between the lexical and syntactic aspects of the instruction and the child's construction of the context.

Cognition, 5 (1977) 251 - 263
© Elsevier Sequoia S.A., Lausanne 4 - Printed in the Netherlands

Attention and cognition*

VIMLA P. VADHAN
DANIEL W. SMOTHERGILL
Syracuse University

Abstract Forty 4-year-old children were subjects in an experiment designed to determine whether learning to attend to relevant cues is a sufficient condition for acquisition of length and number conservation. Three groups of non-conservers were trained by means of an oddity9-problem procedure to attend to the cues specifying either length, number, or both length and number. They were subsequently tested along with an untrained control group on tasks of length, number, mass and continuous quantity conservation. Some improvement in conservation was found, but it was neither impressive in magnitude nor specific to the cues of training. For example, the group trained to attend to length cues conserved length about 25% of the time, the same rate at which this group also conserved number, mass, and liquid quantitya. This non-specificity’ is contrary to an attention hypothesis and suggests instead that training, to the extent it was effective, induced a general, abstract quantitative knowledge. Although interest in attentional processes in children is increasingly evident (see Pick, Frankel, and Hess, 1976 for a review) there appears to be relatively little concern with questions about the nature of attention or its relations to other psychological processes. Such questions have always been recognized as important for a unified theory of mental processes. Recently, however, the many demonstrations that attentional variables, such as stimulus salience, exert a strong influence on the performance of nominally cognitive tasks (Odom, 1977) have served to indicate more concretely the pressing need to consider such questions explicitly. The present research was aimed at answering the following question. In what sense is it true that conservation is acquired as a function of learning *The research on which this article is based was conducted as part of a doctoral dissertation by the first author. The second author directed the dissertation and prepared this article. 
Reprints may be requested from: Daniel W. Smothergill, Department of Psychology, Syracuse University, Syracuse, New York, 13210, U.S.A.

252 Vimla P. Vadhan and Daniel W. Smothergill

to attend to relevant cues? The question is phrased in this manner because there appears to be substantial agreement among theorists that attention is involved in conservation acquisition. The issue on which disagreement centers is the precise role of attention. Two broadly different points of view can be distinguished. One, which we shall call the weak attention hypothesis, acknowledges the fact of attention in behaviors such as eye movements and selection of parts of a stimulus for detailed analysis, but asserts that such attention normally results from, or is in the service of, more fundamental psychological processes. Piaget (1952), for example, has described the process of acquiring conservation of liquid quantity in terms of 4-step attentional model. Attention is thought to be directed at first to the height dimension, then to the width dimension, then to both dimensions successively, and finally to both dimensions simultaneously*. While thisclearly is an attentional model in that patterns of centration are specified, Piaget does not intend to propose by it that attention causes conservation acquisition. Rather, his position, while not entirely unequivocal, is that attention reflects whatever mental operations are possessed at particular stages of development. A somewhat similar view of the relation between attention and cognition in adults has recently been expressed by Neisser and Becklen (1975). Attention is ascribed a more genuinely causative function in the process of cognitive change by a position with roots in the learning theories of recent years (e.g., Gibson, 1969; Trabasso & Bower, 1968; Zeaman & House, 1963). There are major differences among these theories on many issues but three areas of agreement can be identified which provide a basis for formulating what we shall call the strong attention hypotheses. First, there is agreement that learning is largely a matter of attention to relevant stimuli (or cues). 
Second, attention tends to be regarded as an explanatory process in its own right; there is little or no need for recourse to more fundamental mental operations. And third, a very clear legacy from learning theory, attention is thought to be malleable. This strong attention hypothesis leads to a straightforward prediction regarding conservation acquisition. It is that non-conservers should learn to conserve if taught to attend to the cues relevant for problem solution. This idea, suggested in a variety of sources (Gibson, 1969; Zimiles, 1966), has been articulated most clearly by Gelman (1969): “If Ss learn to attend, and respond to, quantity and not other relational cues, then they should conserve when transferred to standard conservation tasks” (p. 170).

*It should be noted that evidence from a cross-sectional study by P. Miller (1973) is not consistent with this developmental progression.


Contemporary interest in the role of attention in conservation acquisition can be traced to Gelman’s study. Non-conserving 5-year-olds were taught in that study to solve oddity problems which required response to length cues on half the trials and to number cues on the other half. The critical length and number cues were sometimes redundant with and sometimes independent of such cues as spacing, arrangement, alignment, etc.: the cues thought to mislead non-conservers on conservation tests. Thus, attention to length and number cues, per se, was necessary to solve the training task. Conservation tests were administered after training. Three findings are of present interest. First, the oddity problems were learned very quickly. Second, virtually perfect transfer occurred to tests of length and number conservation as well as substantial non-specific transfer to tests of conservation of mass and liquid quantity. Third, follow-up conservation tests administered 2 to 3 weeks after training revealed no decline in performance over that span.

These results are impressive but not clearly supportive of the strong attention hypothesis. As Graeme Halford (1970) pointed out, since subjects were trained to attend to both number and length cues (not to one or the other), the study did not demonstrate that training on a specific set of cues resulted in conservation of the related concept. Halford suggested that training afforded the opportunity to learn certain of the “constraints” between length and number which he believes must be understood in order for conservation to occur. He further suggested that a better test of the attention hypothesis for acquisition of number conservation “... would seem to be to train Ss exclusively to respond to the number of objects ...” (p. 315). The type of experiment suggested by Halford had actually been carried out just prior to the appearance of his paper.
Christie and Smothergill (1970) administered the 96 trials of Gelman’s oddity problems requiring response to length cues to a group of non-conserving 4-year-olds. Surprisingly, none of the children made even a single correct response on tests of length conservation administered after training. While this finding would appear to present the strong attention hypothesis a serious problem, its significance is in doubt because the subjects in this study did not attain the same high level of oddity-problem performance as Gelman’s subjects. That is, a reasonable corollary of the strong attention hypothesis is that attention to relevant cues must be well learned for conservation to result (Lipsitt & Eimas, 1972). Operationally, this means training subjects to a performance criterion rather than simply giving a fixed number of training trials.

It is important, in light of the interpretation often given to Gelman’s results, to point out that Gelman herself did not view them as support for a strong attention hypothesis. Noting her subjects’ rapid learning during training and nearly errorless conservation performance, Gelman concluded as follows: “The five-year-old child apparently does have to respond consistently to quantity and not be distracted by irrelevant cues, but does not have to learn, de novo, to define quantity and invariance” (p. 185). And, “It would appear to be that these responses (quantity) are present in a child’s repertoire, but are dominated by strategies under the control of irrelevant stimuli” (p. 185). Rather than a strong attention hypothesis, these conclusions are more in line with a competence-performance model (Flavell and Wohlwill, 1969) in which conservation “competence” is presumed to exist prior to training, but is not manifest in behavior because “performance” factors interfere (distraction by irrelevant cues). Once the conditions interfering with performance are removed (i.e., attention is trained appropriately) conservation should appear. Miller (1975), noting that 5-year-olds are not far removed from the age at which conservation normally appears, has also raised the issue of preexisting abilities in Gelman’s subjects.

The foregoing suggests that neither the Gelman nor the Christie and Smothergill studies provide a rigorous test of the strong attention hypothesis. We attempted to provide such a test by incorporating within the design of the present study several features intended to lessen the plausibility of alternative interpretations of the results. First, three separate groups were trained. One was trained to attend to length cues, another to number cues, and a third to both length and number cues. The first two groups provide tests of whether attention to specific cues is sufficient for conservation; the third group provides a test of Halford’s proposition that subjects must be trained on both length and number problems for conservation of number (or length) to be acquired.
Second, it was the goal that all three groups be trained to a high performance criterion so that if conservation did not result it could not be argued that the reason was insufficient training of attention. Third, 4-year-olds were used as subjects in an effort to decrease the likelihood that conservation competence was present prior to training. It is not at all clear how to assess competence a priori, or even whether it is possible to do so. But it would seem that, as a general rule, the younger the age at which a cognitive skill appears as a result of training, the less plausible it becomes to argue that training removed performance constraints on pre-existing competence.

Method

Subjects

Forty children, half of each sex, between 3.5 and 4.5 years old were selected as subjects from an original sample of 84. Children were excluded from the original sample on the basis of (a) being unable to count to 5 (N = 17); (b) making a correct judgment on one or more conservation pre-test trials (N = 24); and (c) absence during training (N = 3). The first criterion was established to make the study comparable to the Gelman study and the second to ensure that only true non-conservers were studied.

Pre-test

An 8-trial pre-test consisting of 2 trials each of conservation of mass, liquid, number, and length was administered to each child. Tasks and procedures followed as closely as possible those used by Gelman. One trial of each conservation test was given on one day and the second trial the following day. A child was classified as a non-conserver only upon making no correct judgments at all. The 40 children included in the final sample were assigned at random to one of four groups balanced evenly by sex.

Training

The training problems used were those devised by Gelman. As the group names denote, the Length group was trained on only length problems, the Number group on only number problems, and the Length and Number group on both types of problems. Gelman’s training procedure consists of 16 length problems and 16 number problems of six trials each. On the first trial of any problem the relevant cues (e.g., length) are redundant with the potentially misleading cues (e.g., alignment). On trials 2 through 5 the relevant and misleading cues are opposed to each other. On trial 6 the misleading cues are, in effect, eliminated. An example of the six trials defining one length problem is presented in Figure 1.

Procedure

The procedure for each child in the Length group and the Number group was the following. Six problems selected at random from the pool of 16 problems were presented consecutively.
Within each problem the six trials followed in order and the experimenter’s question alternated haphazardly between “Show me the one which is a different length (number)” and “Show me the two which are the same length (number).” Feedback was given after each trial and stimulus rearrangements between trials were carried out in full view of the child. The criterion of learning was set at one or no errors on trials 2-5 over the last two problems. These trials were chosen on the basis of being the ones in which concept-relevant cues are pitted against misleading cues. If criterion was not achieved on the first set of problems, six more were chosen at random from the remaining ten problems and administered at the same session. If criterion was not achieved on this second set, the child was brought back the following day, at which time a third, and if necessary a fourth, set of problems was administered. No further training was given if criterion was not reached on the fourth set.

The same basic procedure was followed in the Length and Number condition, the only difference being that there were six problems of each type and problems of each type were presented alternately. If the criterion of learning was achieved on one type of problem before the other, the other was then presented by itself. As in the other training conditions, four attempts to achieve criterion were allowed, so a maximum of 48 problems were presented in this condition.

[Figure 1. Stick arrangements defining each of 6 trials of one length problem.]

Table 1. Number of subjects reaching criterion in each group (N = 10 per group)

Group                Test
                     Length    Number
Length               10        -
Number               -         1
Length and Number    10        8
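The learning criterion and the cap on training sets described in the Procedure can be made concrete with a small sketch. This is only an illustration under stated assumptions: the function and constant names are hypothetical, each problem is assumed to be recorded as six booleans in presentation order, and nothing here reproduces the authors' actual scoring sheets.

```python
# Hypothetical sketch of the training criterion; names are illustrative
# and are not taken from the study.
PROBLEMS_PER_SET = 6   # six problems per attempt
MAX_SETS = 4           # four attempts to reach criterion


def met_criterion(problem_set, max_errors=1):
    """Return True if the last two problems of a set show at most
    `max_errors` errors on trials 2-5 (the trials on which relevant
    and misleading cues are opposed).

    `problem_set` is a list of problems; each problem is a list of six
    booleans (True = correct response), in order of presentation.
    """
    last_two = problem_set[-2:]
    if len(last_two) < 2:
        return False
    # Trials 2-5 correspond to indices 1..4 of each six-trial problem.
    errors = sum(1 for problem in last_two for ok in problem[1:5] if not ok)
    return errors <= max_errors


def sets_to_criterion(problem_sets):
    """Return the 1-based index of the first set meeting criterion,
    or None if criterion is not met within MAX_SETS sets."""
    for index, problem_set in enumerate(problem_sets[:MAX_SETS], start=1):
        if met_criterion(problem_set):
            return index
    return None


# Upper bounds on problems presented: four sets of six for a single-cue
# group, and four sets of six of each type for the combined group.
max_single = MAX_SETS * PROBLEMS_PER_SET
max_combined = MAX_SETS * PROBLEMS_PER_SET * 2
```

Under these assumptions `max_combined` evaluates to 48, which agrees with the maximum stated for the Length and Number condition; errors on trials 1 and 6 do not count against the criterion.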

Post-test

Conservation post-tests were conducted one day after the end of training. Four trials of each of four conservations (length, number, mass, and liquid) were administered with the order of tests random for each subject. Tests were constructed and administered exactly as described by Gelman. Explanations were required for all answers.

Results

Training

The number of children reaching criterion in each training condition is presented in Table 1. All subjects solved the length problems and most did so quickly. Inspection of individual data records revealed that every subject in the Length group reached criterion on the first set of problems and that 93 percent of the responses made by this group were correct. It is also of interest to note in Table 1 that while only one subject in the Number group reached criterion, eight of the ten subjects trained on both length and number reached criterion on number.

Conservation post-tests

The transfer data were examined in two ways. First, the number of correct responses was tallied and analyzed by a 4 (Group) × 4 (Type of Conservation Test) × 2 (Sex of Subject) ANOVA. The main findings were these: (a) Sex of Subject was not significant and did not enter into any significant interactions; (b) Group was a significant factor, F = 21.87, df = 3, 32, p < 0.01; (c) Type of Conservation Test was not significant; and (d) the interaction of Group × Type of Conservation Test was not significant. The number of correct responses in each cell of the non-significant Group × Type of Conservation Test interaction is presented in Table 2. Also in Table 2, in parentheses, is the conditional probability that an adequate explanation accompanied a correct response. Application of the Newman-Keuls method to determine the source of the significant Group effect revealed that each of the three training groups performed better than the Control (p < 0.01) and that the Length and Number group performed better than both the Length group (p < 0.01) and the Number group (p < 0.05).

Table 2. Number of Conserving Judgments (max. = 40 per cell) and Conditional Probability of Adequate Explanation (in parentheses)

Group                Conservation Test
                     Length       Number       Mass         Continuous Quantity
Control              1 (0)        2 (0)        0            6 (0)
Length               11 (0.18)    12 (0)       12 (0)       12 (0)
Number               18 (0.05)    18 (0.28)    8 (0.13)     12 (0)
Length and Number    21 (0.52)    21 (0.10)    19 (0.32)    19 (0.47)

In summary, two major findings emerged from the ANOVA. First, attention training produced a positive effect on conservation performance, and training on both number and length cues was more effective than training with either cue by itself. Second, training did not have attention-specific effects on conservation. Training on length cues, for example, resulted in the same level of performance on tests of liquid, mass, and number conservation as it did on tests of length conservation.

The foregoing analyses have to do with frequency of conserving judgments and do not indicate how many subjects learned to conserve. An answer to the latter question requires that a performance criterion be adopted. The strictest criterion that could be set would be perfect judgments plus adequate explanations on all trials of a particular conservation. By that criterion it would be concluded that no subjects acquired any of the conservations. Indeed, as might be surmised from Table 2, if adequate explanations for even some judgments were included as part of the criterion for being a conserver, virtually all subjects would be excluded. In Table 3 is presented the number of subjects classified as conservers on the basis of two criteria involving correct judgments only.

Table 3. Number of subjects classified conservers by two criteria (max. = 10 per cell)

Group                       Conservation Test
                            Length            Number            Mass              Liquid
No. of correct judgments    3 of 4   4 of 4   3 of 4   4 of 4   3 of 4   4 of 4   3 of 4   4 of 4
Control                     0        0        0        0        0        0        0        0
Length                      2        1        2        2        3        1        2        1
Number                      3        0        4        1        0        0        2        1
Length and Number           4        2        3        3        5        2        3        2
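The two judgment-only criteria used for Table 3 can be sketched as a small classification rule. This is a hedged illustration, not the authors' scoring procedure: the function name is hypothetical, and the reading of "3 of 4" as "at least three of the four post-test judgments correct" is an assumption (it is consistent with every "4 of 4" count in Table 3 being no larger than the matching "3 of 4" count).

```python
# Hypothetical sketch; assumes "3 of 4" means at least 3 correct judgments.
TRIALS_PER_CONSERVATION = 4


def classify_conserver(judgments):
    """Classify one subject on one conservation test.

    `judgments` is a list of four booleans (True = conserving judgment).
    Returns a dict saying whether the weaker (3 of 4) and the stronger
    (4 of 4) criterion are met. Explanations are not considered.
    """
    if len(judgments) != TRIALS_PER_CONSERVATION:
        raise ValueError("expected four post-test judgments")
    correct = sum(bool(j) for j in judgments)
    return {
        "3 of 4": correct >= 3,
        "4 of 4": correct == TRIALS_PER_CONSERVATION,
    }


def count_conservers(group_judgments):
    """Tally how many subjects in a group meet each criterion."""
    tally = {"3 of 4": 0, "4 of 4": 0}
    for judgments in group_judgments:
        for criterion, met in classify_conserver(judgments).items():
            tally[criterion] += int(met)
    return tally
```

Applied to a group's judgments on one test, `count_conservers` would yield one pair of cells of the kind shown in Table 3.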

Relation between training performance and conservation

If conservation acquisition is a matter of learning to attend to relevant cues, then performance in the two domains should be positively related. The findings offer no support for this hypothesis. Large group differences evident during training were not paralleled in conservation. Although ten subjects in the Length group as compared to only one subject in the Number group met the training criterion, the conservation performance of the two groups was quite similar. Also, as we have already noted, training did not transfer specifically from a set of cues to the related conservation. Finally, correlations were carried out between two measures of performance in training (number of trials and number of correct responses) and number of conserving responses. No significant relations were found for any of the training groups separately or for all groups combined.

Discussion

These results, while unexpected in some ways, answer at least several of the questions at which the study was aimed. For one thing, the strong attention hypothesis would seem to have received a fair test but very little support from the data. All subjects in the Length group reached the criterion for attention to cues specifying length, yet only 2 of 10 achieved the weak criterion for length conservation and just one met the stronger criterion. On the other hand, only one of 10 subjects in the Number group reached the training criterion but 4 of them conserved number by the weaker criterion and one by the stronger criterion. More generally, there was no correlation between training performance and conservation for any of the training groups. Thus, attention was trained in some cases and not in others without concomitant effects on conservation.

It might be argued that the number of problems presented in training was too few to expect transfer to conservation; that ‘overlearning’ is required. Gelman’s subjects received 16 problems of each type (length and number) while subjects in the analogous condition of this study could have received as few as 6 problems of each type. Two objections can be raised against this point. One is that in our previous study (Christie and Smothergill, 1970) the same number of length problems was presented as in Gelman’s study, yet conservation of length did not result. A second consideration is that examination of individual data records revealed that 4 of the 10 subjects in the Length and Number condition actually received more problems than Gelman’s subjects, yet these subjects accounted for proportionately no more conservation responses than the 6 subjects who received fewer trials than Gelman’s subjects.

The conclusion that attention and conservation were found to be unrelated in this study is supported by another aspect of the results. Frequency of correct responses is shown in Table 2 to be no greater for conservation tests directly related to training than for conservations completely unrelated to training. The Length group, for example, responded correctly to 11 of 40 questions testing length conservation and to 12 of 40 questions testing conservation of continuous quantity. Overall, the absence of a significant Group × Type of Conservation Test interaction means that training did not transfer specifically to the related conservation. Since the strong attention hypothesis holds that conservation is acquired by learning to attend to the relevant cues, this finding seems directly contrary to the hypothesis.
It should also be pointed out that Gelman too found that performance improved on conservations unrelated to training. On tests of conservation of continuous quantity given the day after training Gelman found about 50% correct responses; and, on the same tests given 2-3 weeks later, that figure increased to better than 70%. Thus, the present findings of non-specific training effects are consistent with Gelman’s findings, and neither is in keeping with expectations from the strong attention hypothesis. The present findings, in particular, indicate that specific training results in increased knowledge about a variety of quantitative concepts.

A recent study by Siegler (1976) concerned with the difference between encoding and knowledge is relevant to the present findings. In a series of experiments on children’s understanding of the principles governing balance scales, it was revealed that 5-year-olds did not adequately encode (attend to) the relevant dimensions (number of weights and distance from the fulcrum).


Upon training by means of modeling and explanation, the children’s encoding improved but their understanding of balance principles did not. Understanding of the principles improved if subjects were first given encoding training and then allowed to observe the outcome of placing various weights at different distances. Outcome observation without encoding training was ineffective for 5-year-olds. Siegler’s finding that training subjects to encode the relevant dimensions did not result in improved knowledge is paralleled by the present finding that attention training did not result in conservation. An important aspect of both studies is that encoding (attention) was measured independently of knowledge.

In discussing his results in light of other research, Siegler observes that it has sometimes been suggested in the literature that improved attention results in improved knowledge (the strong attention hypothesis) while his own findings do not support such a view. Siegler suggests this synthesis: “It seems likely, then, that improved encoding can either produce direct changes in knowledge or can produce the conditions necessary for such change. Which type of phenomenon occurs in a specific situation may provide an index of how far children are from acquiring the relevant concept. If changes in encoding produce new knowledge directly, then children might be presumed to be reasonably close to inducing the knowledge on their own; if intervening experience is also necessary then the acquisition may be more removed from the child’s level” (pp. 515-516). This suggestion provides a framework for integration of Gelman’s findings and the present ones. Gelman’s subjects, a year older than those in this study, acquired new knowledge (i.e., conserved) as a result of attention training while the younger subjects of this study, farther away from acquiring conservation on their own, learned to attend to the relevant dimensions but did not show much improvement in conservation.
Although this interpretation is post-hoc, the conceptual framework on which it is based has some interesting and testable implications. For example, 4-year-olds would be expected to show marked improvement in conservation if given appropriate experience with conservation problems, but only if such experience were given subsequent to attention training. The same experience given without prior attention training should be ineffective. It is, of course, difficult to say just what might constitute ‘appropriate experience’, but indications from a number of studies suggest that the observation of reversibility might be a good candidate (Brainerd & Allen, 1971). It should be emphasized that Siegler’s formulation is an attempt at synthesis and has not, to our knowledge, been verified empirically. There are some portions of Siegler’s own results that do not seem consistent with it.


Eight-year-olds, also included in the study, showed improved knowledge as a result of experience with the balance problem but not as a result of encoding training, as Siegler’s formulation would seem to predict. It remains to be demonstrated in a single study that dimensional training results in increased knowledge for those who are more ‘ready’ but not for those who are less ready to acquire the knowledge naturally.

Finally, the present findings offer some support for Halford’s thesis that true understanding of number requires that the constraints between number and length be learned. Eight of 10 subjects trained on both number and length reached the training criterion on number while only one of 10 subjects trained on number alone did so. Also, the conservation performance of the Length and Number group, while modest, was significantly better than that of the separate training groups. While these findings could be the result of the more extensive training given to the Length and Number group, the absence of correlations between training and conservation suggests that Halford’s basic idea may be correct.

References

Brainerd, C. J., and Allen, T. W. (1971) Experimental inductions of the conservation of “first order” quantitative invariants. Psychol. Bull., 75, 128-144.
Christie, J. F., and Smothergill, D. W. (1970) Discrimination and conservation of length. Psychon. Sci., 21, 336-337.
Flavell, J. H., and Wohlwill, J. F. (1969) Formal and functional aspects of cognitive development. In D. Elkind & J. H. Flavell (Eds.), Studies in cognitive development: Essays in honor of Jean Piaget. New York, Oxford University.
Gelman, R. (1969) Conservation acquisition: A problem of learning to attend to relevant attributes. J. exp. Child Psychol., 7, 167-187.
Gibson, E. J. (1969) Principles of perceptual learning and development. New York, Appleton-Century.
Halford, G. S. (1970) A theory of the acquisition of conservation. Psychol. Rev., 77, 302-316.
Lipsitt, L. P., and Eimas, P. (1972) Developmental psychology. In P. H. Mussen & M. R. Rosenzweig (Eds.), Annual review of psychology. Palo Alto, Calif., Annual Reviews.
Miller, P. H. (1973) Attention to stimulus dimensions in the conservation of liquid quantity. Child Devel., 44, 129-136.
Miller, P. H. (1975) The development of attention and conservation. Ann Arbor, Michigan: Report #?73, Developmental Psychology Program, University of Michigan, September 1975.
Neisser, U., and Becklen, R. (1975) Selective looking: Attending to visually specified events. Cog. Psychol., 7, 480-495.
Odom, R. D. (1977) The décalage from the perspective of a perceptual salience account of developmental change. In D. W. Smothergill (Chair), Attention and Cognition. Symposium presented at the meeting of the Society for Research in Child Development, New Orleans, 1977.
Piaget, J. (1952) The child’s conception of number. New York, Humanities.
Pick, A. D., Hess, V. L., and Frankel, D. G. (1976) Children’s attention: The development of selectivity. In E. Mavis Hetherington (Ed.), Review of Child Development Research. Vol. 5. Chicago, University of Chicago Press.
Siegler, R. S. (1976) Three aspects of cognitive development. Cog. Psychol., 8, 481-520.
Trabasso, T., and Bower, G. (1968) Attention in learning: Theory and research. New York, Wiley.
Zeaman, D., and House, B. J. (1963) The role of attention in retardate discrimination learning. In N. R. Ellis (Ed.), Handbook of mental deficiency. New York, McGraw-Hill.
Zimiles, H. (1966) The development of conservation and differentiation of number. Mono. Soc. Res. Child Devel., 31, Whole No. 6.

Résumé

Forty 4-year-old children served as subjects in an experiment designed to study whether learning to pick out the relevant cues is sufficient for acquiring conservation of number and of length. Three groups of non-conserving children were trained with exercises in which the children’s attention was drawn to cues specifying either length, or number, or both number and length. These children, together with a control group, were then tested on tasks of conservation of length, number, weight, and continuous quantity. Some progress was observed, but it was neither of a conclusive order of magnitude nor specifically tied to the cues used during training. Thus, the group trained with length cues conserved length at a rate of 25%, the same rate as for number, weight, or liquid. This non-specificity does not support the hypothesis of the role of attention and suggests, on the other hand, that training, when it is effective, leads to general abstract quantitative knowledge.

Cognition, 5 (1977) 265-283

5

© Elsevier Sequoia S.A., Lausanne - Printed in the Netherlands

Language and reasoning: A study of temporal factors*

J. St. B. T. EVANS**
S. E. NEWSTEAD
Plymouth Polytechnic

Abstract

The paper is concerned with the testing of psycholinguistic hypotheses by the use of deductive reasoning tasks. After reviewing some of the problems of interpretation which have arisen with particular reference to conditional rules, an experiment is presented which measures comprehension and verification latencies in addition to response frequencies in a truth table evaluation task. The experiment tests a psycholinguistic hypothesis concerning the different usage of the logically equivalent forms of sentence: If p then q and p only if q with respect to the temporal order of the events p and q. It is proposed that the former sentence is more natural when the event p precedes the event q in time, and the latter more natural when the opposite temporal relation holds. Although significant support is found for the hypothesis in the analysis of the latency data, it is only distinguished from an alternative explanation by detailed analysis of response frequencies, thus indicating the general usefulness of the paradigm adopted.

*The authors would like to acknowledge the assistance of Paul Pollard in the collection of data.

**Requests for reprints should be sent to Dr J. St. B. T. Evans, School of Behavioural & Social Science, Plymouth Polytechnic, Plymouth PL4 8AA, Devon, England.

In recent years there has been an upsurge of interest in the study of deductive reasoning as a means of testing psycholinguistic hypotheses. There is, however, a logical difficulty pointed out by Smedslund (1970) which he describes as a circular relation between understanding and logic. That is, one can either investigate a subject’s understanding of the sentences forming the premises of an argument by assuming he reasons logically, or else one can ascertain whether the subject reasons logically by assuming the nature of his understanding. The former approach, advocated by Henle (1962), has been very influential, as may be judged, for example, by inspection of a number of the contributions to a recent collection of papers on deductive reasoning (Falmagne, 1975). That such an approach is oversimplified has been argued by Evans (1972a), who points out that while interpretational factors clearly influence reasoning performance, one must also take account of nonlogical operational factors (response biases, etc.). Evans recognises the circularity problem and argues that while interpretational and operational influences cannot be distinguished within a single paradigm, they can be separated by a multiple paradigm approach.

An area of deductive reasoning which has received considerable attention has been the study of reasoning with conditional sentences of the general form If p then q, which have been traditionally assumed by logicians to reflect a relationship known as material implication. This relation is defined formally in logic by a truth table (see Table 1(a)). Traditional logic permits only two truth values, true (T) or false (F). Effectively, a proposition is considered true unless proven false. In the case of material implication only one combination of values is supposed to falsify the rule, a true antecedent in combination with a false consequent (the TF case).

Table 1. Some alternative truth tables for the rule If p then q

Truth Table Case    Truth Value of Rule
                    (a) Material      (b) Defective     (c) Material      (d) Defective
                        implication       implication       equivalence       equivalence
TT (pq)             T                 T                 T                 T
TF (pq̄)             F                 F                 F                 F
FT (p̄q)             T                 I                 F                 F
FF (p̄q̄)             T                 I                 T                 I

T = true    F = false    I = irrelevant

One of the available paradigms for studying conditional reasoning consists of examining people’s tendencies to make or withhold four basic inferences. These are the following:

Modus Ponens: If p then q; p therefore q.
Denial of the antecedent: If p then q; not p therefore not q.
Affirmation of the consequent: If p then q; q therefore p.
Modus Tollens: If p then q; not q therefore not p.

Given a truth table for material implication, only two of these inferences, Modus Ponens and Modus Tollens, are valid while the other two are fallacious. This is because the truth table forbids only the combination p and q̄ (not q), so given p we know that q must occur (Modus Ponens) and given q̄ we know that p may not occur, therefore we conclude p̄ (Modus Tollens). On the other hand, we learn nothing from p̄, which may be paired with either q or q̄, or from q, which may be paired with either p or p̄.

A study of conditional reasoning by Taplin (1971) adopted implicitly the Henle hypothesis and attempted to characterise the subjects’ interpretations of conditional sentences in the form of a truth table. The truth table was not measured directly but inferred by the experimenter from the subjects’ tendencies to make the four basic inferences described above. Taplin found, in line with most earlier studies, that the majority of subjects tended to make all four inferences including those which are fallacious under material implication. He concluded that most subjects tend to interpret the rule as a biconditional, which has a truth table defined in logic as material equivalence (Table 1(c)). However, a large number of subjects failed to reason consistently in relation to any truth table. Taplin and Staudenmayer (1973) replicated the Taplin (1971) study using abstract materials (in the original study they were thematic). In a second experiment, however, a slight change in procedure produced a dramatic change in the reasoning patterns, which now conformed more generally to a truth table of material implication (a result which emphasises the dangers of extrapolating from single paradigms).

Further complications for the interpretation of conditional reasoning are occasioned by the introduction of negative components. Roberge (1971, 1974) and Evans (1972b, 1977a) have shown that the frequency of all inferences except Modus Ponens varies significantly as a function of introducing negative components.
The most striking factor in these studies is an apparent response bias against making arguments whose conclusions involve the denial of a negative component. That this is an operational rather than interpretational factor is strongly suggested by the fact that the effect is not limited to one kind of inference: it occurs on Modus Tollens, Denial of the Antecedent and also, as Evans (1972c) has shown, on Reductio ad Absurdum arguments. Nor is it limited to conditionals: in a recent study of reasoning with exclusive disjunction arguments by Roberge (1976) the same response bias was observed. It can be seen from the above considerations that studies of inference patterns do not yield the kind of clear cut information about the interpretation of propositions that supporters of the Henle hypothesis would claim, since results depend upon the exact paradigm used and are subject to nonlogical response biases.

An alternative general paradigm lies in the attempt to take direct measures of ‘psychological’ truth tables. Wason (1966) proposed that subjects’ truth tables for a conditional rule If p then q may have an additional third truth value to that of traditional logic, namely ‘irrelevant’. Specifically, he proposed that subjects have a ‘defective’ implication truth table (Table 1(b)) in which subjects regard any case in which the antecedent is false as irrelevant rather than true. However, in an attempt to measure a psychological truth table by asking subjects to evaluate contingencies, Wason (1968) found more commonly the truth table shown in Table 1(d) which we shall term defective equivalence. This differs from material equivalence in that the FF case is irrelevant. However, the truth table task in Wason (1968) was embedded in a complex design following a related logical task. A study specifically aimed at investigating psychological truth tables was that of Johnson-Laird and Tagart (1969) who found support for Wason’s original hypothesis, i.e. defective implication, for the rule If p then q and to a lesser extent the logically equivalent Never p without q. On the other hand two other linguistic expressions, regarded as material implication in formal logic, showed no consistent pattern. Hence this study shows, importantly, that alternative linguistic formulations of the same logical relation may produce different kinds of reasoning behaviour.

Further evidence for the defective implication truth table for If p then q was reported by Evans (1972d) using a construction rather than evaluation task. However, he also found that, as in the inference task paradigm, the introduction of negative components produced a strong response bias which was termed ‘matching bias’. Subjects are more inclined to classify a case as ‘irrelevant’ if the values negate or fail to match those named in the rule, irrespective of the presence of negatives. Matching bias is also observable when an evaluation truth table task is used (Evans, 1976), and has been used successfully to predict performance on Wason’s selection task (Evans & Lynch, 1973).
(Revlis, 1975, has found evidence of a response bias in syllogistic reasoning which appears to parallel matching bias.)

In comparing results from the two alternative paradigms we should first of all ask what inference patterns would result from the subject ‘possessing’ the defective truth tables given in Table 1. We have already noted that Modus Ponens and Modus Tollens only are valid for material implication because the TF combination is the only forbidden case. The same is true for the defective implication truth table and hence the same two inferences are valid. Under material equivalence the FT combination is also forbidden, so that the Affirmation of the Consequent and Denial of the Antecedent inferences also become valid. (This is because p̄q may not occur, hence p̄ implies q̄ and q implies p.) Since the defective equivalence truth table also forbids this case, it too must require all four inferences to be valid. Hence, while inference patterns may indicate underlying equivalence or implication they cannot differentiate whether or not these are defective.
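The argument that inference patterns cannot separate a truth table from its defective counterpart can be checked mechanically. The following Python sketch is an illustration only, not part of the original study. It encodes the four truth tables as they are described in the text, treats a case as forbidden when it is classified as false, and counts an inference as valid when every permitted case that satisfies the premise also satisfies the conclusion. The names TABLES, INFERENCES, is_valid and valid_inferences are hypothetical.

```python
# Truth tables as described in the text: T = true, F = false, ? = irrelevant.
# Each key gives the truth values of the antecedent (p) and consequent (q).
TABLES = {
    "material implication":  {"TT": "T", "TF": "F", "FT": "T", "FF": "T"},
    "defective implication": {"TT": "T", "TF": "F", "FT": "?", "FF": "?"},
    "material equivalence":  {"TT": "T", "TF": "F", "FT": "F", "FF": "T"},
    "defective equivalence": {"TT": "T", "TF": "F", "FT": "F", "FF": "?"},
}

# Each inference is (premise index, premise value, conclusion index,
# conclusion value); index 0 refers to p and index 1 to q.
INFERENCES = {
    "Modus Ponens":                  (0, "T", 1, "T"),
    "Denial of the Antecedent":      (0, "F", 1, "F"),
    "Affirmation of the Consequent": (1, "T", 0, "T"),
    "Modus Tollens":                 (1, "F", 0, "F"),
}


def is_valid(table, inference):
    """True if every permitted case meeting the premise meets the conclusion."""
    prem_idx, prem_val, concl_idx, concl_val = inference
    # A case is permitted unless the table classifies it as false.
    permitted = [case for case, value in table.items() if value != "F"]
    relevant = [case for case in permitted if case[prem_idx] == prem_val]
    return all(case[concl_idx] == concl_val for case in relevant)


def valid_inferences(table_name):
    """Names of the inferences that are valid under the named truth table."""
    table = TABLES[table_name]
    return [name for name, inf in INFERENCES.items() if is_valid(table, inf)]


if __name__ == "__main__":
    for name in TABLES:
        print(name, "->", valid_inferences(name))
```

Because validity depends only on which cases are forbidden, each defective table yields the same set of valid inferences as its non-defective counterpart in this sketch.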


Ignoring for the moment the problem of response biases, we note that experiments within both paradigms yield an ambiguous picture of the interpretation of conditional rules in terms of the implication/equivalence distinction. (On the whole, inference pattern studies more often suggest equivalence and direct truth table measures more often suggest implication.) One interpretation of these unclear results is that the syntactic form If p then q is actually ambiguous when presented in abstract form in a reasoning experiment. Logicians use the biconditional If and only if p then q to express equivalence. Natural language users, on the other hand, use If p then q to express either relation, the appropriate interpretation being indicated by semantic constraints. For example ‘If it’s a dog then it is an animal’ would be taken as implication, whereas ‘If he is over 18 then he is entitled to vote’ would be taken as equivalence. In reasoning experiments these usual semantic constraints are normally missing, so the subject guesses at the intended meaning. This would explain why small changes of procedure, with subtly different demand characteristics, can produce quite different results (as in Taplin and Staudenmayer, 1973).

Whichever paradigm is adopted response biases cannot be eliminated, although they may be controlled and taken into account. Such an attempt was made in a study by Evans (1977a) in which an interpretational hypothesis was tested in relation to two alternative linguistic formulations of material implication, both of which may be regarded as conditional. The two forms are ‘If...then...’ (IT) and ‘...only if...’ (OI). To take affirmative examples, the rule If p then q has an identical truth table in formal logic to p only if q although they do not appear intuitively to mean the same thing. A simple hypothesis is that subjects convert p only if q into If q then p.
This conversion hypothesis is not too convincing in practice: both on a truth table evaluation task (Evans, 1976) and on the Wason ‘selection’ task (Johnson-Laird, Legrenzi & Legrenzi, 1972) very similar patterns of reasoning have been observed. Also, on a priori grounds, why should the ‘...only if...’ form exist if its meaning can be exactly expressed by an ‘If...then...’ sentence? Evans (1977a), using an inference pattern paradigm, tested the hypothesis that the IT form emphasises the sufficiency of the antecedent for the consequent and hence the Modus Ponens inference, while the OI form emphasises the necessity of the consequent for the antecedent and hence the Modus Tollens inference. While overall significant support for these hypotheses was found, several problems arose in the interpretation of the results. For one thing, the effect of varying the presence and absence of negative components interacted with the rule form. There were significant response biases of the kind mentioned earlier on the IT but not the OI rules. Also, there was some evidence suggestive of some conversion of the OI rules. Although response latencies were measured these were not particularly helpful, since total reasoning time was used with no attempt to separate the component used simply for comprehension.

Some surprisingly interesting information was, however, obtained by asking subjects to make up thematic examples of IT and OI sentences. A clear linguistic difference was observed by Evans (1977a) with respect to temporal relationships. Whenever the antecedent and consequent referred to separate events the temporal order for these was different for the two rule forms. If the antecedent event preceded the consequent event in time the IT form was always used, e.g., ‘If it rains on Tuesday then I shall go swimming’. Whenever the consequent event preceded the antecedent event in time the OI form was used, e.g., ‘The game will take place only if the weather improves’. Logically equivalent but linguistically anomalous are the sentences ‘It will rain on Tuesday, only if I go swimming’ and ‘If the game takes place then the weather will improve’. Note, however, that both the IT and OI forms in their ‘natural’ usage express a relationship of material implication. The IT sentence is falsified only if it rains on Tuesday and I do not go swimming. Similarly, the OI sentence is falsified only if the game takes place and the weather does not improve.

From this analysis the following psycholinguistic hypothesis was deduced: while IT and OI are both used to express material implication, the IT form is natural when the event referred to in the antecedent temporally precedes the event referred to in the consequent, and the OI form is natural when the consequent event temporally precedes the antecedent event. The experiment to be reported in this paper tests this hypothesis on a deductive reasoning task. It was a truth table evaluation task in which direction of temporal relations was varied on abstract IT and OI rules. Although clear differences in response frequency are to be expected on this kind of task, it is by no means clear that these data will permit a suitable test of the hypothesis.
For this reason response latencies are measured and separated, by a method similar to that used by Trabasso, Rollins and Shaugnessy (1971), into comprehension and verification times. It is intended to analyse these two latency measures and response frequency separately, and then to consider certain relationships between these different dependent measures. The following a priori hypotheses were formulated:

1. The order of temporal relationship between antecedent and consequent will interact significantly with the rule form in the comprehension time analysis.
2. Negative components should produce larger latencies in both the comprehension and verification analyses.
3. The analysis of response frequencies should, in accordance with previous literature (Evans, 1972d, 1976), show a significant ‘matching bias’ effect.
4. The analysis of response frequencies may indicate a tendency to convert the OI rule into a converse IT rule.

Method

The experiment involved a multifactorial design in which all factors were tested on a within subject basis. Four factors manipulated linguistic aspects of the rules with which subjects were required to reason. The factors were:

(1) Rule form: IT or OI
(2) Temporal order: antecedent event precedes consequent event (T1) or consequent event precedes antecedent event (T2)
(3) Antecedent: affirmative or negative
(4) Consequent: affirmative or negative

For the Verification Time and Response Frequency analyses, however, a fifth factor is involved, namely the Truth Table Case presented for verification (TT, TF, FT or FF). The subjects’ task was to evaluate the truth value of a conditional rule in the light of particular evidence present. The materials are best illustrated by examples of the rules used. Thus consider an IT rule with negated consequent in time order T1:

IF THE FIRST LETTER IS G THEN THE SECOND LETTER IS NOT R

The following example is an OI rule with negated antecedent in time order T2:

THE SECOND LETTER IS NOT Z ONLY IF THE FIRST LETTER IS T

The problems were presented on a three field tachistoscope linked to two external timers. A measure of comprehension time (CT) was taken by asking the subject to press a key which displayed the sentence only and press it again when he was ready to perform the reasoning task, the interval between presses being taken as CT. The instructions emphasised that the subject should endeavour to make quite sure he understood the sentence at this stage in order to save time on verification. Following the second key press, the subject was shown two capital letters one after the other (employing the remaining two fields of the tachistoscope). Each letter was presented for one second - well above recognition threshold. During presentation of the letters the sentence disappeared, but subsequently reappeared to reduce problems of memory load. For each type of sentence there was a (separate) presentation of a pair of letters corresponding to each truth table case. Thus for the first example of a rule given above, a G followed by an R would be TF; a G followed by a letter other than R would be TT and so on. For time order T2 (as in the second example) it is the letter presented second that determines the truth value of the antecedent and that presented first which determines the truth value of the consequent.

The subject was instructed to decide whether the pair of letters presented either conformed to the rule, contradicted the rule or was irrelevant to it. On making his decision he pressed a second key and announced his decision as ‘true’, ‘false’ or ‘irrelevant’. The interval between the second pair of key presses was taken to be the verification time (VT).

Each subject was given eight practice trials with a task essentially similar to that used in the experiment proper, except that the rules were of a disjunctive form. The main sequence itself consisted of 64 problems: 16 sentence types, times four truth table cases. The materials were constructed by random allocation of letters and numbers to each of the 64 problems, with the restriction that no combination occurred more than once. Presentation order was controlled by constructing four independent random orders of the 64 problems. For each order two subjects were run using the sequence originally generated, and two were run with the sequence in exact reverse order.

Sixteen staff and students of Plymouth Polytechnic served as subjects on a paid volunteer basis and were tested individually. A random mixture of both sexes was employed.

Table 2. The mean comprehension times in seconds for the 16 cells (n = 16 in each cell)

Rule form:                   IT                OI
Temporal order:           T1      T2       T1      T2
Negative combinations*
AA                       6.27    7.86     7.81    7.67
AN                       7.53    8.26     9.00    9.14
NA                       7.02    8.44    10.88    9.85
NN                       8.63   10.87    13.53   11.95

*AN = Affirmative antecedent and negative consequent.
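The mapping from a displayed letter pair to a truth table case, as described in the Method, can be sketched in a few lines of Python. This is an illustrative reading of the procedure rather than the authors’ materials: the dictionary layout, the function name truth_table_case and the sample letters other than those quoted in the two example rules are assumptions.

```python
from itertools import product


def truth_table_case(rule, first_letter, second_letter):
    """Return 'TT', 'TF', 'FT' or 'FF' for a pair of displayed letters.

    rule is a dict (hypothetical layout) with the named letter and negation
    flag for antecedent and consequent, plus the temporal order: under 'T1'
    the antecedent refers to the first letter, under 'T2' to the second.
    """
    if rule["order"] == "T1":
        ant_shown, cons_shown = first_letter, second_letter
    else:
        ant_shown, cons_shown = second_letter, first_letter
    # A negated component is true exactly when the named letter is absent.
    ant_true = (ant_shown == rule["ant_letter"]) != rule["ant_negated"]
    cons_true = (cons_shown == rule["cons_letter"]) != rule["cons_negated"]
    return ("T" if ant_true else "F") + ("T" if cons_true else "F")


# IF THE FIRST LETTER IS G THEN THE SECOND LETTER IS NOT R (IT, order T1)
RULE_1 = {"form": "IT", "order": "T1",
          "ant_letter": "G", "ant_negated": False,
          "cons_letter": "R", "cons_negated": True}

# THE SECOND LETTER IS NOT Z ONLY IF THE FIRST LETTER IS T (OI, order T2)
RULE_2 = {"form": "OI", "order": "T2",
          "ant_letter": "Z", "ant_negated": True,
          "cons_letter": "T", "cons_negated": False}

# The design: 2 rule forms x 2 temporal orders x 2 x 2 negations = 16
# sentence types, each paired with the four truth table cases.
SENTENCE_TYPES = list(product(("IT", "OI"), ("T1", "T2"),
                              (False, True), (False, True)))
PROBLEMS = list(product(SENTENCE_TYPES, ("TT", "TF", "FT", "FF")))

if __name__ == "__main__":
    print(truth_table_case(RULE_1, "G", "R"))  # G then R
    print(truth_table_case(RULE_2, "T", "Z"))  # T then Z
    print(len(SENTENCE_TYPES), len(PROBLEMS))
```

For the first example rule the sketch classifies a G followed by an R as TF, in line with the text; for the second rule, a T followed by a Z makes the antecedent false and the consequent true.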

Results

Response Latencies

The comprehension times were submitted to a logarithmic transformation to correct positive skew and submitted to a four way within subject analysis of variance. The appropriate cell means appear in Table 2. For brevity, the presence or absence of negative components is indicated as follows: AA = affirmative antecedent and affirmative consequent; AN = affirmative antecedent and negative consequent; NA = negative antecedent and affirmative consequent; NN = negative antecedent and negative consequent. The main hypothesis of an interaction between time order and rule form was significant (F = 11.56, p
*All F ratios quoted on CT and VT are tested on 1 and 16 degrees of freedom.


Table 3. The mean verification times in seconds for 64 cells (n = 16 in each cell)

Rule form: IT

Temporal order:

Tl

Negative combination

Truth Table Case

AA

TT TF FT FF

3.51 4.22 1.55 5.61

AN

TT TF FT FF

3.20 3.60 5.11 6.19

NA

TT TF FT FF

3.21 3.65 5.12 1.69

NN

TT TF FT FF

5.31 5.03 8.96 6.91

OI T2

4.06 4.24 6.63 4.22

T2

Tl

3.11 4.09 6.81 5.21

4.40 4.03 6.48 4.52

4.31 4.34 6.56 9.32

4.85 4.81 6.66 5.86

3.52 4.12 5.26 9.11

4.38 5.51 5.55 8.00

4.49 6.52 4.85 8.32

6.18 5.50 1.67 1.46

6.13 8.63 8.20 10.31

5.61 1.39 1.89 8.28

3.11 5.16 4.40 10.92

Response frequencies

For each truth table case under each type of rule it is possible to count the frequency with which the 16 subjects generated each of the alternative responses: true, false and irrelevant. This frequency count gives rise to the ‘psychological’ truth tables shown in Table 4. On casual inspection it can be seen that TT is the most commonly classified verifying case on all rules. There is a clear but weaker tendency for TF to be given as falsification. The classification of FT and FF cases appears much more variable. Evans (1972d) found that the first two (logically correct) tendencies competed with a strong ‘matching bias’ tendency. The effect of matching bias is illustrated for the present experiment in Table 5. As Evans (1972d) has pointed out, matching affects the probability of regarding the case as relevant, rather than the direction (T or F) of classification. Hence the hypothesis is that the number of irrelevant classifications will rise as the number of mismatches increases.

Table 4. Psychological truth tables

Rule   Temporal  Negative           TT            TF            FT            FF
form   order     combinations    T   F   ?     T   F   ?     T   F   ?     T   F   ?
IT     T1        AA             16   0   0     0  16   0     0  10   6     0   2  14
                 AN             16   0   0     0  16   0     3   1  12     7   3   6
                 NA             16   0   0     0  10   6     0  12   4     3   4   9
                 NN             16   0   0     1  15   0     2   7   7     6   5   5
IT     T2        AA             16   0   0     1  15   0     0  10   6     1   1  14
                 AN             16   0   0     2  14   0     1   3  12     5   4   7
                 NA             14   1   1     2   9   5     1  12   3     7   5   4
                 NN             14   2   0     3  12   1     4   6   6     7   5   4
OI     T1        AA             16   0   0     0  15   1     1  11   4     3   1  12
                 AN             15   0   1     0  16   0     0   6  10     2   4  10
                 NA             15   1   0     2   8   6     1  14   1     9   4   3
                 NN             13   0   3     1  13   2     3  10   3     6   5   5
OI     T2        AA             14   0   2     0  14   2     0  12   4     2   2  12
                 AN             11   3   2     0  15   1     1   8   7     6   3   7
                 NA             14   1   1     2   7   7     0  15   1     7   5   4
                 NN             13   0   3     3   8   5     4  10   2     6   6   4

Table 5. Mean number of ‘irrelevant’ classifications as a function of matching contingency and logical case

Matching                 Logical case
values          TT       TF       FT       FF     Overall
pq             0.50     0.50     2.25     4.50     1.94
pq̄             0.75     0.75     4.50     5.00     2.75
p̄q             0.50     2.00     5.00     7.50     3.75
p̄q̄             1.50     6.00    11.00    13.00     7.88

An analysis of variance of the frequencies of irrelevant classifications (computed over truth tables) revealed a highly significant main effect of matching in both the antecedent (F 1s48 = 68.27, p

Each of these predictions can be tested eight times, that is for each of the four negative combinations, on each of the two temporal orders. By inspection of Table 4 it can be seen that six comparisons are in the direction predicted by hypothesis (1) and one against (p = 0.062, one-tailed binomial test). All eight of the predictions of hypothesis (2) are in the correct direction (p = 0.004). Taken jointly, 14 of the 16 predictions are in the expected direction and only one against (p
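The binomial probabilities quoted for the directional comparisons can be reproduced with a short calculation. The sketch below assumes, as an interpretation not spelled out in the text, that tied comparisons are dropped and that each remaining comparison is equally likely to fall in either direction under the null hypothesis. The function names binomial_tail and sign_test are hypothetical.

```python
from math import comb


def binomial_tail(successes, n, p=0.5):
    """P(X >= successes) for X distributed as Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(successes, n + 1))


def sign_test(in_favour, against, two_tailed=False):
    """Binomial (sign) test on directional comparisons, ties excluded."""
    n = in_favour + against
    if two_tailed:
        # Double the tail for the more extreme direction, capped at 1.
        return min(1.0, 2 * binomial_tail(max(in_favour, against), n))
    return binomial_tail(in_favour, n)


if __name__ == "__main__":
    # Six comparisons in the predicted direction, one against.
    print(round(sign_test(6, 1), 3))
    # All eight comparisons in the predicted direction.
    print(round(sign_test(8, 0), 3))
```

Under these assumptions six of seven untied comparisons give 8/128 = 0.0625 and eight of eight give 1/256, which round to the 0.062 and 0.004 reported.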
Discussion

The analysis of comprehension times strongly supports the predicted interaction between Rule Form and Temporal Order. That is, the IT rule is processed more quickly when the antecedent refers to an earlier event than the consequent. Further support for the hypothesis may be derived from the finding of a significant interaction of the same type in the verification analyses.

In formulating the main hypothesis it was assumed that both IT and OI rules express material implication but are used ‘naturally’ to express one or the other time order of events. An alternative explanation which is entirely consistent with the latency analyses is that subjects in this experiment are simply converting all the OI rules into converse IT rules. This would explain the observed crossover interaction since what has been classified as T2 for OI rules would, in fact, be T1 for the converted rule. Hence, in effect, data on both rules would be consistent with the hypothesis that T1 (antecedent prior to consequent) is easier to process. Further, consistent with the conversion hypothesis is the fact that OI rules were processed significantly slower than IT rules on the comprehension but not the verification time analysis. This is consistent with the idea that an additional component of ‘conversion time’ is added for OI rules prior to verification.

Although these alternative hypotheses cannot be distinguished by the latency analyses we have additional data available in the form of response frequencies (Table 4). The analyses have revealed a significant tendency to convert the OI rule. The effect appears, however, to be relatively weak, since the observed totals differ only by one or two in most cases. Evaluation of the strength of this conversion tendency is clearly essential to the interpretation of the comprehension time results.
For this reason an analysis was carried out to see whether the OI truth tables match the IT truth tables better on the assumption that the former are or are not generally the result of converted rules. Table 6(a) shows the degree of match between IT and OI rules assuming them to be interpreted in the same way, i.e. a straightforward comparison of IT and OI rules by their original classification in Table 4. The matching score is the sum of the absolute differences between the frequencies in each of the three categories. For example, consider the matching score for AA rules with temporal order T1 on the case FT. From Table 4 we see that the IT rule had the classification T = 0, F = 10, ? = 6, while the OI rule had the classification T = 1, F = 11, ? = 4. The absolute differences between each category are 1, 1 and 2 respectively, leading to a total score of 4. Clearly the lower the score, the greater the degree of match.

Table 6. Matching scores between IT and OI rules on the assumption (a) that the rules are interpreted in the same way and (b) that OI rules are converted

(a) SAME assumption (IT to OI)

IT rule   OI rule   TT to TT   TF to TF   FT to FT   FF to FF
AA, T1    AA, T1        0                     4          6
AA, T2    AA, T2        4                     4          4
AN, T1    AN, T1        2                    10         10
AN, T2    AN, T2       10                    10          2
NA, T1    NA, T1        2                     2         12
NA, T2    NA, T2        0                     2          0
NN, T1    NN, T1        6                     8          0
NN, T2    NN, T2        6                     8          2

(b) CONVERSION assumption (IT to OI)

IT rule   OI rule   TT to TT   TF to FT   FT to TF   FF to FF
AA, T1    AA, T2        4         12         12          4
AA, T2    AA, T1        0          8          6          4
AN, T1    NA, T2       10          8         20          2
AN, T2    NA, T1        2          2          6          6
NA, T1    AN, T2        4         14          8         10
NA, T2    AN, T1        2          4          2          4
NN, T1    NN, T2        6         10         12          2
NN, T2    NN, T1        6          4          4          2

Let us now consider what happens if the OI rule is converted. For example, suppose the subject interprets p only if q as If q then p. As stated earlier, what has been classified as TF would be FT and vice versa; TT and FF would, on the other hand, be unchanged. In addition, any converted rule that was classified as time order T1 would in reality be T2, since the consequent has become the antecedent and vice versa. Similarly T2 sentences would, in reality, be T1s. Rules with a single negative are also affected. For example, an AN rule by conversion becomes NA, e.g. p only if not q becomes If not q then p. Similarly, NA rules become AN rules. AA and NN are not altered by conversion. Table 6(b) shows matching scores between IT rules and those cases of OI rules which would be equivalent if the latter were converted.

Thus, for each case of IT rule we have matches to an OI rule on the assumption of no conversion (Table 6(a)) and on the assumption of conversion (Table 6(b)). Of these 32 cases, 19 show a better match on the ‘same’ assumption, and 8 a better match on the ‘conversion’ assumption with 5 scoring equal. The predominance of ‘same’ matches falls just short of significance (p
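The matching score and the relabelling implied by conversion are simple enough to state as code. The following Python sketch is an illustration under the definitions given in the text; the names matching_score and converted_counterpart and the three lookup tables are hypothetical.

```python
def matching_score(it_counts, oi_counts):
    """Sum of absolute differences between (T, F, ?) frequency triples.

    A lower score means a closer match between the two classifications.
    """
    return sum(abs(a - b) for a, b in zip(it_counts, oi_counts))


# Under conversion (p only if q read as If q then p) antecedent and
# consequent exchange roles, so the labels below swap as described.
CASE_SWAP = {"TT": "TT", "TF": "FT", "FT": "TF", "FF": "FF"}
NEGATIVE_SWAP = {"AA": "AA", "AN": "NA", "NA": "AN", "NN": "NN"}
ORDER_SWAP = {"T1": "T2", "T2": "T1"}


def converted_counterpart(negatives, order, case):
    """OI cell to compare with an IT cell on the conversion assumption."""
    return NEGATIVE_SWAP[negatives], ORDER_SWAP[order], CASE_SWAP[case]


if __name__ == "__main__":
    # Worked example from the text: AA rules, order T1, case FT.
    print(matching_score((0, 10, 6), (1, 11, 4)))
    # An IT rule labelled AN, T1, TF is compared with OI rule NA, T2, FT.
    print(converted_counterpart("AN", "T1", "TF"))
```

The first call reproduces the worked example, where differences of 1, 1 and 2 give a score of 4.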

decide what response is ‘correct’ for the cases FT and FF, but all proposed truth tables (Table 1) agree that TT should be classified as true and TF as false. Comparing the frequency of these correct classifications for IT rules (Table 4) we see that six are more frequent in the T1 than the corresponding T2 form, with two equal (p = 0.032, two-tailed binomial test). For OI rules seven comparisons are more ‘correct’ again for the T1 than the T2 orders (p = 0.016, two-tailed test). Thus, there is no corresponding crossover effect on the response frequencies, but rather a significant overall benefit of the T1 order. This result is more damaging to the conversion hypothesis than the original interpretational hypothesis of the effect of temporality on IT and OI rules. In the original hypothesis, which assumes that rules are not converted, there is no reason why a main effect of temporal order cannot be observed in addition to an interaction. The conversion hypothesis, however, must predict that the effect of time order on IT rules will be reversed on OI rules.

Support for the general inadequacy of the conversion hypothesis as an explanation of the difference between reasoning on IT and OI rules is also found in a recent experiment by Rips and Marcus (1977). On a related logical task (using affirmative rules only) Rips and Marcus find that reasoning with IT sentences differs from that on OI sentences, but that the difference is not consistent with an illicit conversion of the OI rule.

Passing on to other aspects of the data, it was noted earlier that logical and matching tendencies operated in a similar manner to that observed previously (Evans, 1972d; 1976). That is, there is an overall logical tendency to classify TT as true and TF as false. The significant overall matching bias tendency is apparently weakest on these two cases (see Table 4), suggesting some form of competition between the two tendencies.
A formal stochastic model of reasoning has been postulated by Evans (1977b) and applied to data on Wason’s (1966, 1968) ‘selection task’. Although psychologically distinct as a paradigm from the truth table task used in the present paper, the selection task data are also subject to similar competing logical and matching tendencies. The model proposed by Evans (1977b) generates a probability of responding for each individual subject, which results from a weighted addition of logical and matching tendencies (which are presumed to correspond to the main interpretational and operational factors in this situation). Without going into the details of the model, a simple parameter-free prediction can be derived and tested on the data of the present experiment.

The important point about the Evans model is that it assumes that behaviour is probabilistic at the level of individual subjects, and thus opposes notions that subjects are in different states of ‘insight’ or adopting different strategies. Thus if the group frequencies are split fairly evenly over available response categories the model would suppose that each individual subject has similar probabilities for each choice. Thus an observed variability of responding may be interpreted as conflict within individual subjects. This leads to the hypothesis that the greater the observed variability of responding in the frequency data, the higher should be the associated mean verification latency.

In order to test the above hypothesis the following measure of variability of the 64 truth tables presented in Table 4 was taken: the absolute difference between the highest and lowest category frequency. With 16 subjects split between three cells this measure can range from 16 (minimum variability) to 1 (maximum variability). These 64 scores were correlated with the 64 mean verification latencies shown in Table 3. Highly significant support was found for the hypothesis (Kendall’s tau B = -0.604, z = 6.84, p
1).
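The variability measure used for this correlation can be written down directly. The Python sketch below is illustrative only; the function name variability_score is hypothetical and the example triples are chosen to show the extremes of the scale.

```python
def variability_score(counts):
    """Difference between the highest and lowest (T, F, ?) frequency.

    With 16 subjects, 16 means every subject gave the same response
    (minimum variability) and 1 means the responses were split as evenly
    as possible (maximum variability).
    """
    return max(counts) - min(counts)


if __name__ == "__main__":
    print(variability_score((16, 0, 0)))  # unanimous classification
    print(variability_score((6, 5, 5)))   # near-even split
    print(variability_score((0, 10, 6)))  # intermediate case
```

Because a high score means low variability, the hypothesis predicts a negative correlation between these scores and mean verification time, which is the sign of the reported tau.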

Finally, it is worthwhile to consider the general usefulness of the paradigm employed in the present experiment - especially in view of the difficulties of interpretation in this area, discussed at the outset. The paradigm is novel in that latency measures are taken on a task known to produce large variations in the frequency of responding. It is argued that the latencies do produce very useful additional data which can be interpreted sensibly. Of particular importance is the distinction between comprehension and verification latencies. The CT measure is particularly useful for distinguishing interpretational from operational factors: it may be regarded as a ‘pure’ measure of interpretation in that it is measured prior to the commencement of any reasoning operations. VT is harder to interpret owing to the concurrent variations in response frequency, although we have observed that a systematic relation between VT and response frequencies can be of theoretical importance. The parallel findings of significant Antecedent and Consequent main effects and Temporal Order times Rule Form interaction in both CT and VT analysis suggest that the latter may partly reflect similar interpretational factors. On the other hand the significant effect of Truth Table Case on VT clearly arises in the operational stage. The usefulness of taking both CT and response frequency measures is clearly supported by the present study. The psycholinguistic effects of temporality were demonstrated on CT and not on the response frequencies. On the other hand, detailed analysis of the response frequencies has aided considerably the problem of distinguishing two alternative interpretations of this main finding. It is concluded that the present paradigm has much to offer towards the better investigation of this difficult but interesting field of research.


References

Evans, J. St. B. T. (1972a) On the problems of interpreting reasoning data: logical and psychological approaches. Cog., 1, 373-384.
Evans, J. St. B. T. (1972b) Reasoning with negatives. Brit. J. Psychol., 63, 312-319.
Evans, J. St. B. T. (1972c) Deductive reasoning and linguistic usage. Ph.D. thesis, University of London, unpublished.
Evans, J. St. B. T. (1972d) Interpretation and matching bias in a reasoning task. Quart. J. exp. Psychol., 24, 193-199.
Evans, J. St. B. T. (1976) On interpreting reasoning data - a reply to Van Duyne. Cog., 3, 387-390.
Evans, J. St. B. T. (1977a) Linguistic factors in reasoning. Quart. J. exp. Psychol., 29, 297-306.
Evans, J. St. B. T. (1977b) Toward a statistical theory of reasoning. Quart. J. exp. Psychol., (in press).
Evans, J. St. B. T. and Lynch, J. S. (1973) Matching bias in the selection task. Brit. J. Psychol., 64, 391-397.
Falmagne, R. J. (ed.) (1975) Reasoning: representation and process. New York: Wiley.
Henle, M. (1962) On the relation between logic and thinking. Psychol. Rev., 67, 366-378.
Johnson-Laird, P. N., Legrenzi, P. and Legrenzi, M. S. (1972) Reasoning and a sense of reality. Brit. J. Psychol., 63, 395-400.
Johnson-Laird, P. N. and Tagart, J. M. (1972) When negation is easier than affirmation. Quart. J. exp. Psychol., 24, 87-91.
Revlis, R. (1975) Syllogistic reasoning: logical decisions from a complex data base. In: R. J. Falmagne (ed.) op. cit.
Rips, L. J. and Marcus, S. L. (1977) Suppositions and the analysis of conditional sentences. In: P. A. Carpenter and M. A. Just (eds.) Cognitive processes in comprehension, (in press).
Roberge, J. J. (1971) Some effects of negation on adults’ conditional reasoning abilities. Psychol. Reports, 29, 839-844.
Roberge, J. J. (1974) Effects of negation on adults’ comprehension of fallacious conditionals and disjunctive arguments. J. gen. Psychol., 91, 287-293.
Roberge, J. J. (1976) Reasoning with exclusive disjunctive arguments. Quart. J. exp. Psychol., 28, 419-427.
Smedslund, J. (1970) On the circular relation between logic and understanding. Scand. J. Psychol., 11, 217-219.
Taplin, J. E. (1971) Reasoning with conditional sentences. J. Verb. Learn. Verb. Behav., 10, 219-225.
Taplin, J. E. and Staudenmayer, H. (1973) Interpretation of abstract conditional sentences in deductive reasoning. J. Verb. Learn. Verb. Behav., 12, 530-542.
Trabasso, T., Rollins, H. and Shaugnessy, E. (1971) Storage and verification stages in processing concepts. Cog. Psychol., 2, 239-289.
Wason, P. C. (1966) Reasoning. In B. M. Foss (ed.) New Horizons in Psychology I. Harmondsworth: Penguin.
Wason, P. C. (1968) Reasoning about a rule. Quart. J. exp. Psychol., 20, 273-281.

In the present article, psycholinguistic hypotheses are tested using deductive reasoning tasks. After a review of certain problems of interpretation, which have arisen mainly for conditional rules, the experiment is presented. In addition to response frequency, it measures comprehension and verification latencies in a truth table evaluation task.

The experiment aims to test a psycholinguistic hypothesis concerning the different use of logically equivalent sentence forms such as if p then q and p only if q, as a function of the temporal order of the events p and q. The authors suggest that the first form is more natural when event p precedes event q and that the second is more natural for the reverse temporal order. While the analysis of latencies tends to give significant support to the hypothesis, only the detailed analysis of response frequencies allows it to be distinguished from an alternative explanation. This confirms the general usefulness of the paradigm adopted.